Designs and develops techniques to compress large language models (LLMs), builds LLM-based applications, and optimizes models for performance.