Designs and develops techniques for compressing large language models (LLMs), builds LLM-based applications, and optimizes models for performance.