I'll be at MLSys 2025 @ Santa Clara, May 12-15.
-
🔭 I’m currently working on multi-modal machine learning, model compression, efficiency, and dimensionality reduction techniques.
-
🌱 I’m learning to program heterogeneous parallel systems, focusing on CUDA and Triton for efficient machine learning implementations.
-
🤔 I’m open to helping with model distillation, quantization, and efficient training and deployment of large ML models.
-
- Developing and optimizing multi-modal model architectures with scalable data input processing to enhance performance in distributed environments.
- Streamlining model training, fine-tuning, and inference processes for seamless integration with cloud and edge computing interfaces.
- Implementing hardware- and IO-aware ML model compression techniques to optimize inference efficiency and reduce computational overhead.