Senior Machine Learning Engineer @ Sapia.ai based in Melbourne, VIC. I specialize in building and deploying scalable, high-performance AI systems, with a focus on Large Language Models (LLMs), MLOps, and cloud-native infrastructure (AWS & GCP).
M.Sc. in Information Technology from the University of New South Wales (UNSW).
Passionate about leveraging cutting-edge AI for real-world impact, optimizing model inference, and architecting robust data pipelines.
ML & AI:
- LLMs & Generative AI
- PyTorch, TensorFlow, Scikit-learn
- Model Optimization: TensorRT, NVIDIA Triton Inference Server, DistilBERT
- Deep Learning, Predictive Text Analysis
MLOps & Infrastructure:
- CI/CD & Infrastructure-as-Code: Terraform, GitHub Actions, Buildkite
- Containerization & Orchestration: Docker, Kubernetes
- Cloud Platforms: AWS (SageMaker, Glue, S3, EC2, etc.), GCP (BigQuery, GKE, etc.)
- Monitoring & Logging
Data Engineering & Big Data:
- Data Pipelines: Airflow, Kafka, NiFi, Spark, AWS Glue
- Databases: PostgreSQL, MongoDB, Redshift, Elasticsearch, BigQuery, Iceberg Data Lake
Languages & Tools:
- Python, Scala, R, TypeScript, C
- FastAPI, Flask, Django, React Remix
- Git, Jupyter, dbt
- Real-Time AI Content Detection System: Built a system using TensorRT-LLM & NVIDIA Triton, achieving inference latency under 200 ms. Presented at NVIDIA GTC 2024 (P61688).
- Phai Chatbot Career Coach POC: Developed an LLM-powered chatbot with React Remix front-end and FastAPI back-end for personalized career guidance.
- Generative AI for Interviews (InterviewGPT & SAIGE): Implemented and enhanced generative models on AWS SageMaker, improving evaluation accuracy and fairness in AI-driven interviews.
- Scalable Data Infrastructure: Scaled a feature database (billions of records) using AWS Glue, Iceberg, and S3, streamlining model training.
- Infrastructure-as-Code for ML: Spearheaded Terraform adoption for ML model deployment, enhancing reliability and scalability.
- (Past) Content Moderation & Spam Detection @ Viettel: Deployed deep learning models for content moderation and automated SMS spam detection.
- NVIDIA GTC 2024: "Real-Time Model-Agnostic Generated Answer Detection in Text-Based Chat Interview [P61688]"
- Connect on LinkedIn
- Email me at leo.ngoc.pham@gmail.com