- 📚 Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. [Link]
- 📚 Chris Fregly, Antje Barth. Data Science on AWS. [Link]
- 📚 Hannes Hapke, Catherine Nelson. Building Machine Learning Pipelines. [Link]
- 📚 François Chollet. Deep Learning with Python. [Link]
- 📚 Mariano Anaya. Clean Code in Python. [Link]
- 🤜 Dataquest Academic Program [Link]
- 😃 CS329S - ML Systems Design [Link]
Week 01 - Warming up
- Git and Version Control
- Elements of the Command Line
- Text Processing in the Command Line
Week 02 - Introduction to Data Science [Slide]
- Pandas and Numpy Fundamentals
Week 03 - Clean Code Principles for Data Science and Machine Learning [Slides]
- Outline [Video]
- Coding Best Practices [Video]
- Writing Clean Code [Video]
- Refactoring Code [Video]
- Efficient Code [Video]
- Documentation [Video]
- Python Code Quality Authority (PCQA) - pycodestyle [Video]
- PCQA - pylint [Video]
- PCQA - autopep8 [Video]
- PCQA - nbQA [Video]
- ▶️ Hands on
- 💾 Datasets [Link]
Writing Clean Code
Exercise 01
Exercise 02
Exercise 03
Using pycodestyle
- 🐍 Using pylint - [script] [refactored script]
- 📝 Best practices for writing functions [Link]
Week 04 - Production Ready Code [Slide]
- Outline [Video]
- Catching Errors [Video]
- Testing and Data Science [Video]
- A brief introduction to pytest [Video]
- Logging [Video]
- Case study: testing and logging [Video]
- Model Drift [Video]
- Hands on
Production ready code
- Data Visualization Fundamentals [Link]
- Storytelling Data Visualization and Information Design [Link]
Week 05 - Building a Reproducible Model Workflow [Slide]
- Outline [Video]
- Business Reflections [Video]
- Introduction to MLOps [Video]
- A brief history of MLOps and Tools [Video]
- Tools and environment installation [Video]
README
- Tools and environment installation cont. [Video] environment.yml
- Machine Learning Pipelines [Video]
- Machine Learning Pipelines - Command Line Interface [Video] my_script.py
- Versioning Data and Artifacts [Video]
Upload and version artifacts
- Guided Exercise - CLI + Weights and Biases [Video] Guided Exercise 01
- MLflow Projects [Video]
- Introduction to YAML [Video]
YAML Intro.
- Guided Exercise - Build an MLflow component [Video] Guided Exercise 02
- Linking together the components MLflow + Hydra [Video]
- Guided Exercise - Multiple MLflow components + Hydra [Video] Guided Exercise 03
- Additional Material
Week 06 - Building a Reproducible Model Workflow Cont. - Introduction to Machine Learning [Slide]
- Outline [Video]
- What is Machine Learning? [Video]
- Machine Learning Types [Video]
- Variables, Pipeline, Controlling Chaos [Video]
- Data Segregation - train, dev and test sets [Video]
- Bias vs Variance [Video]
- Optional hands on Dataquest.io
- Machine Learning Fundamentals [Link]
Week 07 - Building a Reproducible Model Workflow Cont. - ETL, Data Checks, Data Segregation [Slide]
- Outline [Video]
- Extract, Transform, Load (ETL)
- Exploratory Data Analysis (EDA) Video [Part I] [Part II] [Part III] [Part IV] [Part V] [Source-Code]
- Preprocessing [Video] [Source-Code]
- Data Segregation [Video] [Source-Code]
- Data Checks
- Data Validation [Video] [Source-Code]
- Deterministic Test [Video] [Source-Code]
- Non-Deterministic Test [Video] [Source-Code]
- Multiple Hypothesis Testing [Video] [Source-Code]
- Multiple Hypothesis Testing Using MLflow [Video] [Source-Code]
- Multiple Hypothesis Testing Using Parameters in pytest [Video] [Source-Code]
Week 08 - Building a Reproducible Model Workflow Cont. - Train, Validation and Experiment Tracking [Slides]
- Outline [Video]
- A brief review [Video]
- Decision Trees
- Evaluation Metrics
- Implementing Pipelines
- MLOps Level 0 with Pipeline incorporating train [Part I] [Part II] [Part III] [Source-Code]
- MLOps Level 0 with Pipeline incorporating train and preprocessing [Part I] [Part II] [Part III] [Source-Code]
- MLOps Level 1 with Pipeline incorporating train and preprocessing [Part I] [Part II] [Source-Code]
- MLOps Level 1 with Pipeline and Hyper-parameter Tuning [Part I] [Part II] [Part III] [Part IV] [Source-Code]
- Test evaluation [Part I] [Source-Code]
Week 09 - Building a Reproducible Model Workflow Cont. - Final Pipeline, Release and Deploy [Slides]
- Outline [Video]
- Final Pipeline
- Big picture of the final pipeline [Video]
- All together [Part I] [Part II] [Source-Code]
- Release for reproducibility
- Create a GitHub repository for the final pipeline [Video] [Source-Code]
- Semantic versioning and remote execution [Video] [Source-Code]
- Deployment
- Deploy with MLflow [Video] [Source-Code]