Carlosrnes/mlops: Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines
forked from ivanovitchm/mlops

Federal University of Rio Grande do Norte

Technology Center

Department of Computer Engineering and Automation

DCA0305 - Machine Learning Based Systems Design

References

  • 📚 Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. [Link]
  • 📚 Chris Fregly, Antje Barth. Data Science on AWS. [Link]
  • 📚 Hannes Hapke, Catherine Nelson. Building Machine Learning Pipelines. [Link]
  • 📚 François Chollet. Deep Learning with Python. [Link]
  • 📚 Mariano Anaya. Clean Code in Python. [Link]
  • 🤜 Dataquest Academic Program [Link]
  • 😃 CS329S - ML Systems Design [Link]

Lessons

Week 01 - Warming up

  • Git and Version Control
  • Elements of the Command Line
    • Introduction to the Command Line [Link]
    • The Filesystem [Link]
    • Modifying the Filesystem [Link]
    • Glob Patterns and Wildcards [Link]
    • Users and Permissions [Link]
  • Text Processing in the Command Line
    • Getting Help and Reading Documentation [Link]
    • File Inspection [Link]
    • Text Processing [Link]
    • Redirection and Pipelines [Link]
    • Standard Streams and File Descriptors [Link]

Week 02 - Introduction to Data Science [Slide]

  • Pandas and NumPy Fundamentals
    • Introduction to NumPy [Link]
    • Boolean Indexing with NumPy [Link]
    • Introduction to Pandas [Link]
    • Exploring Data with Pandas: Fundamentals [Link]
    • Exploring Data with Pandas: Intermediate [Link]
    • Data Cleaning Basics [Link]
    • 🤠 Guided Project - Exploring eBay Car Sales Data [Link]

Week 03 - Clean Code Principles for Data Science and Machine Learning [Slides]

Week 04 - Production-Ready Code [Slide]

Week 05 - Building a Reproducible Model Workflow [Slide]

Week 06 - Building a Reproducible Model Workflow Cont. - Introduction to Machine Learning [Slide]

  • Outline [Video]
  • What is Machine Learning? [Video]
  • Machine Learning Types [Video]
  • Variables, Pipeline, Controlling Chaos [Video]
  • Data Segregation - train, dev and test sets [Video]
  • Bias vs Variance [Video]
  • Optional hands-on practice at Dataquest.io
    • Machine Learning Fundamentals [Link]

Week 07 - Building a Reproducible Model Workflow Cont. - ETL, Data Checks, Data Segregation [Slide]

Week 08 - Building a Reproducible Model Workflow Cont. - Train, Validation and Experiment Tracking [Slides]

Week 09 - Building a Reproducible Model Workflow Cont. - Final Pipeline, Release and Deploy [Slides]
