Data Science in the Industry: Continuous Delivery for Machine Learning with Open-Source Tools
This course is available only as a part of subscription plans.
Training duration: 3 hours (Hands-on)
How to maintain data science productivity, collaborate effectively, and deliver value continuously when bringing machine learning models into production
Continuous Integration / Continuous Delivery (CI/CD) practices for machine learning, and a way of working that avoids most of the pitfalls of data scientists working in isolation
Small, safe, incremental changes to code and models, so that code can be deployed to production frequently and feedback collected as we develop
How to coordinate data, model, and application code together, rather than application code alone
Instructors:
Christoph Windheuser, PhD
David Johnston, PhD
Eric Nagler
Introduction (60 minutes): What is Continuous Delivery for Machine Learning (CD4ML)?
Doing the plumbing (20 minutes): Set up the Jenkins build pipeline and ensure your project is configured correctly
Data Science (20 minutes): Develop the model and test the code using Test-Driven Development (TDD)
Machine Learning Engineering: Improve the model in several steps and monitor the results of your improvement
Continuous Deployment (20 minutes): Set up a performance test of the model that allows automatic deployment only if the model passes a quality threshold
Our app in the wild (20 minutes): Monitor your application in production with Fluentd, Elasticsearch, and Kibana
Basic knowledge of how to develop a machine learning model
Basic skills in Python
Familiarity with Docker, the Elasticsearch stack, and Jenkins
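To give a feel for the Continuous Deployment step in the outline, here is a minimal sketch of a quality gate: a check that a build pipeline could run after model evaluation, allowing deployment only when the metrics clear a minimum. It is an illustration under stated assumptions, not the course's actual code. The function names, the metric name `r2_score`, and the threshold value 0.8 are all hypothetical.

```python
# A minimal sketch of a model quality gate (all names and values are hypothetical).

def passes_quality_gate(metrics: dict, thresholds: dict) -> bool:
    """Return True only if every required metric is present and meets its minimum."""
    for name, minimum in thresholds.items():
        if name not in metrics or metrics[name] < minimum:
            return False
    return True


def gate_exit_code(metrics: dict, thresholds: dict) -> int:
    """Map the gate result to a process exit code: 0 to proceed, 1 to stop."""
    return 0 if passes_quality_gate(metrics, thresholds) else 1


# Assumed example values: in practice the metrics would come from the
# evaluation step of the pipeline, and the threshold from project configuration.
example_thresholds = {"r2_score": 0.8}
good_model = {"r2_score": 0.85}
weak_model = {"r2_score": 0.75}

print(gate_exit_code(good_model, example_thresholds))
print(gate_exit_code(weak_model, example_thresholds))
```

A pipeline stage could pass the result of `gate_exit_code` to `sys.exit`, since most CI servers, Jenkins included, treat a non-zero exit status from a build step as a failure and stop before the deployment stage.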