Learning Objectives

  • How to visualize missing data and identify the three different types of missing data

  • How the type of missing data affects whether we should avoid, ignore, or account for it

  • Advantages and disadvantages of each approach

  • How to visualize and implement approaches

  • Practical tips for working with missing data

  • Recommendations for integrating these approaches into your workflow!

Course Outline

Module 1: An introduction to missing data 


Module 2: Strategies for doing data science with missing data 

- Avoid missing data 

- Ignore missing data

- Account for missing data        

- Unit missingness        

- Item missingness 


Module 3: Practical considerations and warnings

Instructor's Bio: Matt Brems

Matt is currently Managing Partner and Principal Data Scientist at BetaVector. His full-time professional data work spans finance, education, consumer-packaged goods, and politics, and he earned General Assembly's 2019 "Distinguished Faculty Member of the Year" award. Matt earned his Master's degree in statistics from Ohio State. Matt is passionate about responsibly putting the power of machine learning into the hands of as many people as possible and mentoring folx in data and tech careers. Matt also volunteers with Statistics Without Borders and currently serves on their Executive Committee as the Marketing & Communications Director.

Who will be interested in this course?

  • This course is for current and aspiring Data Scientists, Data Analysts, Machine Learning Engineers, and AI Product Managers

  • Knowledge of the following tools and concepts is useful:

  • Familiarity with Python and Jupyter notebooks

  • Some knowledge of the pandas library is useful, but not required