Supervised Learning 5: Missing Data in Supervised ML
Training duration: 90 minutes
Describe the three main types of missingness patterns
Evaluate simple approaches for handling missing values
Apply XGBoost to a dataset with missing values
Apply multivariate imputation
Apply the reduced-features model (also called the pattern submodel approach)
Decide which approach is best for your dataset
Andras Zsom, PhD
Module 1: Missing data patterns
- MCAR - Missing Completely At Random
- MAR - Missing At Random
- MNAR - Missing Not At Random
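As an illustration of how the three patterns differ, the sketch below masks the same synthetic column in three ways. The column names (`age`, `income`), the sample size, and the missingness rates are assumptions chosen for this example, not material from the course.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
# Hypothetical toy data: two numeric features
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50, 15, n),
})

# MCAR: every income value has the same 20% chance of being missing
mcar = df.copy()
mcar.loc[rng.random(n) < 0.2, "income"] = np.nan

# MAR: the chance that income is missing depends only on the observed age
mar = df.copy()
p_missing = 0.4 / (1 + np.exp(-(df["age"] - 40) / 5))
mar.loc[rng.random(n) < p_missing, "income"] = np.nan

# MNAR: income is missing because of its own (unobserved) value
mnar = df.copy()
threshold = df["income"].quantile(0.8)
mnar.loc[df["income"] > threshold, "income"] = np.nan

for name, data in [("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(name, "fraction missing:", data["income"].isna().mean())
```

In the MNAR copy the largest incomes never appear in the observed data, which is why this pattern cannot be detected from the observed values alone.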
Module 2: Apply the reduced-features model (also called the pattern submodel approach)
- Reduced-features model (or pattern submodel approach)
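One possible reading of the reduced-features idea is sketched below: for each missingness pattern found in the data to be predicted, a separate submodel is trained on only the features that are present in that pattern. The helper name `reduced_features_predict`, the choice of `LinearRegression`, and the toy data are assumptions made for this illustration; the course may implement the approach differently.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def reduced_features_predict(X_train, y_train, X_test):
    """Fit one submodel per missingness pattern found in X_test (hypothetical helper)."""
    preds = pd.Series(np.nan, index=X_test.index, dtype=float)
    # Encode each row's pattern as a string, "1" = observed, "0" = missing
    pattern = X_test.notna().apply(
        lambda row: "".join("1" if seen else "0" for seen in row), axis=1
    )
    for p in pattern.unique():
        cols = [c for c, flag in zip(X_test.columns, p) if flag == "1"]
        test_rows = pattern == p
        if not cols:
            # No feature observed: fall back to the training mean
            preds.loc[test_rows] = y_train.mean()
            continue
        # Train on rows where all features of this pattern are observed
        train_rows = X_train[cols].notna().all(axis=1)
        model = LinearRegression().fit(
            X_train.loc[train_rows, cols], y_train.loc[train_rows]
        )
        preds.loc[test_rows] = model.predict(X_test.loc[test_rows, cols])
    return preds


# Toy data (assumed): the target is an exact linear function of x1 and x2
rng = np.random.default_rng(0)
X_train = pd.DataFrame(rng.normal(size=(200, 2)), columns=["x1", "x2"])
y_train = 2 * X_train["x1"] + 3 * X_train["x2"]
X_train.loc[rng.random(200) < 0.3, "x2"] = np.nan

X_test = pd.DataFrame({"x1": [1.0, 1.0], "x2": [1.0, np.nan]})
preds = reduced_features_predict(X_train, y_train, X_test)
print(preds)
```

No value is imputed at any point: the second test row is scored by a submodel that only ever saw `x1`.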
Module 3: How to determine the patterns?
- A Python implementation
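A minimal sketch of one way to find the patterns with pandas: encode each row's missing/observed mask as a string and count the distinct strings. The feature names and values are made up for this example, and this is not necessarily the implementation used in the course.

```python
import numpy as np
import pandas as pd

# Hypothetical feature matrix with missing values
X = pd.DataFrame({
    "f1": [1.0, 2.0, np.nan, 4.0, np.nan],
    "f2": [0.5, np.nan, 0.1, 0.7, 0.3],
    "f3": [3.0, 1.0, 2.0, 5.0, 4.0],
})

# "1" marks a missing value, "0" an observed one, in column order
pattern = X.isna().apply(
    lambda row: "".join("1" if missing else "0" for missing in row), axis=1
)

# Number of rows that share each pattern
pattern_counts = pattern.value_counts()
print(pattern_counts)
```

Here the pattern `"000"` (nothing missing) and `"100"` (only `f1` missing) each cover two rows, and `"010"` covers one.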
Module 4: Decide which approach is best for your dataset
- XGB models
- Imputation
- Reduced-features
Experience with Python and scikit-learn
Knowledge of building a machine learning pipeline (e.g., cross-validation, hyperparameter tuning)