Machine Learning Foundations: Probability and Statistics

Abstract

1. Probability & Information Theory

This class, Probability & Information Theory, introduces the mathematical fields that enable us to quantify uncertainty as well as to make predictions despite uncertainty. These fields are essential because machine learning algorithms are both trained on imperfect data and deployed into noisy, real-world scenarios they haven’t encountered before.

Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of variables, probability distributions, metrics for assessing distributions, and graphical models. You’ll also learn how to use information theory to measure how much meaningful signal there is within some given data. The content covered in this class is itself foundational for several other classes in the Machine Learning Foundations series, especially Intro to Statistics and Optimization.

Over the course of studying this topic, you'll:

Develop an understanding of what’s going on beneath the hood of predictive statistical models and machine learning algorithms, including those used for deep learning.
Understand the appropriate variable type and probability distribution for representing a given class of data, as well as the standard techniques for assessing the relationships between distributions.
Apply information theory to quantify the proportion of valuable signal that’s present amongst the noise of a given probability distribution.
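
As a small illustration of the last outcome, the sketch below computes the Shannon entropy of a discrete distribution using only the Python standard library. The function name `shannon_entropy` and the example coin distributions are assumptions made for illustration, not course material.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Entropy of a discrete distribution; base 2 gives bits, base e gives nats."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped, by the convention 0 * log(0) = 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical example distributions (not taken from the course).
fair_coin = shannon_entropy([0.5, 0.5])    # maximally uncertain two-outcome case
biased_coin = shannon_entropy([0.9, 0.1])  # more predictable, so lower entropy
certain = shannon_entropy([1.0, 0.0])      # no uncertainty at all

print(fair_coin, biased_coin, certain)
```

The more predictable a distribution is, the lower its entropy, which is the sense in which entropy measures how much uncertainty an outcome resolves.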

2. Intro to Statistics

This class, Intro to Statistics, builds on probability theory to enable us to quantify our confidence about how distributions of data are related to one another.

Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of all of the essential statistical tests for assessing whether data are correlated with each other or sampled from different populations -- tests which frequently come in handy for critically evaluating the inputs and outputs of machine learning algorithms. You’ll also learn how to use regression to make predictions about the future based on training data.

The content covered in this class builds on the content of other classes in the Machine Learning Foundations series (linear algebra, calculus, and probability theory) and is itself foundational for the Optimization class.

Over the course of studying this topic, you'll:

Develop an understanding of what’s going on beneath the hood of predictive statistical models and machine learning algorithms, including those used for deep learning.
Hypothesize about and critically evaluate the inputs and outputs of machine learning algorithms using essential statistical tools such as the t-test, ANOVA, and R-squared.
Use historical data to predict the future with regression models that take advantage of frequentist statistical theory (for smaller data sets) and modern machine learning theory (for larger data sets), and understand why we may want to consider applying deep learning to a given problem.
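
To make one of the tools named above concrete, here is a minimal sketch of a two-sample t-statistic in its Welch (unequal-variance) form, using only the Python standard library. The function name `welch_t` and the two samples are hypothetical. The sketch stops at the statistic and does not compute a p-value; in practice a library routine such as SciPy's `scipy.stats.ttest_ind` with `equal_var=False` would typically be used for that.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t-statistic: difference in means over its estimated standard error."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error

# Hypothetical measurements from two groups (illustrative only).
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat = welch_t(group_a, group_b)
print(t_stat)
```

A t-statistic far from zero suggests that the two sample means differ by more than sampling noise alone would explain.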

DIFFICULTY LEVEL: BEGINNER

Instructor Bio:

Dr. Jon Krohn

Chief Data Scientist, Author of Deep Learning Illustrated | untapt

Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a Ph.D. in neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.

Course Outline

1: Introduction to Probability

What Probability Theory Is
A Brief History: Frequentists vs Bayesians
Applications of Probability to Machine Learning
Random Variables
Discrete vs Continuous Variables
Probability Mass and Probability Density Functions
Expected Value
Measures of Central Tendency: Mean, Median, and Mode
Quantiles: Quartiles, Deciles, and Percentiles
The Box-and-Whisker Plot
Measures of Dispersion: Variance, Standard Deviation, and Standard Error
Measures of Relatedness: Covariance and Correlation
Marginal and Conditional Probabilities
Independence and Conditional Independence

2: Distributions in Machine Learning

Uniform
Gaussian: Normal and Standard Normal
The Central Limit Theorem
Log-Normal
Exponential and Laplace
Binomial and Multinomial
Poisson
Mixture Distributions
Preprocessing Data for Model Input

3: Information Theory

What Information Theory Is
Self-Information
Nats, Bits and Shannons
Shannon and Differential Entropy
Kullback-Leibler Divergence
Cross-Entropy
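
The last three topics in this section are closely linked: for discrete distributions p and q, cross-entropy equals the entropy of p plus the Kullback-Leibler divergence from q to p. The sketch below checks that identity numerically. The function names and the two distributions are illustrative assumptions, values are in nats, and q is assumed to be nonzero wherever p is nonzero.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical "true" distribution p and model distribution q.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

lhs = cross_entropy(p, q)
rhs = entropy(p) + kl_divergence(p, q)
print(lhs, rhs)  # the two values agree up to floating-point error
```

Because the entropy of p does not depend on q, minimizing cross-entropy with respect to q is equivalent to minimizing the KL divergence.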

4: Frequentist Statistics

Frequentist vs Bayesian Statistics
Review of Relevant Probability Theory
z-scores and Outliers
p-values
Comparing Means with t-tests
Confidence Intervals
ANOVA: Analysis of Variance
Pearson Correlation Coefficient
R-Squared Coefficient of Determination
Correlation vs Causation
Correcting for Multiple Comparisons

5: Regression

Features: Independent vs Dependent Variables
Linear Regression to Predict Continuous Values
Fitting a Line to Points on a Cartesian Plane
Ordinary Least Squares
Logistic Regression to Predict Categories
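
As a rough sketch of fitting a line by ordinary least squares, the example below uses the closed-form solution for a single feature and reports R-squared for the fit. The helper names `fit_line` and `r_squared` and the data points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the sum of cross-deviations divided by the sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical, roughly linear data (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(slope, intercept, r2)
```

The closed form works here because there is a single feature; with many features, the same least-squares objective is usually solved with linear algebra or gradient-based optimization.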

6: Bayesian Statistics

(Deep) ML vs Frequentist Statistics
When to Use Bayesian Statistics
Prior Probabilities
Bayes’ Theorem
PyMC3 Notebook
Resources for Further Study of Probability and Statistics
