Foundations of Machine Learning: Mini Bootcamp by Dr. Jon Krohn

Foundations for Machine Learning

LINEAR ALGEBRA | CALCULUS | STATISTICS | DATA STRUCTURES

ON-DEMAND: Expert-led training with Jon Krohn

Author of Deep Learning Illustrated

Consisting of 14 on-demand training modules, this course provides a comprehensive overview of the subjects across mathematics, statistics, and computer science that underlie contemporary machine learning approaches, including deep learning and other artificial intelligence techniques.

Why Take this Course?

This course is for you if you use high-level software libraries (e.g., scikit-learn, Keras, TensorFlow, PyTorch) to train or deploy machine learning algorithms and would now like to understand the fundamentals underlying those abstractions, enabling you to expand your capabilities.

Instructor's Bio: Dr. Jon Krohn

Jon Krohn is Chief Data Scientist at the machine learning company Untap. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a Ph.D. in Neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.

Course Outline

1. Linear Algebra Course (3 modules)

Intro to Linear Algebra

Linear Algebra II: Matrix Operations

2. Calculus Course (4 modules)

Calculus I: Limits & Derivatives

Calculus II: Partial Derivatives & Integrals

3. Probability and Statistics Course (4 modules)

Probability and Information Theory

Intro to Statistics

4. Computer Science (3 modules)

Algorithms and Data Structures

Optimization

Linear Algebra

On-Demand Access

1. Data Structures for Algebra

What Linear Algebra Is
A Brief History of Algebra
Tensors
Scalars
Vectors and Vector Transposition
Norms and Unit Vectors
Basis, Orthogonal, and Orthonormal Vectors
Arrays in NumPy
Matrices
Tensors in TensorFlow and PyTorch

2. Common Tensor Operations

Tensor Transposition
Basic Tensor Arithmetic
Reduction
The Dot Product
Solving Linear Systems

3. Matrix Properties

The Frobenius Norm
Matrix Multiplication
Symmetric and Identity Matrices
Matrix Inversion
Diagonal Matrices
Orthogonal Matrices

4. Eigendecomposition

Eigenvectors
Eigenvalues
Matrix Determinants
Matrix Decomposition
Application of Eigendecomposition

5. Matrix Operations for Machine Learning

Singular Value Decomposition (SVD)
The Moore-Penrose Pseudoinverse
The Trace Operator
Principal Component Analysis (PCA): A Simple Machine Learning Algorithm
Resources for Further Study of Linear Algebra

Calculus

On-Demand Access

1. Limits

What Calculus Is
A Brief History of Calculus
The Method of Exhaustion

2. Computing Derivatives with Differentiation

The Delta Method
Basic Derivative Properties
The Power Rule
The Sum Rule
The Product Rule
The Quotient Rule
The Chain Rule

3. Automatic Differentiation

AutoDiff with PyTorch
AutoDiff with TensorFlow 2
Relating Differentiation to Machine Learning
Cost (or Loss) Functions
The Future: Differentiable Programming

4. Gradients Applied to Machine Learning

Partial Derivatives of Multivariate Functions
The Partial-Derivative Chain Rule
Cost (or Loss) Functions
Gradients
Gradient Descent
Backpropagation
Higher-Order Partial Derivatives

5. Integrals

Binary Classification
The Confusion Matrix
The Receiver-Operating Characteristic (ROC) Curve
Calculating Integrals Manually
Numeric Integration with Python
Finding the Area Under the ROC Curve
Resources for Further Study of Calculus

Probability and Statistics

On-Demand Access

1. Introduction to Probability

What Probability Theory Is
A Brief History: Frequentists vs Bayesians
Applications of Probability to Machine Learning
Random Variables
Discrete vs Continuous Variables
Probability Mass and Probability Density Functions
Expected Value
Measures of Central Tendency: Mean, Median, and Mode
Quantiles: Quartiles, Deciles, and Percentiles
The Box-and-Whisker Plot
Measures of Dispersion: Variance, Standard Deviation, and Standard Error
Measures of Relatedness: Covariance and Correlation
Marginal and Conditional Probabilities
Independence and Conditional Independence

2. Distributions in Machine Learning

Uniform
Gaussian: Normal and Standard Normal
The Central Limit Theorem
Log-Normal
Binomial and Multinomial
Poisson
Mixture Distributions
Preprocessing Data for Model Input

3. Information Theory

What Information Theory Is
Self-Information
Nats, Bits and Shannons
Shannon and Differential Entropy
Kullback-Leibler Divergence
Cross-Entropy

4. Frequentist Statistics

Frequentist vs Bayesian Statistics
Review of Relevant Probability Theory
z-scores and Outliers
p-values
Comparing Means with t-tests
Confidence Intervals
ANOVA: Analysis of Variance
Pearson Correlation Coefficient
R-Squared Coefficient of Determination
Correlation vs Causation
Correcting for Multiple Comparisons

5. Regression

Features: Independent vs Dependent Variables
Linear Regression to Predict Continuous Values
Fitting a Line to Points on a Cartesian Plane
Ordinary Least Squares
Logistic Regression to Predict Categories
(Deep) ML vs Frequentist Statistics

6. Bayesian Statistics

When to Use Bayesian Statistics
Prior Probabilities
Bayes’ Theorem
PyMC3 Notebook
Resources for Further Study of Probability and Statistics

Computer Science

On-Demand Access

1. Introduction to Data Structures and Algorithms

A Brief History of Data
A Brief History of Algorithms
“Big O” Notation for Time and Space Complexity

2. Lists and Dictionaries

List-Based Data Structures: Arrays, Linked Lists, Stacks, Queues, and Deques
Searching and Sorting: Binary, Bubble, Merge, and Quick
Set-Based Data Structures: Maps and Dictionaries
Hashing: Hash Tables, Load Factors, and Hash Maps

3. Trees and Graphs

Trees: Decision Trees, Random Forests, and Gradient-Boosting (XGBoost)
Graphs: Terminology, Directed Acyclic Graphs (DAGs)
Resources for Further Study of Data Structures & Algorithms

4. The Machine Learning Approach to Optimization

The Statistical Approach to Regression: Ordinary Least Squares
When Statistical Approaches to Optimization Break Down
The Machine Learning Solution

5. Gradient Descent

Objective Functions
Cost / Loss / Error Functions
Minimizing Cost with Gradient Descent
Learning Rate
Critical Points, incl. Saddle Points
Gradient Descent from Scratch with PyTorch
The Global Minimum and Local Minima
Mini-Batches and Stochastic Gradient Descent (SGD)
Learning Rate Scheduling
Maximizing Reward with Gradient Ascent

6. Fancy Deep Learning Optimizers

A Layer of Artificial Neurons in PyTorch
Jacobian Matrices
Hessian Matrices and Second-Order Optimization
Momentum
Nesterov Momentum
AdaGrad
AdaDelta
RMSProp
Adam
Nadam
Training a Deep Neural Net
Resources for Further Study

Prerequisites

Programming: All code demos will be in Python, so experience with it or another object-oriented programming language would be helpful for following along with the code examples.

Mathematics: Familiarity with secondary school-level mathematics will make the class easier to follow. If you are comfortable dealing with quantitative information, such as understanding charts and rearranging simple equations, then you should be well prepared to follow along with all the mathematics.
