PAST LIVE TRAINING: May 11th: Hands-on Intro to Unsupervised Learning

Live training with Ankur Patel starts on May 11th at 12 PM (ET)

Training duration: 4 hours (Hands-on)

Price with 30% discount: $147.00 (regular price: $210.00)

Sign up for a Premium Plan and get an additional 10-35% discount on live training

Instructor Bio:

Ankur Patel

Co-founder and Head of Data | Glean

Ankur Patel is the co-founder and Head of Data at Glean. Glean uses NLP to extract data from invoices and generate vendor spend intelligence for clients. Ankur is an applied machine learning specialist in both unsupervised learning and natural language processing, and he is the author of Hands-on Unsupervised Learning Using Python: How to Build Applied Machine Learning Solutions from Unlabeled Data and Applied Natural Language Processing in the Enterprise: Teaching Machines to Read, Write, and Understand. Previously, Ankur led teams at 7Park Data, ThetaRay, and R-Squared Macro and began his career at Bridgewater Associates and J.P. Morgan. He is a graduate of Princeton University and currently resides in New York City.

DIFFICULTY LEVEL: INTERMEDIATE

Course Abstract

In the first part, we will explore dimensionality reduction, one of the core concepts in unsupervised learning. Dimensionality reduction serves two main purposes. First, it reduces the computational complexity of working with very large datasets. Second, it removes irrelevant information from a dataset, surfacing the information that matters most. We will use dimensionality reduction algorithms to build an anomaly detection system; specifically, we will build a system to detect credit card fraud without using any labels. Anomaly detection systems are widely used in industry today to detect rare events of all types, such as fraud (e.g., credit card, wire, cyber, insurance), crime (e.g., hacking, money laundering, and drug, arms, and human trafficking), and adverse events (e.g., financial market meltdowns, cardiac events, and spikes in online traffic).

In the second part, we will explore clustering, another core concept in unsupervised learning. Clustering segments entities (e.g., users) into distinct, homogeneous groups such that members of a group are very similar to one another but distinctly different from members of other groups. This segmentation requires no labels whatsoever; instead, it separates entities based on behavior. For example, via clustering, online shoppers could be grouped into budget-conscious shoppers, high-end shoppers, frequent shoppers, seasonal shoppers, technophiles, audiophiles, sneakerheads, back-to-school shoppers, young parents, senior citizens, and millennials. To perform clustering well, good feature engineering is required. In this course, we will explore loan applications, perform feature engineering, and segment users based on their potential creditworthiness. We will also explore how clustering enables efficient labeling, turning unlabeled problems into labeled ones and opening up the realm of semi-supervised learning.

Course Outline

Lesson 1: Introduction to Unsupervised Learning

How unsupervised learning fits into the machine learning ecosystem
Common problems in machine learning: insufficient labeled data, curse of dimensionality, and outliers

Lesson 2: Introduction to Dimensionality Reduction

Motivation for dimensionality reduction: reduce computational complexity of large data, remove non-relevant information and surface salient information, perform anomaly detection, perform clustering
Linear dimensionality reduction algorithms
Non-linear dimensionality reduction algorithms

Lesson 3: Application: Anomaly Detection

Introduce use case: credit card fraud detection
Explore and prepare the data
Define evaluation function
Apply linear dimensionality reduction and evaluate results
Apply non-linear dimensionality reduction and evaluate results
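
The Lesson 3 steps can be illustrated with a minimal sketch. It is not the course's own code: it assumes PCA as the linear dimensionality reduction algorithm, reconstruction error as the anomaly score, and average precision as the evaluation function. The data is synthetic (a stand-in for real transactions), and the helper name `anomaly_scores` is hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import average_precision_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for transaction data (an assumption, not course data):
# normal points lie near a 2-D plane embedded in 10 dimensions; anomalies do not.
n_normal, n_anomaly, n_features = 1000, 20, 10
latent = rng.normal(size=(n_normal, 2))
mixing = rng.normal(size=(2, n_features))
normal = latent @ mixing + 0.05 * rng.normal(size=(n_normal, n_features))
anomalies = rng.normal(scale=3.0, size=(n_anomaly, n_features))

X = np.vstack([normal, anomalies])
# Labels are used only to evaluate the result, never to fit the model.
y = np.r_[np.zeros(n_normal), np.ones(n_anomaly)]


def anomaly_scores(original, reconstructed):
    """Per-row squared reconstruction error, min-max scaled to [0, 1]."""
    err = np.sum((original - reconstructed) ** 2, axis=1)
    return (err - err.min()) / (err.max() - err.min())


# Project onto 2 principal components, then map back to the original space.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2, random_state=0)
X_reconstructed = pca.inverse_transform(pca.fit_transform(X_scaled))

# Points that PCA reconstructs poorly receive high anomaly scores.
scores = anomaly_scores(X_scaled, X_reconstructed)
avg_precision = average_precision_score(y, scores)
print(f"Average precision: {avg_precision:.3f}")
```

The idea is that a low-dimensional projection captures the structure shared by most rows, so rows that do not follow that structure are reconstructed badly. A non-linear algorithm could be swapped in at the projection step, provided it offers a way to map points back to the original space.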

Lesson 4: Introduction to Clustering

Why clustering is needed: the real-world motivation
How to find patterns in data with zero or few labels
How to efficiently label data when only a few labels are available
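
One common way to label data efficiently with clustering is sketched below. It is an illustration under assumptions, not necessarily the approach taught in the course: cluster the data with K-Means, hand-label only the sample nearest each centroid, and propagate that label to the rest of the cluster. scikit-learn's bundled digits dataset stands in for an unlabeled dataset, and its true labels simulate the manual labeling step.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

# y is treated as hidden, except for the simulated hand-labeling step
# and the final accuracy check.
X, y = load_digits(return_X_y=True)

n_clusters = 50  # labeling budget: one hand label per cluster (an assumption)
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
distances = kmeans.fit_transform(X)  # distance of each sample to each centroid

# Representative sample for each cluster: the point closest to its centroid.
representative_idx = np.argmin(distances, axis=0)

# Simulate hand-labeling only those 50 representatives.
representative_labels = y[representative_idx]

# Propagate each representative's label to every member of its cluster.
propagated = representative_labels[kmeans.labels_]

accuracy = np.mean(propagated == y)
print(f"Labeled {n_clusters} of {len(y)} samples by hand")
print(f"Accuracy of propagated labels: {accuracy:.3f}")
```

The propagated labels can then be used to train a supervised model, which is what turns an unlabeled problem into a semi-supervised one.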

Lesson 5: Overview of Clustering Algorithms

K-Means
Hierarchical clustering
DBSCAN
HDBSCAN
Apply to MNIST and Fashion MNIST datasets
Visualize clusters and evaluate results
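
As a rough illustration of how several of the algorithms in Lesson 5 can be compared, the sketch below runs K-Means, hierarchical (agglomerative) clustering, and DBSCAN from scikit-learn on synthetic blobs rather than MNIST. The dataset, the hyperparameters, and the choice of adjusted Rand index as the evaluation metric are assumptions made for this example. HDBSCAN is omitted to keep the dependencies minimal.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Three well-separated synthetic groups; y_true is used only for evaluation.
X, y_true = make_blobs(
    n_samples=600,
    centers=[[-6, -6], [0, 6], [6, -6]],
    cluster_std=0.8,
    random_state=0,
)

models = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "hierarchical": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    # DBSCAN infers the number of clusters from density; -1 marks noise points.
    "dbscan": DBSCAN(eps=1.0, min_samples=5),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # Adjusted Rand index: 1.0 means the clusters match the true groups exactly.
    results[name] = adjusted_rand_score(y_true, labels)
    print(f"{name}: adjusted Rand index = {results[name]:.3f}")
```

K-Means and hierarchical clustering need the number of clusters up front, while DBSCAN needs a neighborhood radius (`eps`) and a density threshold (`min_samples`) instead. On real data these settings usually need tuning.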

Lesson 6: Application: Group Segmentation

Introduce use case: loan applications
Explore and prepare the data
Define evaluation function
Apply clustering algorithms and evaluate results

What knowledge and skills should you have?

Python coding experience
Familiarity with pandas, numpy, and scikit-learn
Understanding of basic machine learning concepts, including supervised learning
Experience with deep learning and frameworks such as TensorFlow or PyTorch is a plus

What is included in your ticket?

Access to the live training and a Q&A session with the instructor
Access to the on-demand recording
Certificate of completion
