Live training with Ankur Patel starts on October 6th at 12 PM (EST)

Training duration: 3 hours (Hands-on)

Regular Price: $210.00

Sign up for the Premium Plan to get an additional 10-35% discount on live training

Instructor Bio:

Ankur Patel

Co-founder and Head of Data | Glean

Ankur Patel is the co-founder and Head of Data at Glean, which uses NLP to extract data from invoices and generate vendor spend intelligence for clients. Ankur is an applied machine learning specialist in both unsupervised learning and natural language processing. He is the author of two books: “Hands-on Unsupervised Learning Using Python: How to Build Applied Machine Learning Solutions from Unlabeled Data” and “Applied Natural Language Processing in the Enterprise: Teaching Machines to Read, Write, and Understand.” Previously, Ankur led teams at 7Park Data, ThetaRay, and R-Squared Macro, and he began his career at Bridgewater Associates and J.P. Morgan. He is a graduate of Princeton University and currently resides in New York City.

Learning Objectives

  • Understand NLP from first principles, progressing from basic fundamentals to state-of-the-art NLP

  • Use three popular modern open source NLP libraries (spaCy, fast.ai, and Hugging Face) to build NLP applications

  • Learn about pretrained language models, transfer learning, fine-tuning, word embeddings, recurrent neural networks, long short-term memory, gated recurrent units, attention mechanisms, and transformers in detail

DIFFICULTY LEVEL: INTERMEDIATE

Course Outline

1: Introduction to NLP

  • What NLP is and how it has evolved over the past 70 years
  • Popular NLP applications today
  • Introduce first modern open-source NLP software library: spaCy
  • Perform basic NLP tasks using spaCy: tokenization, part-of-speech tagging, dependency parsing, chunking, lemmatization, and stemming


2: State-of-the-Art (SOTA) NLP

  • Introduce two more modern open-source NLP software libraries: fast.ai and Hugging Face
  • Discuss attention mechanisms, transformers, pretrained language models, transfer learning, and fine-tuning
  • Build NLP application (IMDb movie review sentiment analysis) using fast.ai


3: Progress in NLP over the past 10 years

  • The path to NLP’s watershed “ImageNet” moment in 2018
  • Word embeddings: one-hot encoding, word2vec, GloVe, fastText, and context-aware pretrained word embeddings
  • Sequential models: vanilla recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and gated recurrent units (GRUs)
  • Attention mechanisms and transformers
  • ULMFiT, ELMo, BERT, BERTology, GPT-1, GPT-2, and GPT-3
  • Introduction to common NLP tasks via Hugging Face: sequence classification, question answering, language modeling, text generation, named entity recognition, summarization, and translation


4: Named entity recognition and text classification applications

  • Explore the dataset: AG News
  • Application #1: Named Entity Recognition (NER)
  • Perform inference using out-of-the-box spaCy NER model
  • Annotate data using Prodigy
  • Develop custom named entity recognition model using spaCy
  • Compare custom NER model against the out-of-the-box spaCy NER model
  • Application #2: Text Classification
  • Annotate data using Prodigy
  • Develop text classification model using spaCy

Course Abstract

This hands-on course is organized into four lessons.

In lesson one, we will provide an introduction to NLP, reviewing its evolution over the past 70 years. We will explain why NLP matters and how it powers many of the most popular applications we use every day. We will also perform basic NLP tasks using one of the most popular open-source NLP libraries today: spaCy.

In lesson two, we will introduce two more popular open-source NLP libraries (fast.ai and Hugging Face) and perform state-of-the-art NLP. We will develop a sentiment analysis model for IMDb movie reviews. We will also cover modern NLP concepts such as attention mechanisms, transformers, pretrained language models, transfer learning, and fine-tuning.

In lesson three, we will retrace how NLP advanced over the last decade and experienced its breakout moment in 2018. Since 2018, NLP has soared in popularity among companies and has become a mainstream topic of interest. After we cover the theory, we will discuss modern NLP tasks such as sequence classification, question answering, language modeling, text generation, named entity recognition, summarization, and translation.

In lesson four, we will put this theory into practice and develop our own named entity recognition and text classification models using spaCy, including annotating our data using an annotation platform called Prodigy. We will draw on what we’ve learned to perform transfer learning and fine-tuning, and we will compare the fine-tuned model’s performance against an out-of-the-box named entity recognition model.

By the end of this course, you should have a good understanding of the fundamental concepts in NLP, both in theory and in practice.

What background knowledge should you have?

  • Python coding experience

  • Familiarity with pandas, numpy, and scikit-learn

  • Experience with deep learning and frameworks such as TensorFlow or PyTorch is a plus

  • Understanding of basic machine learning concepts, including supervised learning

What is included in your ticket?

  • Access to the live training and a Q&A session with the instructor

  • Access to the on-demand recording

  • Certificate of completion
