Live training with Boris Paskhaver starts on April 6th at 1 PM (ET)
Training duration: 4 hours (Hands-on)
Instructor Bio:
Boris Paskhaver
Software Engineer | Stride Consulting
Boris Paskhaver is a full-stack web developer based in New York City with experience building apps in React / Redux and Ruby on Rails. His favorite part of programming is the never-ending sense that there's always something new to master: a secret language feature, a popular design pattern, an emerging library or, most importantly, a different way of looking at a problem.
By the end of the course, participants will be able to:
- Watch each concept demonstrated, practice it, and apply it to a new dataset.
- Have a solid grasp of the capabilities of the pandas library.
- Perform various data manipulations: sorting, joining, cleaning, aggregating, deduping, and more.
Course Abstract
This tutorial offers a comprehensive introduction to the powerful pandas library for data analysis built on top of the Python programming language. Pandas represents a great step forward for graphical spreadsheet users looking to grow their data manipulation skills. I like to call it "Excel on steroids".
By completing this workshop, you'll have a strong foundation for using pandas in your day-to-day data analysis work. We'll start out with the basics -- importing datasets, selecting rows and columns, filtering rows by criteria -- and progress to advanced concepts like grouping values, joining multiple datasets together, and cleaning text.
Students will be exposed to diverse datasets across different disciplines -- sports, finance, entertainment, and more. The training is open to all industries and is aimed at beginners; basic Python knowledge is preferred but not required.
Course Outline
Lesson 1. Foundations of pandas
- Discover the 1-dimensional Series and the 2-dimensional DataFrame, the two core data structures in pandas.
- Learn how to sort values across one or more columns, identify missing values, remove duplicates, count occurrences of values, filter rows based on one or more criteria, and more.
- At the end of this lesson, you'll have knowledge of the most popular features of pandas.
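The operations listed for this lesson can be sketched in a few lines. This is only an illustrative example, not course material: the `players` table and its column names are invented for the purpose, and the workshop itself may use different datasets and methods.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
players = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Ana", "Cho", "Dev"],
        "team": ["Red", "Blue", "Red", "Blue", None],
        "points": [31, 18, 31, 25, 12],
    }
)

# A single column of a DataFrame is a 1-dimensional Series.
points = players["points"]

# Sort by team (A to Z), then by points from highest to lowest.
sorted_players = players.sort_values(["team", "points"], ascending=[True, False])

# Count the missing values in each column.
missing_per_column = players.isna().sum()

# Drop rows that exactly duplicate an earlier row.
unique_players = players.drop_duplicates()

# Count how often each team occurs (missing values are skipped by default).
team_counts = unique_players["team"].value_counts()

# Keep only the rows that satisfy two criteria at once.
blue_high_scorers = unique_players[
    (unique_players["team"] == "Blue") & (unique_players["points"] > 20)
]
```

Each step returns a new object and leaves `players` unchanged, which makes it easy to inspect intermediate results while learning.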
Lesson 2. Working with Data of Different Types
- Data can come in a variety of formats (and be messy to boot)! In this lesson, we'll explore how to convert columns from one data type to another.
- We'll optimize our dataset to reduce memory consumption.
- We'll discover how to clean messy text data and how to extract date-time information from text.
- Finally, we'll have a chance to review all concepts from the previous lesson with new datasets.
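As a rough sketch of the ideas in this lesson, the example below converts text columns to more suitable types and compares memory use before and after. The data, column names, and the choice of the `category` type are assumptions made for illustration, not a description of the workshop's own exercises.

```python
import pandas as pd

# Hypothetical raw data in which every column arrives as text.
n = 500
raw = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB"] * n,
        "price": ["10.5", "20.0"] * n,
        "traded_on": ["2021-04-06", "2021-04-07"] * n,
    }
)

# deep=True also counts the memory held by the strings themselves.
memory_before = raw.memory_usage(deep=True).sum()

typed = raw.copy()
# Numeric text becomes floating-point numbers.
typed["price"] = typed["price"].astype("float64")
# A column with few distinct values can be stored as a category.
typed["ticker"] = typed["ticker"].astype("category")
# Date text becomes real date-time values.
typed["traded_on"] = pd.to_datetime(typed["traded_on"], format="%Y-%m-%d")

memory_after = typed.memory_usage(deep=True).sum()

# Date-time columns expose extra information through the .dt accessor.
weekday = typed["traded_on"].dt.day_name()
```

The savings from `category` depend on how often values repeat; on a column of mostly unique strings it may not help.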
Lesson 3. Working with Text Data
- Real-world text data can be riddled with issues -- whitespace, letter casings, inconsistent formats, and more.
- In this lesson, we'll learn how to clean text data in pandas.
- We'll apply text operations like splitting, replacing, and joining to whole columns of data.
- We'll conclude with a quick discussion on regular expressions, which allow us to define search patterns for text.
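The text operations named in this lesson might look like the following minimal sketch. The sample names are invented, and the particular cleaning steps are just one plausible sequence, not necessarily the one taught in the session.

```python
import pandas as pd

# Hypothetical messy names: stray whitespace, mixed casing, inconsistent separators.
messy = pd.Series(["  alice SMITH ", "Bob  jones", "CAROL_lee  "])

cleaned = (
    messy.str.strip()                      # remove leading and trailing whitespace
    .str.replace("_", " ", regex=False)    # replace a literal underscore with a space
    .str.replace(r"\s+", " ", regex=True)  # collapse runs of whitespace (regular expression)
    .str.title()                           # capitalize the first letter of each word
)

# Split each name into separate columns.
parts = cleaned.str.split(" ", expand=True)
parts.columns = ["first", "last"]

# Join the pieces back together in a different order.
last_first = parts["last"].str.cat(parts["first"], sep=", ")

# Use a regular expression as a search pattern: names that start with a vowel.
starts_with_vowel = cleaned.str.contains(r"^[AEIOU]", regex=True)
```

Every method on the `.str` accessor applies to the whole column at once, so no explicit loop is needed.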
Lesson 4. Aggregating and Joining Datasets
- In this section, we'll learn how to merge data across multiple datasets.
- We'll explore the pandas equivalent of common SQL operations like inner joins, outer joins, left joins, and right joins.
- We'll also introduce the GroupBy object for grouping rows by shared values across one or more columns.
- Finally, we'll walk through common reshaping and aggregation operations like pivoting, melting, stacking, unstacking, and more.
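To make the SQL comparison concrete, here is a small sketch of joins, grouping, and reshaping. The `orders` and `customers` tables are hypothetical and exist only for this example; the course may present these operations differently.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "customer_id": [10, 20, 10, 40],
        "amount": [50.0, 20.0, 30.0, 70.0],
    }
)
customers = pd.DataFrame(
    {
        "customer_id": [10, 20, 30],
        "region": ["East", "West", "East"],
    }
)

# The `how` argument selects the join type, much like SQL.
inner = orders.merge(customers, on="customer_id", how="inner")  # matches only
left = orders.merge(customers, on="customer_id", how="left")    # every order
right = orders.merge(customers, on="customer_id", how="right")  # every customer
outer = orders.merge(customers, on="customer_id", how="outer")  # everything

# Group rows that share a region and total their amounts.
totals = inner.groupby("region")["amount"].sum()

# Pivot to a wide layout: one row per region, one column per customer.
wide = inner.pivot_table(
    index="region", columns="customer_id", values="amount", aggfunc="sum"
)

# Melt the wide layout back into a long one.
long = wide.reset_index().melt(
    id_vars="region", var_name="customer_id", value_name="amount"
)
```

Comparing the row counts of `inner`, `left`, `right`, and `outer` is a quick way to check which keys were unmatched on each side.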
Which knowledge and skills should you have?
- Basic/intermediate experience with spreadsheet software (Excel, Google Sheets, etc.)
- Basic experience with the Python programming language
What is included in your ticket?
- Access to the live training and Q&A session with the instructor
- Access to the on-demand recording
- Certificate of completion