Course Abstract

Training duration: 4 hours (Hands-on)

Knowing just enough Pandas can mean the difference between merely exploring data and understanding it at a fundamental level. Knowing how to transform data, use multi-indexes, and customize Pandas' visual aspects in Jupyter Lab gives you the power to approach everyday problems confidently. In this session, you will build on fundamentals you already know to handle a wider array of data problems. This session is for anyone who already has a solid foundation in Pandas and wants to extend their knowledge with more advanced features, including aggregation, multi-indexing, designing transformations, and customizing DataFrame output.

DIFFICULTY LEVEL: ADVANCED

Learning Objectives

  • Know how to configure Python environments

  • Understand the fundamentals of Pandas

  • Understand when to use built-in methods vs. ".apply()" for data transformation

  • Be familiar with common multi-index use cases

  • Understand how to select data with multi-indexes

  • Describe which aspects of a DataFrame can be customized

  • Know how to write callback functions on DataFrame aesthetics

Instructor

Data Science Consultant | Yerrington Consulting

David Yerrington

At the age of 8, David began learning the BASIC programming language while living in Alaska's outskirts. He studied music performance but began his career building a small software and consulting company in the late '90s. David's career spans almost 20 years, including work at several startups as a lead engineer building scalable data services from prototype to production. During his time at Sony/Gracenote, he led the implementation of prototypes featured in the Consumer Electronics Show, spanning recommendation, content classification, and profiling projects. David also held roles as a data scientist at a YC-backed dating app company and at an analytics startup, researching and building scalable recommendation pipelines. While working at General Assembly as a Lead Global Data Science Instructor, David helped architect the first significant versions of their data science immersive curriculum. He also piloted many of the hybrid Data Science Immersive programs still taught today. Currently, David consults and contracts full-time for various clients on projects ranging from NLP, recommendation, and big data to professional training for large and small teams. When not working, David enjoys playing the cello in orchestras and in a small group that performs classic video game covers.

Course Outline

Module 1:  Introductions, Configuration, and Review

- Configure your Python environments

- Review the fundamentals of Pandas


Module 2:  Aggregations and ".apply()"

- Perform simple aggregations

- Review: DataFrame Axis

- Understand when to use built-in methods vs. ".apply()" for data transformation
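
As a rough illustration of the kind of comparison this module covers, the sketch below computes the same column once with built-in column arithmetic and once with ".apply()", then runs a simple aggregation. The data, column names, and the label_order helper are hypothetical and invented for this example; they are not taken from the course materials.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "price": [2.0, 3.5, 4.0, 1.5],
    "quantity": [10, 4, 3, 8],
})

# Built-in (vectorized) arithmetic works on whole columns at once.
df["revenue"] = df["price"] * df["quantity"]

# The same values via .apply(axis=1), which calls a Python function per row.
revenue_via_apply = df.apply(
    lambda row: row["price"] * row["quantity"], axis=1
)

# A simple aggregation with built-ins: total revenue per store.
revenue_by_store = df.groupby("store")["revenue"].sum()


def label_order(row):
    """Hypothetical row-wise rule used to show a custom .apply() callback."""
    if row["quantity"] >= 8 and row["price"] < 3:
        return "bulk-discount"
    return "standard"


# .apply() accepts arbitrary Python logic, at the cost of a per-row call.
df["label"] = df.apply(label_order, axis=1)
```

Both approaches give the same revenue values here. Built-in operations are usually the faster choice on large DataFrames, and ".apply()" tends to be most useful when the logic is awkward to express with existing methods.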


Module 3:  Multi-indexing

- Describe when multi-indexing makes sense

- Be familiar with common multi-index use cases

- Understand how to select data with multi-indexes
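
The sketch below shows a few common ways to select data from a two-level index, assuming a small sales table. The data and the names (region, year, sales) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "year": [2022, 2023, 2022, 2023],
    "sales": [100, 120, 90, 150],
})

# Build a two-level index; sorting it keeps slice-based selection reliable.
indexed = df.set_index(["region", "year"]).sort_index()

# Select every row under one outer-level label (result is indexed by year).
east = indexed.loc["East"]

# Select a single value by giving the full index tuple and a column name.
west_2023 = indexed.loc[("West", 2023), "sales"]

# Take a cross-section on the inner level (result is indexed by region).
year_2022 = indexed.xs(2022, level="year")

# Slice across the outer level while fixing the inner level.
idx = pd.IndexSlice
all_2023 = indexed.loc[idx[:, 2023], :]
```

Note that ".xs()" drops the selected level from the result, while the IndexSlice form keeps both levels in the index.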


Module 4:  Customizing DataFrame Output

- Describe which aspects of a DataFrame can be customized

- Know how to write callback functions on DataFrame aesthetics
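
One way such a callback might look, as a minimal sketch: a function that receives a column and returns one CSS string per cell, passed to the pandas Styler. The data, the threshold, and the function names highlight_below and style_scores are hypothetical, and the example assumes the optional jinja2 dependency that the Styler relies on is installed.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {"math": [82, 45, 91], "art": [60, 77, 38]},
    index=["Ana", "Ben", "Cy"],
)


def highlight_below(column, threshold=50):
    """Callback: return one CSS string per cell in the given column."""
    return ["color: red" if value < threshold else "" for value in column]


def style_scores(frame):
    """Build a Styler that applies the callback to each column (axis=0)."""
    return (
        frame.style
        .apply(highlight_below, threshold=50, axis=0)
        .format("{:.0f}")
    )

# In Jupyter Lab, leaving style_scores(df) as the last expression in a cell
# renders the styled table; the underlying DataFrame is not modified.
```

The callback only decides the CSS for each cell, so it can be checked on its own by calling it with a single column before attaching it to a Styler.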

Background knowledge

  • Grouping

  • Selecting rows or columns of DataFrames with .loc and .iloc

  • Boolean Indexing

  • Git and GitHub

  • Installing Python environments

  • Jupyter Lab