The PyData ecosystem has grown to millions of data science users, who appreciate its ease of use, consistent syntax, and breadth of features. Traditionally, PyData frameworks ran only on CPUs, making it difficult for users to take advantage of the increasingly powerful GPUs that have already revolutionized deep learning and related fields. In this session, we'll introduce RAPIDS, an open source framework that brings transparent GPU backends to popular Python APIs, such as those from pandas, scikit-learn, and NetworkX. We'll show how you can port a huge range of existing workloads to GPU in a matter of minutes and get speedups on the order of 40x or more on common tasks.

The session will emphasize both data preparation (ETL) and machine learning operations, with a hands-on demonstration of porting a typical workflow from CPU to GPU and measuring the speedup. We’ll go into more detail on real-world applications that take advantage of these speed improvements, including hyperparameter optimization for machine learning models, single-cell genomics analysis, and applications in finance. For large-data users, we’ll discuss some of the options for scaling RAPIDS to multiple GPUs or multiple nodes, emphasizing the tight integration with the Dask ecosystem.
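As a rough illustration of what porting a workflow and measuring the speedup can look like, here is a minimal sketch rather than the workshop's actual material. It assumes cuDF (the RAPIDS DataFrame library) is installed and falls back to pandas when it is not, so the same code runs on either backend. The helper `mean_by_key`, the column names, and the synthetic data are hypothetical and exist only for this example.

```python
import time

import numpy as np
import pandas as pd

try:
    # cuDF is the RAPIDS GPU DataFrame library; it needs an NVIDIA GPU.
    import cudf as xdf
    BACKEND = "cudf"
except Exception:
    # Assumption for this sketch: fall back to pandas so it still runs on CPU.
    xdf = pd
    BACKEND = "pandas"


def mean_by_key(df_lib, data):
    """Run the same groupby-mean with whichever DataFrame library is passed in."""
    df = df_lib.DataFrame(data)
    return df.groupby("key")["value"].mean()


# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
data = {
    "key": rng.integers(0, 100, size=1_000_000),
    "value": rng.random(1_000_000),
}

# Time the pandas (CPU) version.
start = time.perf_counter()
cpu_result = mean_by_key(pd, data)
cpu_seconds = time.perf_counter() - start

# Time the same code on the selected backend (cuDF if available).
start = time.perf_counter()
other_result = mean_by_key(xdf, data)
other_seconds = time.perf_counter() - start

print(f"pandas: {cpu_seconds:.3f}s, {BACKEND}: {other_seconds:.3f}s")
```

The only line that changes between backends is the import. Note that a first GPU call typically includes one-time startup cost and the host-to-device copy, so a fair benchmark would warm up first and use data large enough to amortize the transfer.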



Workshop Overview

  • 1. ODSC West 2020: GPU-accelerated Data Science with RAPIDS

    • Workshop Overview and Author Bio

    • Before you get started: Prerequisites and Resources

    • GPU-accelerated Data Science with RAPIDS

Instructor Bios:

Director, GPU-accelerated machine learning | NVIDIA

John Zedlewski

John Zedlewski is the director of GPU-accelerated machine learning on the NVIDIA RAPIDS team. Previously, he worked on deep learning for self-driving cars at NVIDIA, deep learning for radiology at Enlitic, and machine learning for structured healthcare data at Castlight. He has an MA/ABD in economics from Harvard with a focus in computational econometrics and an AB in computer science from Princeton.

Sr. Data Scientist & Engineer | NVIDIA

Corey Nolet

Corey Nolet is a Data Scientist & Senior Engineer on the RAPIDS cuML team at NVIDIA, where he focuses on building and scaling machine learning algorithms to support extreme data loads at light speed. Prior to working at NVIDIA, Corey spent over a decade building massive-scale analytics and data science platforms for HPC environments in the defense industry. Corey holds BS and MS degrees in Computer Science and is pursuing his PhD in the same discipline, focused on scaling machine learning algorithms in distributed architectures. Corey has a passion for using data to make better sense of the world.