Course curriculum

  • 1

    Machine Learning

    • Data Science and Machine Learning in the Cloud for Cloud Novices by Joy Payton

    • Training and Operationalizing Interpretable Machine Learning Models by Francesca Lazzeri, PhD

    • Finding Correlated Trends across multiple Data Sets using Matrix Factorization by Aedin Culhane, PhD

    • Echo State Networks for Time-Series Data by Teal Guidici, PhD

    • Looking from Above: Object Detection and Other Computer Vision Tasks on Satellite Imagery by Xiaoyong Zhu

    • Pipelining in Python with Snakemake with Biological Applications by Laura A. Seaman, PhD

    • Interpreting and Explaining XGBoost Models by Brian Lucena, PhD

    • Uplift Modeling Tutorial: Predictive and Prescriptive Analytics by Victor Lo, PhD

    • A Data Science Playbook for Explainable AI - Navigating Predictive and Interpretable Models by Joshua Poduska

    • Predictive Maintenance: Zero to Deployment in Manufacturing by Nagdev Amruthnath, PhD

    • AI / Machine Learning Driven Improvement of Demand Forecasts by Prabhakar Narasimhadevara

    • Cloud AI Services: What They are and How to Use Them by Karl Weinmeister

    • Delivering on the Promise of AI in Precision Medicine Oncology by John Mercer

    • Towards a Zero-One Law for Column Subset Selection by David P. Woodruff, PhD

    • Missing Data in Supervised Machine Learning by Andras Zsom, PhD

    • Machine Learning and Artificial Intelligence in 2020: Recent Trends, Technologies, and Challenges by Sebastian Raschka, PhD

    • Planning my Summer Vacation Using Python, Machine Learning and Cloud Services by Brendan Tierney

    • Guided Labeling: Human-in-the-Loop Label Generation with Active Learning and Weak Supervision by Paolo Tamagnini

    • Explainable AI for Training with Weakly Annotated Data by Evan Schwab, PhD

    • Alternatives to Reinforcement Learning for Real World Problems by Byron Galbraith, PhD

    • Algorithms with Predictions by Michael Mitzenmacher, PhD

    • Credit Models and Binning Variables Are Winning and I'm Keeping Score! by Aric LaBarr, PhD

    • Fighting Customer Churn With Data by Carl Gold, PhD

    • Simplify and Scale Data Engineering Pipelines with Open Source Delta Lake by Joshua Cook, Emma Freeman, Jody Soeiro de Faria

    • The What, Why, and How of Weighting by Eric Hart, PhD

    • Automated Feature Engineering for Customer Journey Event Prediction by Srinivas Chilukuri

    • Target Leakage in Machine Learning by Yuriy Guts

    • End to End Modeling and Machine Learning by Jordan Bakerman, PhD & Ari Zitin

    • Deciphering the Black Box: Latest Tools and Techniques for Interpretability by Rajiv Shah, PhD

    • Advances in Julia for Data Science and ML by Jeff Bezanson, PhD

    • Sports Analytics - Leveraging Raw GPS Data for Optimizing Soccer Players' Performance by Christopher Connelly

    • Pocket AI and IoT: Turn Your Phone Into a Smart Fitness Tracker by Louvere Walker-Hannon, Maria Gavilan-Alfonso, Vaidehi Venkatesan

    • Biomarker, Sleep, and Activity Patterns Data from a Web-Based Nutrition Platform for Healthy Individuals: Insights for Personalized Recommendations

    • Machine Learning in R Part I: Penalized Regression and Boosted Trees by Jared Lander

    • Solving the Data Scientist’s Dilemma: the Cold-Start Problem with 10+ Machine Learning Examples by Dr. Kirk Borne

    • Intermediate Machine Learning with scikit-learn by Andreas Mueller, PhD

    • Machine Learning in R Part II: Using workflows to build an ML optimization pipeline by Jared Lander

    • Machine Learning for Trading by Stefan Jansen

    • Validate and Monitor Your AI and Machine Learning Models by Olivier Blais

    • Adapting Machine Learning Algorithms to Novel Use Cases by Dr. Kirk Borne

    • Advanced Machine Learning: Pipelines and Evaluation Metrics by Andreas Mueller, PhD

    • Advanced Machine Learning with scikit-learn: Imbalanced Classification and Text Data by Andreas Mueller, PhD

    • Machine Learning in R Part IV: Putting R-Based Machine Learning into Production with Plumber and Docker by Daniel Chen, PhD

    • Kubernetes: Simplifying Machine Learning Workflows by Alex Corvin, Michael Clifford, and Anish Asthana

    • ML Engineering for Production ML Deployments by Robert Crowe and Irene Giannoumis

  • 2

    ML Ops & Data Engineering

    • Gaining Machine Learning Observability by Josh Benamram, Evgeny Shulman

    • Accelerate AI/ML Workflows in Hybrid Cloud with Red Hat OpenShift Kubernetes Platform and CognitiveScale Certifai by Trevor McKay, Sanjay Kottaram

    • Streaming Decision Intelligence and Predictive Analytics with Spark 3 by Scott Haines

    • Raising Your Analytics from Infancy to Maturity by Dr. Brett Wujek

    • From Graph DBs to Topological Data Analysis: Data Science Applications in Financial Services by Daniel Ferrante, PhD

    • How Retailers Can Automate AI/ML in Minutes by Dr. Aaron Cheng

    • Scaling your ML workloads from 0 to millions of users by Julien Simon

    • Kedro + MLflow – Reproducible and Versioned Data Pipelines at Scale by Tom Goldenberg

    • In the Defense of Data: Delivering Value During a Global Crisis by Alexander Dean

    • Journey to Scalable AI by John Almasan, PhD and William Drake

    • Successfully Build and Scale AI Organizations Beyond the MVP by Sarah Aerni, PhD

    • DevOps for Machine Learning and other Half-Truths: Processes and Tools for the ML Life Cycle by Kenny Daniel

    • AI Operationalization with Governance and Model Risk Management by Sourav Mazumder

    • Deep Dive in Wenju--A Solution Platform for Enterprise AI by Changfeng Charles Wang, PhD

    • Introduction to Apache Airflow by Tomasz Urbaszek, Jarek Potiuk

    • Simplifying Data Science with Delta Lake and MLflow by Matei Zaharia, PhD

    • Graph Powered Machine Learning by Jörg Schad, PhD

    • Ray: A System for High-performance, Distributed Python Applications by Dean Wampler, PhD

    • MLOps – Take Your Data Science Workflows Into Production with MLOps by Shivani Patel and Jordan Edwards

    • Data Science Best Practices: Continuous Delivery for Machine Learning by Christoph Windheuser, PhD, David Johnston, PhD, and Eric Nagler

    • It's a Breeze to Contribute to Apache Airflow by Tomasz Urbaszek and Jarek Potiuk

  • 3

    Machine Learning for Programmers

    • Quick Package Development in R and Python (from "Python or R" to "Python and R") by Theodore Bakanas and Zhi Lu, PhD

    • Consume, Control and Serve REST APIs with R by Marck Vaisman

    • The Hamiltonian Monte Carlo Revolution is Open Source: Probabilistic Programming with PyMC3 by Austin Rochford

    • From Research to Production: Performant Cross-platform ML/DNN Model Inferencing on Cloud and Edge with ONNX Runtime by Faith Xu, Prabhat Roy

    • Accelerate ML Lifecycle with Kubernetes and Containerized Data Science Tools by Abhinav Joshi and Tushar Katarki

  • 4

    Data Visualization

    • Smart Technologies in Enhancing Browsing Experiences by Zona Kostic, PhD

    • Methods to Derive Computable Information from Sparse Electronic Medical Record Data: a Guide for the Data Analyst in Biomedicine by Alex S. Felmeister, PhD

    • Deep Neural Networks Assisted Simulation Surrogates for Parameter Space Exploration by Han-Wei Shen, PhD

    • The Art (and Importance) of Data Storytelling by Diedre Downing

    • How to Approach Time Series Forecasting Given Noisy and Sparse Data: an Example from the Trucking Industry by Filip Piasevoli

    • Network Analysis Made Simple by Eric Ma, PhD

    • Data Analysis, Dashboards and Visualization with Tableau - How to Create Powerful Visualizations Like a Zen Master by Nirav Shah

    • Building Data Narratives: An End-to-End Machine Learning Practicum by Paul J. Kowalczyk, PhD

    • Good, Fast, Cheap: How to Do Data Science with Missing Data by Matt Brems

  • 5

    Deep Learning

    • Uncertainty in Deep Learning by Rebecca Russell, PhD

    • Self-Supervised Learning and Natural Language Processing for Hate Speech Detection by Sihem Romdhani

    • Step Up Your PyTorch with Custom Extension by Adam Paszke

    • Continuous Learning Systems: Building ML Systems That Learn from Their Mistakes by Anuj Gupta

    • Graph Neural Networks and their Applications by Shauna Revay, PhD

    • Generating Realistic Data While Preserving Privacy by Joshua Falk

    • Audio Event Detection via Deep Learning in Python by Robert Coop, PhD

    • Deep Learning in Intelligent Process Automation by Slater Victoroff

    • Data, I/O, and TensorFlow: Building a Reliable Machine Learning Data Pipeline by Yong Tang, PhD

    • Best Practices in Deep Learning and the Art of Research Management by Moses Guttmann

    • The Software GPU: Making Inference Scale in the Real World by Nir Shavit, PhD

    • Recent Trends in Conversational AI by Raghav Mani

    • Inversion of 2D Remote Sensing Data to 3D Volumetric Models Using Deep Dimensionality Exchange by Graham Ganssle, PhD

    • Variational Auto-Encoders for Customer Insight by Yaniv Ben-Ami, PhD

    • How I Learned to Stop Worrying and Create Messy Data by Julia Neagu, PhD

    • Distributed Training Platform at Facebook by Mohamed Fawzy, Kiuk Chung

    • Neuroevolution-based Automated Model Building: How to Create Better Models by Keith Moore

    • Hybrid Deep Learning Approach to Speed up Certain Numerical Simulations by Cheng Zhan, PhD

    • Deploying Deep Learning Models as Microservices by Saishruthi Swaminathan

    • Deep Learning for Tabular Data: A Bag of Tricks by Jason McGhee

    • Deep Transfer Learning for Computer Vision: Real-World Applications at Nanoscale by Dipanjan Sarkar & Sachin Dangayach

    • Multi-Channel Optimal Path Sequencing Through Bayesian Deep Learning by Vishal Hawa

    • Modern and Old Reinforcement Learning by Leonardo De Marchi

    • Deep Learning (with TensorFlow 2) by Dr. Jon Krohn

    • Applied Deep Learning: Building a Chess Object Detection Model with TensorFlow by Joseph Nelson

  • 6

    Kickstarter

    • How to Train Your Robot. An Introduction to Reinforcement Learning by Craig Buhr, PhD

    • Introduction to Machine Learning with scikit-learn by Andreas Mueller, PhD

    • Statistics for Data Science by Andrew Zirm, PhD

    • Recommendation Systems in Python by Joshua Bernhard

    • SQL for Data Science by Mona Khalil

    • Introduction to Machine Learning for Time-series Forecasting by Mark Steadman, PhD and Viktor Kovryzhkin

    • Programming with Data: Python and Pandas by Daniel Gerlanc

  • 7

    Open Source

    • Methods for Using Observational Data to Answer Causal Questions by Erich Kummerfeld, PhD

    • When the Bootstrap Breaks by Ryan Harter

    • Teaching Data Science to 20k Students by Robert Schroll, PhD, Nicholas Cifuentes-Goodbody, and Don Fox, PhD

    • Actionable Ethics for Data Scientists by Emily Miller

    • Responsible AI – State of the Art and Future Directions by Mehrnoosh Sameki, PhD, Minsoo Thigpen, and Ehi Nosakhare, PhD

    • fastText Tutorial by Onur Celebi

    • Bayesian Data Science: Probabilistic Programming by Hugo Bowne-Anderson, PhD

    • Introduction to R: Cleaning and Processing Data by Daniel Chen, PhD

    • Actionable Ethics for Data Scientists: A hands-on workshop by Christine Chung and Emily Miller

    • Cool Things You Can Do with PostgreSQL to Next Level Your Data Analysis by Steven Pousty, PhD

  • 8

    Research Frontiers

    • Improving Subseasonal Forecasting in the Western U.S. with Machine Learning by Lester Mackey, PhD

    • Integrating Urban Open Data for Public Good by Yuan Lai, PhD

    • Open Source Tools for Social Impact by Kaushik Mohan, Stuart Lynn, PhD

    • AI-driven Program Synthesis by Armando Solar-Lezama, PhD

    • AI Research at Bloomberg by Dr. Anju Kambadur

    • Outlier Robust Machine Learning by Pradeep Ravikumar, PhD

    • Opening the Pod Bay Doors: Building Intelligent Agents That Can Interpret, Generate and Learn from Natural Language by Jacob Andreas, PhD

    • Web Crawling for Research Data by Shan Jiang, PhD

    • Fast AI: Enabling Rapid Prototyping of AI Solutions by Vijay Gadepally, PhD

    • ParlAI: A Platform for Neural Dialogue Research by Stephen Roller, PhD

  • 9

    NLP

    • Spark NLP for Healthcare: Lessons Learned Building Real-World Healthcare AI Systems by Veysel Kocaman, PhD

    • State of the Art Natural Language Processing at Scale by David Talby, PhD

    • Developing Natural Language Processing Pipelines for Industry by Michael Luk, PhD

    • An Introduction to Transfer Learning in NLP and HuggingFace Tools by Thomas Wolf, PhD

    • Applying State-of-the-art Natural Language Processing for Personalized Healthcare by David Talby, PhD and Guneet Walia, PhD

    • Transform your NLP Skills: Using BERT (and Transformers) in Real Life by Niels Kasch, PhD

    • State-of-the-art NLP Made Easy with AdaptNLP by Brian Sacash, Andrew Chang

    • Level Up: Fancy NLP with Straightforward Tools by Kimberly Fessel, PhD

    • Transfer Learning in NLP by Joan Xiao, PhD

    • Workflow Design for Natural Language Annotation by Teresa O'Neill, PhD

    • [Apple | Organization] and [Oranges | Fruit]: How to Evaluate NLP Tools for Entity Extraction by Gil Irizarry

    • Applied Deep Learning for NLP Applications by Elvis Saravia, PhD

    • Natural Language Processing using Python by Matt Brems

  • 10

    Computer Vision

    • Using Computer Vision and NLP Together for Fashion Classification by Ali Vanderveld, PhD

    • Building Scalable AI Computer Vision Applications by Dr. Nitin Gupta

    • How to Solve Real-World Computer Vision Problems Using Open-Source by Patrick Buehler, PhD

    • Revolutionizing Property Insurance with Aerial Imagery by Oleg Poliannikov, PhD

  • 11

    Mentor Talks

    • Top Tips for Communicating Your Results to Management by David Meza

    • Picking the Right Program: Formats, Credentials, and MOOCs, Oh My! by Aleksandar Tomic, PhD

    • Making a Career Transition to Data Science by Cathy Chute

    • Ten Themes to Building a Happy, Healthy, and Large Data Science Team by Troy Lau, PhD

    • Transitioning Into Data Science by Laura Seaman, PhD

    • Making Your Data Science Project Better by Thinking About the Use Case by Tommy Blanchard, PhD

    • Data Scientist, or the Most Dangerous Job of the 21st Century by Hugo Bowne-Anderson, PhD

  • 12

    Business Talks

    • Passing the Turing Test in Rare Disease Diagnosis by Dr. John Reynders

    • Reinforcement Learning and Inverse Reinforcement Learning in Finance by Igor Halperin, PhD

    • Creating an Enterprise AI Strategy: If Your Company Isn’t Good At Analytics, It’s Not Ready For AI by David Mariani

    • Artificial Intelligence and Drug Response by Susan Gregurick, PhD

    • AI / ML in Retail Engineering & Operational Excellence by Ravi Kumar Buragapu

    • The Humans in the Loop by Jon Rabinovitz

    • A Tale of Two AI Implementations in Healthcare by Caitlin Monaghan, PhD

    • How to Lead Data Science Teams: the 3 D's of Data Science Leadership by Juan Manuel Contreras, PhD

    • Building and Managing World Class Data Science Teams by Conor Jensen

    • Solving Real-life Challenges in Detecting Cognitive Diseases from Speech using ML by Jekaterina Novikova, PhD

    • Ensure the Quality of Recommendations in a Social Network by Qiannan Yin, PhD & Divya Venugopalan

    • Creating a Systems Change Approach for Data Science & AI Solutions by Jake Porway, PhD

    • MLOps: The Assembly Line of Machine Learning by Jordan Birdsell

    • Outside-in Innovation for An Analytics-Powered Operations by Christian Vogt, PhD

    • AI for the New Electricity by Ram Rajagopal, PhD

    • Agile Data Science: Exploring a Framework to Help a Team Generate Actionable Insight by Jeffrey Saltz, PhD

    • Microstructure Dynamics and ML in Trading by Michael Steliaros, PhD & Andreas Petrides, PhD

    • Data Mastering @Scale by Mike Stonebraker, PhD

    • Convergence and Critical Mass: The Fusion Moment for Biopharma by Brian Martin

    • Accelerating the Enterprise Uptake of AI by Hui Lei, PhD

    • AI for Care Planning Support by Sadid Hasan, PhD

    • How to Audit AI Development by Vadim Pinskiy, PhD

    • Natural Language Processing: Feature Engineering in the Context of Stock Investing by Frank Zhao

    • Challenges and Best Practices in Industrial AI Applications by Xiaohui Hu, PhD

    • How to Apply Machine Learning in Your Company Using Design Thinking and Canvas by Leandro Cesar Lopes

    • AI/ML Operationalization Anti-Patterns by Matt Maccaux

    • Jupyter as an Enterprise "Do It Yourself" (DIY) Analytic Platform by Dave Stuart

    • How Google Uses AI and Machine Learning in the Enterprise by Rich Dutton

    • Data Utilization: The Art of Extracting Valuable Insights by Thomas L. Vincent, PhD

    • GDPR in Action: Does it Work? by Volker Hadamschek, PhD and Reinhold Beckmann

    • Deep Learning Approaches to Forecasting and Planning by Javed Ahmed, PhD

    • From Data Strategy to Deep Learning: Enabling AI Solutions for Life Sciences & Pharmaceuticals by Michael Segala, PhD

    • How to Stop Worrying and Embrace AI Bias by Jett Oristaglio

    • Python + MPP Database = Large Scale AI/ML Projects in Production Faster by Paige Roberts

    • Automated Insights in Finance Using Machine Learning & AI by Dr. Arun Verma

    • Deciphering Brain Codes to Build Smarter AI by Gabriel Kreiman, PhD

  • 13

    Demo Talks

    • Scaling Production ML Pipelines with Databand by Josh Benamram

    • Wenju: A Solution Platform for Enterprise AI by Changfeng Charles Wang, PhD

    • Azure Automated Machine Learning (Introduction and Demos) by Cesar De la Torre

    • The Future of MLOps and How Did We Get Here? by Chris Sterry

    • Sports Analytics - Leveraging Open Source Technology to Improve Athlete Performance by Christopher Connelly

    • Amplifying the User Experience by Defining User Journeys with Snowplow by Nienke Bos

    • Customer Centricity With Deep Learning While Maintaining Privacy by Phil Wennker

    • Managing Open Source Models Just Got a Lot Easier: SAS Open Model Manager® by Marinela Profi

    • Looker & BigQuery - Accelerating Data Science Workflows by Marcell David Babai

    • Launch Geospatial Analysis Demo With Open Source Tools by Alvaro Arredondo

    • Cloud-native Architectures for Data Pipelines by Guillaume Moutier

    • Scalable Machine Learning Using Python and a Distributed Analytical Database by Badr Ouali

    • Using Spark NLP to Enable Real-World Evidence (RWE) and Clinical Decision Support in Oncology by Veysel Kocaman, PhD

    • Reducing Technical Debt with MLOps and the DataKitchen DataOps Platform by Chris Bergh

    • Making the Most of Your Annotation Partnership by Teresa O'Neill, PhD

    • Is Your Organization’s Infrastructure Holding Back Your Adoption of AI at Scale? by Nick Patience

    • Rapid Prototyping ML Microservices Using Agile by Jose Brache

    • Pruning for Success by Mark Kurtz

    • Turn AI From Experiment to Core Products by Meeta Dash

    • Strategies for Building AI-ready Data Sources and (Semi)autonomous Reasoning Agents Operating on Top of Them by Marcin von Grotthuss, PhD

    • Using AutoML to Identify Plant Disease and Leakage in COVID X-Rays by Rajiv Shah, PhD and Abdul Khader Jilani

    • Real-Time Algorithm as a Service: the case of BEST (Semantix Model Experience Platform) by Luiz Fernando Ohara Kamogawa, PhD

  • 14

    ODSC Keynotes

    • Machine Learning at an Inflection Point by John Montgomery, PhD

    • The Ethical Algorithm by Michael Kearns, PhD

    • Bias in the Vision and Language of Artificial Intelligence by Margaret Mitchell, PhD

    • Introducing Graph Convolutional Networks for Finance by Mark Weber

  • 15

    COVID-19

    • Tracking Undetected COVID-19 Infections Using Coronavirus Genomes by Lucy Li, PhD

    • 5 Actions for Data and Analytics leaders in the age of COVID-19 by Fawad Butt

    • CEDAR: Information technology to enhance open science in the fight against COVID-19 by Mark Musen, PhD

    • How Can a Democracy Effectively Respond to COVID-19: Lessons from Taiwan by Jason Wang, PhD

    • Coronavirus After the Curve by Roger W. Thomas

    • Using Large Social Data for COVID-19 by Johannes Eichstaedt, PhD

    • A Clinical Perspective on the Use of AI In the Imaging Diagnosis and Management of COVID-19 by Dr. Eric Siegel

    • AI for COVID-19: Developing the “Corona-Score” for patient monitoring using Deep Learning CT Image Analysis by Hayit Greenspan, PhD

    • COVID-19: Unprecedented Challenges and Opportunities for Data Science (and Scientists) -- Voices, Visions, and Ventures from Harvard Data Science Review

    • Performing Multidimensional Analysis on COVID-19 Data (without requiring Data Engineering!) by Daniel Gray