Description
For many of us, the data ingestion journey begins with a single, magical line: df.to_sql().
That starting point is fine for ML one-offs, but in data ingestion pipelines it often turns into a production nightmare. Ad-hoc scripts are brittle and memory-hungry, and they fail silently, creating a cycle of constant firefighting.
This hands-on workshop is a recovery plan designed to replace those habits with resilient, production-grade patterns.
In a guided, interactive notebook, we will work through the typical challenges of data ingestion and show how to solve each one quickly with dlt. You will learn how to build pipelines using:
- Schema evolution and self-healing
- Memory and disk management
- Async and parallelism
- Incremental loading and state management
- Declarative REST clients
You'll leave this session with a practical toolkit and a new default workflow, ready to build data systems you can finally trust.
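To make two of the topics above concrete (bounded memory use and incremental loading with state), here is a minimal, library-free sketch using only Python's standard library. It does not use dlt's API and is not the workshop's material. The table names `events` and `load_state`, the `updated_at` cursor field, and the function names are all hypothetical, chosen only for illustration.

```python
import sqlite3
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows so memory use stays bounded."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_incrementally(conn: sqlite3.Connection, rows: Iterable[Row],
                       chunk_size: int = 2) -> int:
    """Load only rows newer than the stored cursor; return how many were loaded."""
    # Hypothetical destination table and state table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, updated_at TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_state (key TEXT PRIMARY KEY, value TEXT)"
    )

    # Read the high-water mark left behind by the previous run, if any.
    found = conn.execute(
        "SELECT value FROM load_state WHERE key = 'last_updated_at'"
    ).fetchone()
    last_seen = found[0] if found else ""

    # Lazily skip rows that were already loaded in an earlier run.
    new_rows = (r for r in rows if r["updated_at"] > last_seen)

    loaded = 0
    high_water = last_seen
    for batch in chunked(new_rows, chunk_size):
        high_water = max(high_water, max(r["updated_at"] for r in batch))
        # One transaction per chunk: data and cursor are committed together,
        # so a crash mid-run never leaves the cursor ahead of the data.
        with conn:
            conn.executemany(
                "INSERT OR REPLACE INTO events (id, updated_at) "
                "VALUES (:id, :updated_at)",
                batch,
            )
            conn.execute(
                "INSERT OR REPLACE INTO load_state (key, value) "
                "VALUES ('last_updated_at', ?)",
                (high_water,),
            )
        loaded += len(batch)
    return loaded
```

Running `load_incrementally` twice over the same source loads nothing the second time, because the stored cursor filters out rows that were already seen. A dedicated ingestion library handles this bookkeeping, along with schema changes and retries, on your behalf.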
Instructor's Bio

David Hoyle, PhD
Research Data Science Specialist at dunnhumby
David Hoyle is a Research Data Science Specialist at dunnhumby, where he builds demand forecasting models and algorithms for grocery retailers worldwide. He has worked as a Data Scientist in the commercial sector for 14 years. Before that, he was an Associate Professor in Bioinformatics and Machine Learning at the University of Manchester, UK, and at the University of Exeter, UK. He has a PhD in theoretical physics and continues to publish academic articles on mathematical, statistical, machine learning, and AI topics. He enjoys explaining mathematical and statistical concepts and showing how they apply to real-world problems, so he regularly writes posts for his own blog and for dunnhumby. He also does pro bono Data Science work through his own consultancy, Oakthorpe Consulting. He is heavily involved with the UK Royal Statistical Society, serving as the Data Science and AI board member for the society’s annual international conference. He recently authored a book on mathematics for Data Science, “15 Math Concepts Every Data Scientist Should Know”, published by Packt.