Course Abstract

Training duration: 4 hours

Modeling is a fundamental aspect of any science. This fact is particularly apparent in data science. The key aspects of modeling that make it important for science are: (1) models are representations of things that cannot be fully understood or known (e.g., predictive models are essential to predict a future outcome, unless you have access to a time machine that we don’t know about); (2) models can give us new insights into those things, including their behaviors, responses, and characteristics (especially in previously unseen conditions), thereby potentially revealing causal factors for observed outcomes and informing prescriptive actions to optimize outcomes; (3) models provide testable predictions to validate our assumptions and hypotheses about things (otherwise, it’s not science); and (4) models can help answer questions that are not otherwise answerable (e.g., we can pose “what if” scenarios safely in a model environment that we would not be able or allowed to test in a real-life situation). In data science, we use observation (data, evidence) to inform and inspire our models, we use machine learning (algorithms that learn from patterns in the data) to build testable models, and we use the scientific method to verify, validate, and/or refine our models. The ideal goal of these activities is discovery from data, specifically actionable insights discovery. This two-part course (DS 101 and DS 102) on modeling and machine learning in data science follows two main threads: foundational concepts and practical examples.

DIFFICULTY LEVEL: BEGINNER

Learning Objectives

  • Identify modeling opportunities and categorize them (Detect, Discover, Predict, Optimize)

  • Select appropriate modeling components and ML algorithms for specific use cases

  • Design your own analytics solutions

  • Solve a broad range of problems

  • Generate value from the data assets in your organization

  • Communicate to your stakeholders the importance and meaning of models in data-intensive environments

  • Attendees will also work on data literacy exercises and study exploratory data analysis use cases that broaden and deepen their understanding of, and abilities in, insights discovery and value creation from data.

Instructor

Instructor Bio:

Principal Data Scientist and Executive Advisor, Booz Allen Hamilton

Dr. Kirk Borne

Dr. Kirk Borne is the Principal Data Scientist, Data Science Fellow, and an Executive Advisor at global technology and consulting firm Booz Allen Hamilton. He has worked there since 2015. He provides thought leadership, mentoring, training, and consulting activities in data science, machine learning, and AI across multiple disciplines. Previously, he was Professor of Astrophysics and Computational Science at George Mason University for 12 years in the graduate (Computational Science and Informatics) and undergraduate (Data Science) degree programs. He was a co-creator of the world’s first undergraduate data science degree program in 2006. Prior to that, he spent nearly 20 years supporting data systems activities for NASA space science programs, including a role as NASA's Data Archive Project Scientist for the Hubble Space Telescope and as Contract Program Manager in NASA’s Space Science Data Operations Office. He is co-author of the e-book “10 Signs of Data Science Maturity”, co-author on a new book “Demystifying AI for the Enterprise”, and author of many hundreds of scientific research articles and blogs. Dr. Borne has degrees in physics (B.S., LSU) and astronomy (Ph.D., Caltech). He is an elected Fellow of the International Astrostatistics Association for his contributions to big data research in astronomy. In 2020, he was elected a Fellow of the American Astronomical Society for lifelong contributions to the field of astronomy. As a global speaker, he has given hundreds of invited talks worldwide, including keynote presentations at dozens of data science, AI, and analytics conferences. He is an active contributor on social media, where he promotes data literacy for all and has been named consistently among the top worldwide social influencers in big data, data science, machine learning, and AI since 2013.

Course Outline

Module 1: 
Training Overview and Data Science Preliminaries
Introduction to Modeling Concepts
Supervised vs. Unsupervised Modeling

Module 2: 
Insights Discovery and Generalization
Supervised Learning Concepts 
Predictive vs. Prescriptive Modeling
What does Cognitive have to do with it?

Module 3:
The Two Most Important Things in Data Science
Optimization and Feedback Loops in Modeling
Cold-Start Modeling: When the Data Becomes the Model (Unsupervised ML)
Machine Learning vs. Deep Learning

Module 4:
Common Business Modeling Examples
The OODA Loop in Decision Science and Data Science
When Predictive Modeling Fails
Ethical Modeling 
Enriching Your Models with Smart Data (Semantic Tags, Labels, Annotations)
Exploiting High-Variety Data to Achieve Better Model Outcomes
Steps to Data Analytics Mastery

Background knowledge

  • Data scientists, data analysts, business intelligence practitioners, data users, and other analytics-related professionals are the target audience for this training. Generally, this training is for anyone:

  • Who seeks to understand how machine learning works and how ML models can deliver actionable insights, decision support, and value to their organization.

  • Who wants to become more knowledgeable and proficient in identifying machine learning opportunities and in contributing to ML modeling applications.

  • Who seeks to learn the power of machine learning models in thought and action, in order to progress in their own career journey (e.g., from data analyst to data scientist).

  • Some experience with machine learning will make this workshop easier to follow, but all that is required is basic knowledge of the concepts.