Tutorial Overview

Tutorial duration: 50 min

Data literacy is fundamental to mastering many areas of AI, including data science, machine learning, deep learning, and data engineering. This fast-paced but comprehensive tutorial introduces the fundamental data concepts that underpin those fields, and it serves as a foundation course and prerequisite for more advanced machine learning and data science topics. Over five lessons, you will learn the language of data and better understand the data life cycle and its role in building machine learning models.

DIFFICULTY LEVEL: INTERMEDIATE

Instructor Bio:

Software Engineer & AI Expert

Sheamus McGovern

Founder of ODSC and Software Architect specializing in complex multi-platform systems across multiple industries, including finance, healthcare, and education.

Tutorial Outline

Lesson 1: An Introduction to Data Literacy and The Data Life Cycle

  • What is Data Literacy
  • Why Data Literacy is Important
  • Data Literacy for Careers in AI
  • The Data Life Cycle
  • Data Defined: What is Data?
  • Understanding Data Types


Lesson 2: Data Collection

  • Data Collection
  • Data Sourcing
  • External Data: Open and Non-Open Data
  • Licensed Data and Open Data
  • Data Collection Tools


Lesson 3: Data Profiling

  • Data Profiling
  • Data Profiling: An Example Dataset
  • Describing a Dataset
  • Data Profiling: Correlations and Outliers
  • Discovering Data Issues
  • Data Profiling Tasks and Tools


Lesson 4: Data Preparation, Shaping, and Transformation

  • Data Preparation
  • Data Transformation
  • Data Enrichment
  • Data Shaping and Shaping Examples
  • Data Preparation and Shaping Tools


Lesson 5: Model Features

  • Features in Machine Learning Models
  • Feature Selection
  • Machine Learning Algorithms vs. Models
  • The Target Variable
  • Big Data & Wide Data



Background knowledge

  • No background knowledge necessary!