Best Practices for Data Annotation at Scale


Course curriculum

The long-term success of machine learning relies on consistently labeled, high-quality data. While most machine learning initiatives begin in the lab, they take on a life of their own and can create significant challenges once they scale. ML data ops practitioners can find themselves consumed by the logistics of data annotation and management instead of focusing on the science. Wherever you are in your team’s machine learning journey, you must think about evolving towards large-scale production. Proactively planning a data management strategy can generate progressively better results, but it requires thought and stakeholder buy-in.

A key ingredient of this journey is your data labeling and annotation framework. A data pipeline designed for human judgment and incremental training on edge cases provides that last mile of acceptability, enabling the machine learning solution to go to production. This session will reveal the implications of a live data loop in a production environment and how it significantly impacts the customer experience. Attendees will also take away insights into the trends and challenges of combining humans with the machine learning pipeline.

In this session, iMerit's Jai Natarajan reveals best practices for building scalable and repeatable data labeling pipelines with a balance of tools and humans-in-the-loop. Through collaboration with peers, managers, and machine learning experts, data annotators refine their skills and master tasks well beyond the expertise of crowdsourcing. In a collaborative framework, annotators and ML experts negotiate and create meaning through an iterative feedback process as they identify new concepts and nuances in the data. Attendees will learn concepts such as designing to break the ML, edge case knowledge management, and workflow management.

1

Best Practices for Data Annotation at Scale
- Best Practices for Data Annotation at Scale

Instructor

Vice President, Strategic Business Development, iMerit

Jai Natarajan

Jai Natarajan is the Vice President, Strategic Business Development at iMerit, a global AI data solutions company delivering high-quality data that powers machine learning and artificial intelligence applications for Fortune 500 companies. Bringing more than 24 years of experience, Jai works with more than 5,500 data experts who label and enrich data at scale to help customers get better results from their machine learning algorithms. Jai works with iMerit’s partner ecosystem to develop iMerit’s solutions for its customers and provides strategic input to the company. Previously, Jai worked at Lucasfilm and Sony, and founded Xentrix, an Emmy-winning animation studio. He is a board member of the Anudip Foundation. Jai has an M.S. in Computer Science from UCLA and undergraduate degrees from the Birla Institute of Technology and Science.

ODSC APAC Virtual
