Description

AI Development Lifecycle: Learnings of What Changed with LLMs

When comparing the building of models and pipelines on top of LLMs with more traditional machine/deep learning, two observations stand out:

- Building proofs of concept has become incredibly easy

- Evaluation is much more challenging

As a result, the evaluation step is often neglected, leading to pointless iterations and no clear picture of the product's true performance. This is one of the main obstacles to moving into production, especially in circumstances where highly accurate results are required.

In this talk, we will explore lessons learned from building products that are typical use cases of these technologies, and zoom in on a specific use case: a RAG (Retrieval-Augmented Generation) tool for a medical company.

Instructor Bio

Noé Achache

Engineering Manager & Generative AI Lead at Sicara

Noé is an Engineering Manager (for Data Science projects) at Sicara, where he has worked on a wide range of projects, mostly involving vector databases, computer vision, prediction on structured data, and more recently LLMs. He currently leads GenAI development at the company. You can find all his talks and articles here: https://www.sicara.fr/en/noe-achache

Outline:

  • The importance of a sound data methodology for building performant LLM-based products and gaining user adoption

  • Understanding how a real-world efficiency problem can be solved with LLMs (text-to-SQL, retrieval, and LLM generation)

  • Evaluation dataset best practices

  • When and how to use LLM-as-a-judge vs. manual evaluation (see the sketch after this outline)

  • Why to use an LLM monitoring tool like Langfuse (see the Langfuse sketch below)
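
To make the LLM-as-a-judge point concrete, here is a minimal sketch of automated grading against a reference answer. It assumes the OpenAI Python SDK; the model name, prompt wording, and 1-5 scale are illustrative assumptions, not the setup used in the talk.

```python
# Minimal LLM-as-a-judge sketch. Model, prompt, and 1-5 scale are
# illustrative assumptions, not the talk's actual configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are grading a RAG system's answer.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Rate the candidate's factual correctness from 1 (wrong) to 5 (fully correct).
Reply with the number only."""

def judge_answer(question: str, reference: str, candidate: str) -> int:
    """Ask a strong LLM to score a candidate answer against a reference."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # deterministic grading
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                question=question, reference=reference, candidate=candidate
            ),
        }],
    )
    return int(response.choices[0].message.content.strip())
```

A common sanity check before trusting such a judge is to calibrate it against a manually labeled subset of the evaluation dataset, which ties back to the "when to use it vs. manual evaluation" question above.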
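
Likewise, here is a minimal sketch of what tracing with Langfuse can look like, assuming the Python SDK's @observe decorator (v2-style import) and LANGFUSE_* credentials in the environment; the pipeline functions are hypothetical placeholders, not the medical RAG tool from the talk.

```python
# Minimal Langfuse tracing sketch (Python SDK v2-style import).
# Requires LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY in the environment.
from langfuse.decorators import observe, langfuse_context

@observe()  # nested call appears as a child span inside the trace
def retrieve(question: str) -> list[str]:
    return ["placeholder context chunk"]  # stand-in for vector search

@observe()  # top-level call creates the trace in Langfuse
def answer_question(question: str) -> str:
    context = retrieve(question)
    return f"Answer based on {len(context)} chunks"  # stand-in for LLM call

answer_question("What are the contraindications of drug X?")
langfuse_context.flush()  # ensure events are sent before the script exits
```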