Description
Building a proof-of-concept LLM/RAG app is easy. The hard, time-consuming part is bringing it to a production-ready level: increasing accuracy, reducing latency and costs, and producing reproducible results.
To meet these requirements, you must optimize your LLM and RAG layers: digging into open-source LLMs, fine-tuning them for your specialized tasks, optimizing them for inference, and so on.
However, before optimizing anything, you must first determine what to optimize. That means quantifying your system's key metrics, such as latency, cost, accuracy, recall, and hallucination rate.
Because developing AI applications is an iterative process, the first critical step toward production is learning how to evaluate and monitor your LLM/RAG applications. The best strategy is to build something simple end-to-end, attach an evaluation layer on top of it, and then iterate quickly in the right direction with clear signals about what needs improvement.
This workshop therefore focuses on evaluating LLM/RAG apps. We will take a simple, predefined agentic RAG system built in LangGraph and learn how to evaluate and monitor it.
To do that, we will explore the following topics:
Add a prompt monitoring layer.
Visualize the quality of the embeddings.
Evaluate the context from the retrieval step used for RAG.
Compute application-level metrics to expose hallucinations, moderation issues, and performance (using LLMs as judges).
Log the metrics to a prompt management tool to compare the experiments.
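As a purely illustrative sketch of the embedding-visualization step (this is not code from the workshop, and the function name is hypothetical), one simple approach is to project embeddings to 2-D with PCA and look for cluster separation. Well-separated clusters in the projection usually mean that semantically distinct documents are also distinct in embedding space:

```python
import numpy as np

def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project high-dimensional embeddings to n_components dims via PCA."""
    centered = embeddings - embeddings.mean(axis=0)
    # The right singular vectors of the centered matrix are the
    # principal directions; project onto the first n_components of them.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy example: two artificial "topic" clusters in 64-D embedding space.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.1, size=(50, 64))
cluster_b = rng.normal(loc=1.0, scale=0.1, size=(50, 64))
points_2d = pca_project(np.vstack([cluster_a, cluster_b]))
print(points_2d.shape)  # (100, 2)
```

In practice you would feed in the embeddings produced by your RAG system's embedding model and scatter-plot the result; tools like UMAP or t-SNE are common alternatives to plain PCA for this kind of inspection.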
Instructor's Bio

Paul Iusztin
Senior AI Engineer / Founder at Decoding ML
Paul Iusztin is a senior AI/ML engineer with over seven years of experience building GenAI, computer vision, and MLOps solutions. Most recently, at Metaphysic, he was one of the core AI engineers who took large, GPU-heavy models to production. He previously worked at CoreAI, Everseen, and Continental.
He is the co-author of the LLM Engineer's Handbook, a bestseller on Amazon, which presents a hands-on framework for building LLM applications.
Paul is the Founder of Decoding ML, an educational channel on GenAI and information retrieval that provides code, posts, articles, and courses teaching people to build production-ready AI systems that work. His contributions to the open-source community have sparked collaborations with industry leaders like MongoDB, Comet, Qdrant, ZenML and 11 other AI companies.
Webinar
Workshop "LLM & RAG Evaluation Playbook for Production Apps" (Ai+ Training)
Workshop recording
Additional information
Slides