Workshops:

This session focuses on practical methodologies for using open-source small language models (SLMs) in enterprise settings. We first highlight the limitations of proprietary models in terms of privacy, compliance, and cost.

Then, we explore modern workflows for adapting SLMs with domain-specific pre-training, instruction fine-tuning, and alignment. Along the way, we will introduce and demonstrate open-source tools like DistillKit, Spectrum, and MergeKit, which implement advanced techniques for achieving task-specific accuracy, optimizing computational costs, and building accurate, efficient agentic workflows. Join us to learn how smaller, efficient, and adaptable models can transform your AI applications.
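
To give one of those tools some texture: MergeKit combines fine-tuned checkpoints of the same base model into a single model. Below is a minimal sketch of a SLERP merge, assuming two Mistral-7B-based models; the model names, layer ranges, and hyperparameters are illustrative, not taken from the workshop.

```python
# Sketch: SLERP-merge two fine-tuned variants of the same 32-layer base model.
merge_config = """
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: HuggingFaceH4/zephyr-7b-beta
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t: 0.5          # interpolation weight between the two checkpoints
dtype: bfloat16
"""

with open("merge.yml", "w") as f:
    f.write(merge_config)

# Then produce the merged checkpoint with MergeKit's CLI:
#   mergekit-yaml merge.yml ./merged-model
```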

In this hands-on virtual workshop, we’ll provide an introduction to fine-tuning, including use cases, best practices, and techniques for efficient fine-tuning with LoRA. You'll learn how to fine-tune task-specific models that beat GPT-4, dynamically serve multiple fine-tuned adapters on a single GPU with LoRA eXchange, and supercharge inference speed using Turbo LoRA speculative decoding, Predibase's proprietary optimization for faster and more cost-effective model serving.
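
As background for the LoRA portion, here is a minimal sketch of attaching LoRA adapters to a causal LM with Hugging Face's peft library. The model name, rank, and target modules are illustrative choices, not Predibase's stack.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# LoRA trains small low-rank matrices alongside frozen base weights,
# so only a fraction of a percent of parameters are updated.
config = LoraConfig(
    r=16,                                 # rank of the update matrices
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # reports the small trainable fraction
```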

There are many different approaches to fine-tuning. In this workshop, we will focus on Memory Tuning. This innovative approach combines two advanced techniques, LoRA (Low-Rank Adaptation) and MoE (Mixture of Experts), to create the Mixture of Memory Experts (MoME) model, pronounced “mommy.”
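
Memory Tuning and MoME are Lamini's methods, so the exact implementation is not public. The ingredient that makes the combination plausible, though, is that many LoRA adapters can share one frozen base model and be switched per request. A minimal illustration with peft's multi-adapter support follows; the adapter paths, names, and routing rule are hypothetical.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# Load several "expert" LoRA adapters onto one frozen base model.
# The ./experts/* paths are hypothetical, separately trained adapters.
model = PeftModel.from_pretrained(base, "./experts/billing", adapter_name="billing")
model.load_adapter("./experts/schema", adapter_name="schema")

# A real MoE uses a learned router; this trivial stand-in routes on keywords.
def route(query: str) -> str:
    return "schema" if "table" in query.lower() else "billing"

model.set_adapter(route("Which table stores invoices?"))  # activates "schema"
```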

In this workshop, we will dive into the technical implementation details of the Mixture of Memory Experts (MoME) model that makes Memory Tuning computationally feasible at scale. We will step through a real-world text-to-SQL use case so you walk away with the knowledge to tune and evaluate your own models.
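
For the evaluation half of that use case, a common metric for text-to-SQL is execution accuracy: run the generated query and the reference query against the same database and compare result sets. A minimal sketch with sqlite3 follows; the metric choice is ours, not necessarily the workshop's.

```python
import sqlite3
from collections import Counter

def execution_match(pred_sql: str, gold_sql: str, db_path: str) -> bool:
    """True if the predicted query returns the same rows as the reference query."""
    conn = sqlite3.connect(db_path)
    try:
        pred = conn.execute(pred_sql).fetchall()
        gold = conn.execute(gold_sql).fetchall()
        # Order-insensitive comparison, since row order is rarely specified.
        return Counter(pred) == Counter(gold)
    except sqlite3.Error:
        return False  # a query that fails to execute counts as a miss
    finally:
        conn.close()
```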

In this workshop we’ll cover the fundamentals of using evaluation-driven development to build reliable applications with Large Language Models (LLMs). Building an AI application, RAG system, or agent involves many design choices: you have to choose between models, design prompts, expose tools to the model, and build knowledge bases for RAG. Without good evaluation in place, you’ll likely waste a lot of time making changes without actually improving performance. Post-deployment, evaluation is essential to ensure that changes don’t introduce regressions to your product.

Using real-world case studies, we’ll cover how to design your evaluators and use them as part of an iterative development process. We’ll cover code-based, LLM-as-judge, and human evaluators.
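
To make the first two evaluator types concrete, here is a sketch of a code-based check and an LLM-as-judge grader, assuming an OpenAI-compatible client; the judge prompt and model name are illustrative.

```python
from openai import OpenAI

client = OpenAI()

def code_evaluator(output: str, expected: str) -> bool:
    """Code-based evaluator: cheap and deterministic, good for exact targets."""
    return output.strip().lower() == expected.strip().lower()

JUDGE_PROMPT = """You are grading an AI assistant's answer.
Question: {question}
Answer: {answer}
Reply with PASS or FAIL on the first line, then one sentence of reasoning."""

def llm_judge(question: str, answer: str) -> bool:
    """LLM-as-judge: handles open-ended outputs a string match cannot."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().startswith("PASS")
```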

Selecting the right AI model isn't a one-size-fits-all decision. It's a strategic process that demands careful evaluation, precise testing, and a methodical approach to finding your ideal match. Choosing incorrectly can decimate your return on investment, dramatically extend project timelines, and consume substantial staff resources in futile training attempts. In this hands-on workshop, we’ll show you how to test and optimize models for your project.

Walk away with a strategy for finding the AI model that's not just good—but perfect for your specific needs.
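
One way to make that strategy concrete is a small harness that runs every candidate model over the same prompt set and records score and latency, so the choice rests on measurements rather than reputation. A minimal sketch follows; the callables stand in for real API clients.

```python
import time
from typing import Callable, Dict, List

def compare(models: Dict[str, Callable[[str], str]],
            prompts: List[str],
            score_fn: Callable[[str, str], float]) -> None:
    """Run each candidate over the same prompts; report mean score and latency."""
    for name, generate in models.items():
        start = time.perf_counter()
        scores = [score_fn(p, generate(p)) for p in prompts]
        elapsed = time.perf_counter() - start
        print(f"{name}: score={sum(scores) / len(scores):.2f}, "
              f"{elapsed / len(prompts):.3f}s/prompt")

# Toy usage with stand-in "models"; swap in real client wrappers.
compare(
    models={"echo": lambda p: p, "upper": lambda p: p.upper()},
    prompts=["hello", "world"],
    score_fn=lambda prompt, output: float(prompt in output.lower()),
)
```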

This hands-on workshop equips you to solve real-world business problems with powerful LLMs. We'll compare frontier closed and open-source models, and discuss how to pick the right LLM for the task at hand. We’ll embark on an ambitious project to build an Agentic AI solution that beats frontier models.

Our Agents will include a frontier model, a proprietary QLoRA fine-tuned LLM, and a RAG pipeline. There’s a lot to build, but we will draw on magical open-source libraries from Hugging Face, Gradio, and Chroma to get it all done in time -- and the results will be rather satisfying.
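
As a taste of the RAG piece, here is a minimal retrieval step with Chroma: embed a few documents, then pull the nearest one into a prompt. The documents and query are toy examples.

```python
import chromadb

client = chromadb.Client()  # in-memory instance; use PersistentClient for disk
collection = client.create_collection("docs")

# Chroma embeds these with its default embedding function.
collection.add(
    documents=["Our refund window is 30 days.",
               "Support is available 9am-5pm ET."],
    ids=["policy-1", "policy-2"],
)

results = collection.query(query_texts=["When can I get a refund?"], n_results=1)
context = results["documents"][0][0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: When can I get a refund?"
```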

In this workshop, attendees will create a minimal NotebookLM clone using open weights models like Llama3 and Parler-TTS (text-to-speech). They will build and deploy their NotebookLM clones using Union Serverless, an end-to-end platform for building AI products, and will learn the building blocks, infrastructure, and abstractions that are helpful for building compound AI systems.
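
For a feel of the text-to-speech leg of that pipeline, here is a minimal Parler-TTS sketch that speaks one line of a podcast script. The checkpoint name and voice description follow the Parler-TTS examples; script generation with Llama3 and the Union Serverless deployment are omitted.

```python
import torch
import soundfile as sf
from transformers import AutoTokenizer
from parler_tts import ParlerTTSForConditionalGeneration

repo = "parler-tts/parler-tts-mini-v1"
model = ParlerTTSForConditionalGeneration.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

# Parler-TTS conditions on a natural-language voice description
# plus the text to speak.
description = "A female speaker delivers a lively podcast introduction."
line = "Welcome back! Today we're summarizing your documents."

desc_ids = tokenizer(description, return_tensors="pt").input_ids
prompt_ids = tokenizer(line, return_tensors="pt").input_ids

with torch.no_grad():
    audio = model.generate(input_ids=desc_ids, prompt_input_ids=prompt_ids)

sf.write("line.wav", audio.cpu().numpy().squeeze(), model.config.sampling_rate)
```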

In this session, we'll run the hottest new model on the block -- DeepSeek-R1, the pile of floating-point numbers that wiped about a trillion dollars off the stock markets.

In addition to walking through a code sample that includes everything needed for deployment on a specific platform (Modal), we'll consider the general problem of serving LLM inference workloads, the available open-source software, and the hardware that software controls.
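
A minimal sketch of what such a deployment can look like on Modal, serving a model with vLLM. The full R1 is a 671B-parameter MoE that needs a multi-GPU setup, so this sketch assumes the 8B distilled variant; the image, GPU, and sampling settings are illustrative, not the workshop's exact code.

```python
import modal

image = modal.Image.debian_slim().pip_install("vllm")
app = modal.App("deepseek-r1-demo")

@app.function(image=image, gpu="H100", timeout=600)
def generate(prompt: str) -> str:
    from vllm import LLM, SamplingParams

    # Loading the model per call is slow; a real service would keep it
    # warm (e.g. with a modal.Cls and a startup hook).
    llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B")
    params = SamplingParams(temperature=0.6, max_tokens=512)
    return llm.generate([prompt], params)[0].outputs[0].text

@app.local_entrypoint()
def main():
    print(generate.remote("Explain speculative decoding in two sentences."))
```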
