<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" encoding="UTF-8" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:atom="http://www.w3.org/2005/Atom/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:fireside="http://fireside.fm/modules/rss/fireside">
  <channel>
    <fireside:hostname>web01.fireside.fm</fireside:hostname>
    <fireside:genDate>Wed, 15 Apr 2026 11:46:33 -0500</fireside:genDate>
    <generator>Fireside (https://fireside.fm)</generator>
    <title>Vanishing Gradients - Episodes Tagged with “RAG”</title>
    <link>https://vanishinggradients.fireside.fm/tags/rag</link>
    <pubDate>Fri, 29 Aug 2025 21:00:00 +1000</pubDate>
    <description>A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.
</description>
    <language>en-us</language>
    <itunes:type>episodic</itunes:type>
    <itunes:subtitle>a data podcast with hugo bowne-anderson</itunes:subtitle>
    <itunes:author>Hugo Bowne-Anderson</itunes:author>
    <itunes:summary>A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.
</itunes:summary>
    <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
    <itunes:explicit>no</itunes:explicit>
    <itunes:keywords>data science, machine learning, AI</itunes:keywords>
    <itunes:owner>
      <itunes:name>Hugo Bowne-Anderson</itunes:name>
      <itunes:email>hugobowne@hey.com</itunes:email>
    </itunes:owner>
<itunes:category text="Technology"/>
<item>
  <title>Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank)</title>
  <link>https://vanishinggradients.fireside.fm/57</link>
  <guid isPermaLink="false">60db26a1-cad5-4c3d-9661-bbc51a3a0b27</guid>
  <pubDate>Fri, 29 Aug 2025 21:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/60db26a1-cad5-4c3d-9661-bbc51a3a0b27.mp3" length="81037068" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply.

Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines.</itunes:subtitle>
  <itunes:duration>41:27</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply.
Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines.
We talk through:
- Treating LLM workflows as ETL pipelines for unstructured text
- Error analysis: why you need humans reviewing the first 50–100 traces
- Guardrails like retries, validators, and “gleaning”
- How LLM judges work: rubrics, pairwise comparisons, and cost trade-offs
- Cheap vs. expensive models: when to swap for savings
- Where agents fit in (and where they don’t)
If you’ve ever wondered how to move beyond unreliable demos, this episode shows how to scale LLMs to millions of documents — without breaking the bank.
LINKS
Shreya's website (https://www.sh-reya.com/)
DocETL, A system for LLM-powered data processing (https://www.docetl.org/)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtu.be/3r_Hsjy85nk)
Shreya's AI evals course, which she teaches with Hamel "Evals" Husain (https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME)
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338)
</description>
  <itunes:keywords>LLMs, Agents, RAG, Machine Learning</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>While many people talk about “agents,” <strong>Shreya Shankar</strong> (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply.</p>

<p>Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines.</p>

<p><strong>We talk through:</strong></p>

<ul>
<li>Treating LLM workflows as ETL pipelines for unstructured text</li>
<li>Error analysis: why you need humans reviewing the first 50–100 traces</li>
<li>Guardrails like retries, validators, and “gleaning”</li>
<li>How LLM judges work: rubrics, pairwise comparisons, and cost trade-offs</li>
<li>Cheap vs. expensive models: when to swap for savings</li>
<li>Where agents fit in (and where they don’t)</li>
</ul>

<p>If you’ve ever wondered how to move beyond unreliable demos, this episode shows how to scale LLMs to millions of documents — without breaking the bank.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.sh-reya.com/" rel="nofollow">Shreya&#39;s website</a></li>
<li><a href="https://www.docetl.org/" rel="nofollow">DocETL, A system for LLM-powered data processing</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/3r_Hsjy85nk" rel="nofollow">Watch the podcast video on YouTube</a></li>
<li><a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">Shreya&#39;s AI evals course, which she teaches with Hamel &quot;Evals&quot; Husain</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>While many people talk about “agents,” <strong>Shreya Shankar</strong> (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply.</p>

<p>Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks, error analysis, and guardrails needed to turn flaky LLM outputs into trustworthy pipelines.</p>

<p><strong>We talk through:</strong></p>

<ul>
<li>Treating LLM workflows as ETL pipelines for unstructured text</li>
<li>Error analysis: why you need humans reviewing the first 50–100 traces</li>
<li>Guardrails like retries, validators, and “gleaning”</li>
<li>How LLM judges work: rubrics, pairwise comparisons, and cost trade-offs</li>
<li>Cheap vs. expensive models: when to swap for savings</li>
<li>Where agents fit in (and where they don’t)</li>
</ul>

<p>If you’ve ever wondered how to move beyond unreliable demos, this episode shows how to scale LLMs to millions of documents — without breaking the bank.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.sh-reya.com/" rel="nofollow">Shreya&#39;s website</a></li>
<li><a href="https://www.docetl.org/" rel="nofollow">DocETL, A system for LLM-powered data processing</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/3r_Hsjy85nk" rel="nofollow">Watch the podcast video on YouTube</a></li>
<li><a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">Shreya&#39;s AI evals course, which she teaches with Hamel &quot;Evals&quot; Husain</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a></li>
</ul>]]>
  </itunes:summary>
</item>
  </channel>
</rss>
