<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:fireside="http://fireside.fm/modules/rss/fireside">
  <channel>
    <fireside:hostname>web02.fireside.fm</fireside:hostname>
    <fireside:genDate>Fri, 10 Apr 2026 23:42:03 -0500</fireside:genDate>
    <generator>Fireside (https://fireside.fm)</generator>
    <title>Vanishing Gradients - Episodes Tagged with “AI”</title>
    <link>https://vanishinggradients.fireside.fm/tags/ai</link>
    <pubDate>Sat, 22 Nov 2025 18:30:00 +1100</pubDate>
    <description>A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.
</description>
    <language>en-us</language>
    <itunes:type>episodic</itunes:type>
    <itunes:subtitle>a data podcast with hugo bowne-anderson</itunes:subtitle>
    <itunes:author>Hugo Bowne-Anderson</itunes:author>
    <itunes:summary>A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.
</itunes:summary>
    <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
    <itunes:explicit>no</itunes:explicit>
    <itunes:keywords>data science, machine learning, AI</itunes:keywords>
    <itunes:owner>
      <itunes:name>Hugo Bowne-Anderson</itunes:name>
      <itunes:email>hugobowne@hey.com</itunes:email>
    </itunes:owner>
<itunes:category text="Technology"/>
<item>
  <title>Episode 63: Why Gemini 3 Will Change How You Build AI Agents with Ravin Kumar (Google DeepMind)</title>
  <link>https://vanishinggradients.fireside.fm/63</link>
  <guid isPermaLink="false">cc45813e-f7ec-434d-a816-e7a5dfba8946</guid>
  <pubDate>Sat, 22 Nov 2025 18:30:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/cc45813e-f7ec-434d-a816-e7a5dfba8946.mp3" length="117720514" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely.

Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of models like Gemini 3 is changing how we build software. They detail the shift from simple tool calling to building reliable "Agent Harnesses", explore the architectural tradeoffs between deterministic workflows and high-agency systems, the nuance of preventing context rot in massive windows, and why proper evaluation infrastructure is the only way to manage the chaos of autonomous loops.</itunes:subtitle>
  <itunes:duration>1:00:12</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely.
Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of models like Gemini 3 is changing how we build software. They detail the shift from simple tool calling to building reliable "Agent Harnesses", explore the architectural tradeoffs between deterministic workflows and high-agency systems, the nuance of preventing context rot in massive windows, and why proper evaluation infrastructure is the only way to manage the chaos of autonomous loops.
They talk through:
- The implications of models that can "self-heal" and fix their own code
- The two cultures of agents: LLM workflows with a few tools versus when you should unleash high-agency, autonomous systems.
- Inside NotebookLM: moving from prototypes to viral production features like Audio Overviews
- Why Needle in a Haystack benchmarks often fail to predict real-world performance
- How to build agent harnesses that turn model capabilities into product velocity
- The shift from measuring latency to managing time-to-compute for reasoning tasks
LINKS
From Context Engineering to AI Agent Harnesses: The New Software Discipline, a podcast Hugo did with Lance Martin, LangChain (https://high-signal.delphina.ai/episode/context-engineering-to-ai-agent-harnesses-the-new-software-discipline)
Context Rot: How Increasing Input Tokens Impacts LLM Performance (https://research.trychroma.com/context-rot)
Effective context engineering for AI agents by Anthropic (https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtu.be/CloimQsQuJM)
Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners) (https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs): https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs 
</description>
  <itunes:keywords>ai, genai, machine learning</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely.</p>

<p>Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of models like Gemini 3 is changing how we build software. They detail the shift from simple tool calling to building reliable &quot;Agent Harnesses&quot;, explore the architectural tradeoffs between deterministic workflows and high-agency systems, the nuance of preventing context rot in massive windows, and why proper evaluation infrastructure is the only way to manage the chaos of autonomous loops.</p>

<p>They talk through:</p>

<ul>
<li>The implications of models that can &quot;self-heal&quot; and fix their own code</li>
<li>The two cultures of agents: LLM workflows with a few tools versus when you should unleash high-agency, autonomous systems.</li>
<li>Inside NotebookLM: moving from prototypes to viral production features like Audio Overviews</li>
<li>Why Needle in a Haystack benchmarks often fail to predict real-world performance</li>
<li>How to build agent harnesses that turn model capabilities into product velocity</li>
<li>The shift from measuring latency to managing time-to-compute for reasoning tasks</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://high-signal.delphina.ai/episode/context-engineering-to-ai-agent-harnesses-the-new-software-discipline" rel="nofollow">From Context Engineering to AI Agent Harnesses: The New Software Discipline, a podcast Hugo did with Lance Martin, LangChain</a></li>
<li><a href="https://research.trychroma.com/context-rot" rel="nofollow">Context Rot: How Increasing Input Tokens Impacts LLM Performance</a></li>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="nofollow">Effective context engineering for AI agents by Anthropic</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/CloimQsQuJM" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely.</p>

<p>Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of models like Gemini 3 is changing how we build software. They detail the shift from simple tool calling to building reliable &quot;Agent Harnesses&quot;, explore the architectural tradeoffs between deterministic workflows and high-agency systems, the nuance of preventing context rot in massive windows, and why proper evaluation infrastructure is the only way to manage the chaos of autonomous loops.</p>

<p>They talk through:</p>

<ul>
<li>The implications of models that can &quot;self-heal&quot; and fix their own code</li>
<li>The two cultures of agents: LLM workflows with a few tools versus when you should unleash high-agency, autonomous systems.</li>
<li>Inside NotebookLM: moving from prototypes to viral production features like Audio Overviews</li>
<li>Why Needle in a Haystack benchmarks often fail to predict real-world performance</li>
<li>How to build agent harnesses that turn model capabilities into product velocity</li>
<li>The shift from measuring latency to managing time-to-compute for reasoning tasks</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://high-signal.delphina.ai/episode/context-engineering-to-ai-agent-harnesses-the-new-software-discipline" rel="nofollow">From Context Engineering to AI Agent Harnesses: The New Software Discipline, a podcast Hugo did with Lance Martin, LangChain</a></li>
<li><a href="https://research.trychroma.com/context-rot" rel="nofollow">Context Rot: How Increasing Input Tokens Impacts LLM Performance</a></li>
<li><a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="nofollow">Effective context engineering for AI agents by Anthropic</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/CloimQsQuJM" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 62: Practical AI at Work: How Execs and Developers Can Actually Use LLMs</title>
  <link>https://vanishinggradients.fireside.fm/62</link>
  <guid isPermaLink="false">e1d21cdd-f714-4910-9696-60086f5feb62</guid>
  <pubDate>Fri, 31 Oct 2025 18:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/e1d21cdd-f714-4910-9696-60086f5feb62.mp3" length="85069031" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Many leaders are trapped between chasing ambitious, ill-defined AI projects and the paralysis of not knowing where to start. Dr. Randall Olson argues that the real opportunity isn't in moonshots, but in the "trillions of dollars of business value" available right now. As co-founder of Wyrd Studios, he bridges the gap between data science, AI engineering, and executive strategy to deliver a practical framework for execution.
</itunes:subtitle>
  <itunes:duration>59:04</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Many leaders are trapped between chasing ambitious, ill-defined AI projects and the paralysis of not knowing where to start. Dr. Randall Olson argues that the real opportunity isn't in moonshots, but in the "trillions of dollars of business value" available right now. As co-founder of Wyrd Studios, he bridges the gap between data science, AI engineering, and executive strategy to deliver a practical framework for execution.
In this episode, Randy and Hugo lay out how to find and solve what might be considered "boring but valuable" problems, like an EdTech company automating 20% of its support tickets with a simple retrieval bot instead of a complex AI tutor. They discuss how to move incrementally along the "agentic spectrum" and why treating AI evaluation with the same rigor as software engineering is non-negotiable for building a disciplined, high-impact AI strategy.
They talk through:
- How a non-technical leader can prototype a complex insurance claim classifier using just photos and a ChatGPT subscription.
- The agentic spectrum: Why you should start by automating meeting summaries before attempting to build fully autonomous agents.
- The practical first step for any executive: Building a personal knowledge base with meeting transcripts and strategy docs to get tailored AI advice.
- Why treating AI evaluation with the same rigor as unit testing is essential for shipping reliable products.
- The organizational shift required to unlock long-term AI gains, even if it means a short-term productivity dip.
LINKS
Randy on LinkedIn (https://www.zenml.io/llmops-database)
Wyrd Studios (https://thewyrdstudios.com/)
Stop Building AI Agents (https://www.decodingai.com/p/stop-building-ai-agents)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtu.be/-YQjKH3wRvc)
🎓 Learn more:
Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners) (https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs): https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs
Next cohort starts November 3: come build with us! 
</description>
  <itunes:keywords>ai, agents, machine learning, data science</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Many leaders are trapped between chasing ambitious, ill-defined AI projects and the paralysis of not knowing where to start. Dr. Randall Olson argues that the real opportunity isn&#39;t in moonshots, but in the &quot;trillions of dollars of business value&quot; available right now. As co-founder of Wyrd Studios, he bridges the gap between data science, AI engineering, and executive strategy to deliver a practical framework for execution.</p>

<p>In this episode, Randy and Hugo lay out how to find and solve what might be considered &quot;boring but valuable&quot; problems, like an EdTech company automating 20% of its support tickets with a simple retrieval bot instead of a complex AI tutor. They discuss how to move incrementally along the &quot;agentic spectrum&quot; and why treating AI evaluation with the same rigor as software engineering is non-negotiable for building a disciplined, high-impact AI strategy.</p>

<p>They talk through:</p>

<ul>
<li>How a non-technical leader can prototype a complex insurance claim classifier using just photos and a ChatGPT subscription.</li>
<li>The agentic spectrum: Why you should start by automating meeting summaries before attempting to build fully autonomous agents.</li>
<li>The practical first step for any executive: Building a personal knowledge base with meeting transcripts and strategy docs to get tailored AI advice.</li>
<li>Why treating AI evaluation with the same rigor as unit testing is essential for shipping reliable products.</li>
<li>The organizational shift required to unlock long-term AI gains, even if it means a short-term productivity dip.</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">Randy on LinkedIn</a></li>
<li><a href="https://thewyrdstudios.com/" rel="nofollow">Wyrd Studios</a></li>
<li><a href="https://www.decodingai.com/p/stop-building-ai-agents" rel="nofollow">Stop Building AI Agents</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/-YQjKH3wRvc" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p>🎓 Learn more:</p>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>

<p>Next cohort starts November 3: come build with us!</p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Many leaders are trapped between chasing ambitious, ill-defined AI projects and the paralysis of not knowing where to start. Dr. Randall Olson argues that the real opportunity isn&#39;t in moonshots, but in the &quot;trillions of dollars of business value&quot; available right now. As co-founder of Wyrd Studios, he bridges the gap between data science, AI engineering, and executive strategy to deliver a practical framework for execution.</p>

<p>In this episode, Randy and Hugo lay out how to find and solve what might be considered &quot;boring but valuable&quot; problems, like an EdTech company automating 20% of its support tickets with a simple retrieval bot instead of a complex AI tutor. They discuss how to move incrementally along the &quot;agentic spectrum&quot; and why treating AI evaluation with the same rigor as software engineering is non-negotiable for building a disciplined, high-impact AI strategy.</p>

<p>They talk through:</p>

<ul>
<li>How a non-technical leader can prototype a complex insurance claim classifier using just photos and a ChatGPT subscription.</li>
<li>The agentic spectrum: Why you should start by automating meeting summaries before attempting to build fully autonomous agents.</li>
<li>The practical first step for any executive: Building a personal knowledge base with meeting transcripts and strategy docs to get tailored AI advice.</li>
<li>Why treating AI evaluation with the same rigor as unit testing is essential for shipping reliable products.</li>
<li>The organizational shift required to unlock long-term AI gains, even if it means a short-term productivity dip.</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">Randy on LinkedIn</a></li>
<li><a href="https://thewyrdstudios.com/" rel="nofollow">Wyrd Studios</a></li>
<li><a href="https://www.decodingai.com/p/stop-building-ai-agents" rel="nofollow">Stop Building AI Agents</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/-YQjKH3wRvc" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p>🎓 Learn more:</p>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>

<p>Next cohort starts November 3: come build with us!</p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 61: The AI Agent Reliability Cliff: What Happens When Tools Fail in Production</title>
  <link>https://vanishinggradients.fireside.fm/61</link>
  <guid isPermaLink="false">66d8da7e-5291-4273-8a87-c956fdf2f784</guid>
  <pubDate>Thu, 16 Oct 2025 14:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/66d8da7e-5291-4273-8a87-c956fdf2f784.mp3" length="55333020" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Most AI teams find their multi-agent systems devolving into chaos, but ML Engineer Alex Strick van Linschoten argues they are ignoring the production reality. In this episode, he draws on insights from the LLMOps Database (750+ real-world deployments then; now nearly 1,000!) to systematically measure and engineer constraints, turning unreliable prototypes into robust, enterprise-ready AI.</itunes:subtitle>
  <itunes:duration>28:04</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Most AI teams find their multi-agent systems devolving into chaos, but ML Engineer Alex Strick van Linschoten argues they are ignoring the production reality. In this episode, he draws on insights from the LLMOps Database (750+ real-world deployments then; now nearly 1,000!) to systematically measure and engineer constraints, turning unreliable prototypes into robust, enterprise-ready AI.
Drawing from his work at ZenML, Alex details why success requires scaling down and enforcing MLOps discipline to navigate the unpredictable "Agent Reliability Cliff". He provides the essential architectural shifts, evaluation hygiene techniques, and practical steps needed to move beyond guesswork and build scalable, trustworthy AI products.
We talk through:
- Why "shoving a thousand agents" into an app is the fastest route to unmanageable chaos
- The essential MLOps hygiene (tracing and continuous evals) that most teams skip
- The optimal (and very low) limit for the number of tools an agent can reliably use
- How to use human-in-the-loop strategies to manage the risk of autonomous failure in high-sensitivity domains
- The principle of using simple Python/RegEx before resorting to costly LLM judges
LINKS
The LLMOps Database: 925 entries as of today... submit a use case to help it get to 1K! (https://www.zenml.io/llmops-database)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtu.be/-YQjKH3wRvc)
🎓 Learn more:
Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners) (https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs): https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs 
</description>
  <itunes:keywords>ai, agents, mlops, machine learning</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Most AI teams find their multi-agent systems devolving into chaos, but ML Engineer Alex Strick van Linschoten argues they are ignoring the production reality. In this episode, he draws on insights from the LLMOps Database (750+ real-world deployments then; now nearly 1,000!) to systematically measure and engineer constraints, turning unreliable prototypes into robust, enterprise-ready AI.</p>

<p>Drawing from his work at ZenML, Alex details why success requires scaling down and enforcing MLOps discipline to navigate the unpredictable &quot;Agent Reliability Cliff&quot;. He provides the essential architectural shifts, evaluation hygiene techniques, and practical steps needed to move beyond guesswork and build scalable, trustworthy AI products.</p>

<p>We talk through:</p>

<ul>
<li>Why &quot;shoving a thousand agents&quot; into an app is the fastest route to unmanageable chaos</li>
<li>The essential MLOps hygiene (tracing and continuous evals) that most teams skip</li>
<li>The optimal (and very low) limit for the number of tools an agent can reliably use</li>
<li>How to use human-in-the-loop strategies to manage the risk of autonomous failure in high-sensitivity domains</li>
<li>The principle of using simple Python/RegEx before resorting to costly LLM judges</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">The LLMOps Database: 925 entries as of today... submit a use case to help it get to 1K!</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/-YQjKH3wRvc" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p>🎓 Learn more:</p>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Most AI teams find their multi-agent systems devolving into chaos, but ML Engineer Alex Strick van Linschoten argues they are ignoring the production reality. In this episode, he draws on insights from the LLMOps Database (750+ real-world deployments then; now nearly 1,000!) to systematically measure and engineer constraints, turning unreliable prototypes into robust, enterprise-ready AI.</p>

<p>Drawing from his work at ZenML, Alex details why success requires scaling down and enforcing MLOps discipline to navigate the unpredictable &quot;Agent Reliability Cliff&quot;. He provides the essential architectural shifts, evaluation hygiene techniques, and practical steps needed to move beyond guesswork and build scalable, trustworthy AI products.</p>

<p>We talk through:</p>

<ul>
<li>Why &quot;shoving a thousand agents&quot; into an app is the fastest route to unmanageable chaos</li>
<li>The essential MLOps hygiene (tracing and continuous evals) that most teams skip</li>
<li>The optimal (and very low) limit for the number of tools an agent can reliably use</li>
<li>How to use human-in-the-loop strategies to manage the risk of autonomous failure in high-sensitivity domains</li>
<li>The principle of using simple Python/RegEx before resorting to costly LLM judges</li>
</ul>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">The LLMOps Database: 925 entries as of today... submit a use case to help it get to 1K!</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtu.be/-YQjKH3wRvc" rel="nofollow">Watch the podcast video on YouTube</a></li>
</ul>

<p>🎓 Learn more:</p>

<p><a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">Join the final cohort of our Building AI Applications course starting March 10, 2026 (25% off for listeners)</a>: <a href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs" rel="nofollow">https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 60: 10 Things I Hate About AI Evals with Hamel Husain</title>
  <link>https://vanishinggradients.fireside.fm/60</link>
  <guid isPermaLink="false">0fbc2a65-3bfc-4f8a-83ac-d370f1a30e13</guid>
  <pubDate>Tue, 30 Sep 2025 17:30:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/0fbc2a65-3bfc-4f8a-83ac-d370f1a30e13.mp3" length="105505355" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Most AI teams find "evals" frustrating, but ML Engineer Hamel Husain argues they’re just using the wrong playbook. In this episode, he lays out a data-centric approach to systematically measure and improve AI, turning unreliable prototypes into robust, production-ready systems.
</itunes:subtitle>
  <itunes:duration>1:13:15</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Most AI teams find "evals" frustrating, but ML Engineer Hamel Husain argues they’re just using the wrong playbook. In this episode, he lays out a data-centric approach to systematically measure and improve AI, turning unreliable prototypes into robust, production-ready systems.
Drawing from his experience getting countless teams unstuck, Hamel explains why the solution requires a "revenge of the data scientists." He details the essential mindset shifts, error analysis techniques, and practical steps needed to move beyond guesswork and build AI products you can actually trust.
We talk through:
- The 10(+1) critical mistakes that cause teams to waste time on evals
- Why "hallucination scores" are a waste of time (and what to measure instead)
- The manual review process that finds major issues in hours, not weeks
- A step-by-step method for building LLM judges you can actually trust
- How to use domain experts without getting stuck in endless review committees
- Guest Bryan Bischof's "Failure as a Funnel" for debugging complex AI agents
If you're tired of ambiguous "vibe checks" and want a clear process that delivers real improvement, this episode provides the definitive roadmap.
LINKS
Hamel's website and blog (https://hamel.dev/)
Hugo speaks with Philip Carter (Honeycomb) about aligning your LLM-as-a-judge with your domain expertise (https://vanishinggradients.fireside.fm/51)
Hamel Husain on Lenny's podcast, which includes a live demo of error analysis (https://www.lennysnewsletter.com/p/why-ai-evals-are-the-hottest-new-skill)
The episode of VG in which Hamel and Hugo talk about Hamel's "data consulting in Vegas" era (https://vanishinggradients.fireside.fm/9)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Watch the podcast video on YouTube (https://youtube.com/live/QEk-XwrkqhI?feature=share)
Hamel's AI evals course, which he teaches with Shreya Shankar (UC Berkeley): starts Oct 6 and this link gives 35% off! (https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME) https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — https://maven.com/s/course/d56067f338  
</description>
  <itunes:keywords>AI, GenAI, LLMs, data science, machine learning, evals</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Most AI teams find &quot;evals&quot; frustrating, but ML Engineer Hamel Husain argues they’re just using the wrong playbook. In this episode, he lays out a data-centric approach to systematically measure and improve AI, turning unreliable prototypes into robust, production-ready systems.</p>

<p>Drawing from his experience getting countless teams unstuck, Hamel explains why the solution requires a &quot;revenge of the data scientists.&quot; He details the essential mindset shifts, error analysis techniques, and practical steps needed to move beyond guesswork and build AI products you can actually trust.</p>

<p>We talk through:</p>

<ul>
<li>  The 10(+1) critical mistakes that cause teams to waste time on evals</li>
<li>  Why &quot;hallucination scores&quot; are a waste of time (and what to measure instead)</li>
<li>  The manual review process that finds major issues in hours, not weeks</li>
<li>  A step-by-step method for building LLM judges you can actually trust</li>
<li>  How to use domain experts without getting stuck in endless review committees</li>
<li>  Guest Bryan Bischof&#39;s &quot;Failure as a Funnel&quot; for debugging complex AI agents</li>
</ul>

<p>If you&#39;re tired of ambiguous &quot;vibe checks&quot; and want a clear process that delivers real improvement, this episode provides the definitive roadmap.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://hamel.dev/" rel="nofollow">Hamel&#39;s website and blog</a></li>
<li><a href="https://vanishinggradients.fireside.fm/51" rel="nofollow">Hugo speaks with Philip Carter (Honeycomb) about aligning your LLM-as-a-judge with your domain expertise</a></li>
<li><a href="https://www.lennysnewsletter.com/p/why-ai-evals-are-the-hottest-new-skill" rel="nofollow">Hamel Husain on Lenny&#39;s podcast, which includes a live demo of error analysis</a></li>
<li><a href="https://vanishinggradients.fireside.fm/9" rel="nofollow">The episode of VG in which Hamel and Hugo talk about Hamel&#39;s &quot;data consulting in Vegas&quot; era</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtube.com/live/QEk-XwrkqhI?feature=share" rel="nofollow">Watch the podcast video on YouTube</a></li>
<li><a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">Hamel&#39;s AI evals course, which he teaches with Shreya Shankar (UC Berkeley): starts Oct 6 and this link gives 35% off!</a> <a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a> </li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Most AI teams find &quot;evals&quot; frustrating, but ML Engineer Hamel Husain argues they’re just using the wrong playbook. In this episode, he lays out a data-centric approach to systematically measure and improve AI, turning unreliable prototypes into robust, production-ready systems.</p>

<p>Drawing from his experience getting countless teams unstuck, Hamel explains why the solution requires a &quot;revenge of the data scientists.&quot; He details the essential mindset shifts, error analysis techniques, and practical steps needed to move beyond guesswork and build AI products you can actually trust.</p>

<p>We talk through:</p>

<ul>
<li>  The 10(+1) critical mistakes that cause teams to waste time on evals</li>
<li>  Why &quot;hallucination scores&quot; are a waste of time (and what to measure instead)</li>
<li>  The manual review process that finds major issues in hours, not weeks</li>
<li>  A step-by-step method for building LLM judges you can actually trust</li>
<li>  How to use domain experts without getting stuck in endless review committees</li>
<li>  Guest Bryan Bischof&#39;s &quot;Failure as a Funnel&quot; for debugging complex AI agents</li>
</ul>

<p>If you&#39;re tired of ambiguous &quot;vibe checks&quot; and want a clear process that delivers real improvement, this episode provides the definitive roadmap.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://hamel.dev/" rel="nofollow">Hamel&#39;s website and blog</a></li>
<li><a href="https://vanishinggradients.fireside.fm/51" rel="nofollow">Hugo speaks with Philip Carter (Honeycomb) about aligning your LLM-as-a-judge with your domain expertise</a></li>
<li><a href="https://www.lennysnewsletter.com/p/why-ai-evals-are-the-hottest-new-skill" rel="nofollow">Hamel Husain on Lenny&#39;s podcast, which includes a live demo of error analysis</a></li>
<li><a href="https://vanishinggradients.fireside.fm/9" rel="nofollow">The episode of VG in which Hamel and Hugo talk about Hamel&#39;s &quot;data consulting in Vegas&quot; era</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://youtube.com/live/QEk-XwrkqhI?feature=share" rel="nofollow">Watch the podcast video on YouTube</a></li>
<li><a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">Hamel&#39;s AI evals course, which he teaches with Shreya Shankar (UC Berkeley): starts Oct 6 and this link gives 35% off!</a> <a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a> </li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 59: Patterns and Anti-Patterns For Building with AI</title>
  <link>https://vanishinggradients.fireside.fm/59</link>
  <guid isPermaLink="false">7f895ff2-65c7-47f3-8c9f-0e632e93e31f</guid>
  <pubDate>Wed, 24 Sep 2025 09:30:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/7f895ff2-65c7-47f3-8c9f-0e632e93e31f.mp3" length="93434800" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>John Berryman has seen what works — and what breaks — when building AI applications. In this episode, he shares the “seven deadly sins” of LLM development and the fixes that keep projects from falling apart. From context management to retrieval debugging, John explains the patterns he’s seen succeed, the mistakes to avoid, and why it helps to think of an LLM as an “AI intern” rather than an all-knowing oracle.  </itunes:subtitle>
  <itunes:duration>47:37</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>John Berryman (Arcturus Labs; early GitHub Copilot engineer; co-author of Relevant Search and Prompt Engineering for LLMs) has spent years figuring out what makes AI applications actually work in production. In this episode, he shares the “seven deadly sins” of LLM development — and the practical fixes that keep projects from stalling.  
From context management to retrieval debugging, John explains the patterns he’s seen succeed, the mistakes to avoid, and why it helps to think of an LLM as an “AI intern” rather than an all-knowing oracle.  
We talk through:  
- Why chasing perfect accuracy is a dead end  
- How to use agents without losing control  
- Context engineering: fitting the right information in the window  
- Starting simple instead of over-orchestrating  
- Separating retrieval from generation in RAG  
- Splitting complex extractions into smaller checks  
- Knowing when frameworks help — and when they slow you down  
A practical guide to avoiding the common traps of LLM development and building systems that actually hold up in production.
LINKS:
Context Engineering for AI Agents, a free, upcoming lightning lesson from John and Hugo (https://maven.com/p/4485aa/context-engineering-for-ai-agents)
The Hidden Simplicity of GenAI Systems, a previous lightning lesson from John and Hugo (https://maven.com/p/a8195d/the-hidden-simplicity-of-gen-ai-systems)
Roaming RAG – RAG without the Vector Database, by John (https://arcturus-labs.com/blog/2024/11/21/roaming-rag--rag-without-the-vector-database/)
Cut the Chit-Chat with Artifacts, by John (https://arcturus-labs.com/blog/2024/11/11/cut-the-chit-chat-with-artifacts/)
Prompt Engineering for LLMs by John and Albert Ziegler (https://amzn.to/4gChsFf)
Relevant Search by John and Doug Turnbull (https://amzn.to/3TXmDHk)
Arcturus Labs (https://arcturus-labs.com/)
Watch the podcast on YouTube (https://youtu.be/mKTQGKIUq8M)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
🎓 Learn more:
Hugo's course (this episode was a guest Q&amp;A from the course): Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — https://maven.com/s/course/d56067f338  
</description>
  <itunes:keywords>AI, machine learning, data science</itunes:keywords>
  <content:encoded>
    <![CDATA[<p><strong>John Berryman</strong> (Arcturus Labs; early GitHub Copilot engineer; co-author of <em>Relevant Search</em> and <em>Prompt Engineering for LLMs</em>) has spent years figuring out what makes AI applications actually work in production. In this episode, he shares the “seven deadly sins” of LLM development — and the practical fixes that keep projects from stalling.  </p>

<p>From context management to retrieval debugging, John explains the patterns he’s seen succeed, the mistakes to avoid, and why it helps to think of an LLM as an “AI intern” rather than an all-knowing oracle.  </p>

<p><strong>We talk through:</strong>  </p>

<ul>
<li>Why chasing perfect accuracy is a dead end<br></li>
<li>How to use agents without losing control<br></li>
<li>Context engineering: fitting the right information in the window<br></li>
<li>Starting simple instead of over-orchestrating<br></li>
<li>Separating retrieval from generation in RAG<br></li>
<li>Splitting complex extractions into smaller checks<br></li>
<li>Knowing when frameworks help — and when they slow you down<br></li>
</ul>

<p>A practical guide to avoiding the common traps of LLM development and building systems that actually hold up in production.</p>

<p><strong>LINKS:</strong></p>

<ul>
<li><a href="https://maven.com/p/4485aa/context-engineering-for-ai-agents" rel="nofollow">Context Engineering for AI Agents, a free, upcoming lightning lesson from John and Hugo</a></li>
<li><a href="https://maven.com/p/a8195d/the-hidden-simplicity-of-gen-ai-systems" rel="nofollow">The Hidden Simplicity of GenAI Systems, a previous lightning lesson from John and Hugo</a></li>
<li><a href="https://arcturus-labs.com/blog/2024/11/21/roaming-rag--rag-without-the-vector-database/" rel="nofollow">Roaming RAG – RAG without the Vector Database, by John</a></li>
<li><a href="https://arcturus-labs.com/blog/2024/11/11/cut-the-chit-chat-with-artifacts/" rel="nofollow">Cut the Chit-Chat with Artifacts, by John</a></li>
<li><a href="https://amzn.to/4gChsFf" rel="nofollow">Prompt Engineering for LLMs by John and Albert Ziegler</a></li>
<li><a href="https://amzn.to/3TXmDHk" rel="nofollow">Relevant Search by John and Doug Turnbull</a></li>
<li><a href="https://arcturus-labs.com/" rel="nofollow">Arcturus Labs</a></li>
<li><a href="https://youtu.be/mKTQGKIUq8M" rel="nofollow">Watch the podcast on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course (this episode was a guest Q&amp;A from the course):</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a> </li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p><strong>John Berryman</strong> (Arcturus Labs; early GitHub Copilot engineer; co-author of <em>Relevant Search</em> and <em>Prompt Engineering for LLMs</em>) has spent years figuring out what makes AI applications actually work in production. In this episode, he shares the “seven deadly sins” of LLM development — and the practical fixes that keep projects from stalling.  </p>

<p>From context management to retrieval debugging, John explains the patterns he’s seen succeed, the mistakes to avoid, and why it helps to think of an LLM as an “AI intern” rather than an all-knowing oracle.  </p>

<p><strong>We talk through:</strong>  </p>

<ul>
<li>Why chasing perfect accuracy is a dead end<br></li>
<li>How to use agents without losing control<br></li>
<li>Context engineering: fitting the right information in the window<br></li>
<li>Starting simple instead of over-orchestrating<br></li>
<li>Separating retrieval from generation in RAG<br></li>
<li>Splitting complex extractions into smaller checks<br></li>
<li>Knowing when frameworks help — and when they slow you down<br></li>
</ul>

<p>A practical guide to avoiding the common traps of LLM development and building systems that actually hold up in production.</p>

<p><strong>LINKS:</strong></p>

<ul>
<li><a href="https://maven.com/p/4485aa/context-engineering-for-ai-agents" rel="nofollow">Context Engineering for AI Agents, a free, upcoming lightning lesson from John and Hugo</a></li>
<li><a href="https://maven.com/p/a8195d/the-hidden-simplicity-of-gen-ai-systems" rel="nofollow">The Hidden Simplicity of GenAI Systems, a previous lightning lesson from John and Hugo</a></li>
<li><a href="https://arcturus-labs.com/blog/2024/11/21/roaming-rag--rag-without-the-vector-database/" rel="nofollow">Roaming RAG – RAG without the Vector Database, by John</a></li>
<li><a href="https://arcturus-labs.com/blog/2024/11/11/cut-the-chit-chat-with-artifacts/" rel="nofollow">Cut the Chit-Chat with Artifacts, by John</a></li>
<li><a href="https://amzn.to/4gChsFf" rel="nofollow">Prompt Engineering for LLMs by John and Albert Ziegler</a></li>
<li><a href="https://amzn.to/3TXmDHk" rel="nofollow">Relevant Search by John and Doug Turnbull</a></li>
<li><a href="https://arcturus-labs.com/" rel="nofollow">Arcturus Labs</a></li>
<li><a href="https://youtu.be/mKTQGKIUq8M" rel="nofollow">Watch the podcast on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course (this episode was a guest Q&amp;A from the course):</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a> </li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 54: Scaling AI: From Colab to Clusters — A Practitioner’s Guide to Distributed Training and Inference</title>
  <link>https://vanishinggradients.fireside.fm/54</link>
  <guid isPermaLink="false">151b5251-bd41-4528-87bf-763165b8ccc7</guid>
  <pubDate>Sat, 19 Jul 2025 02:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/151b5251-bd41-4528-87bf-763165b8ccc7.mp3" length="59469240" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Colab is cozy. But production won’t fit on a single GPU. Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software.</itunes:subtitle>
  <itunes:duration>41:17</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Colab is cozy. But production won’t fit on a single GPU.
Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software.
We talk through:
    • From Colab to clusters: why scaling isn’t just about training massive models, but serving agents, handling load, and speeding up iteration
    • Zero-to-two GPUs: how to get started without Kubernetes, Slurm, or a PhD in networking
    • Scaling tradeoffs: when to care about interconnects, which infra bottlenecks actually matter, and how to avoid chasing performance ghosts
    • The GPU middle class: strategies for training and serving on a shoestring, with just a few cards or modest credits
    • Local experiments, global impact: why learning distributed systems—even just a little—can set you apart as an engineer
If you’ve ever stared at a Hugging Face training script and wondered how to run it on something more than your laptop: this one’s for you.
LINKS
Zach on LinkedIn (https://www.linkedin.com/in/zachary-mueller-135257118/)
Hugo's blog post on Stop Building AI Agents (https://www.linkedin.com/posts/hugo-bowne-anderson-045939a5_yesterday-i-posted-about-stop-building-ai-activity-7346942036752613376-b8-t/)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents)
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — https://maven.com/s/course/d56067f338
Zach's course (45% off for VG listeners!): Scratch to Scale: Large-Scale Training in the Modern World (https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39) — https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39
📺 Watch the video version on YouTube: YouTube link (https://youtube.com/live/76NAtzWZ25s?feature=share) 
</description>
  <itunes:keywords>AI, LLM, compute, GenAI</itunes:keywords>
  <content:encoded>
    <![CDATA[<p><strong>Colab is cozy. But production won’t fit on a single GPU.</strong><br>
Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software.</p>

<p>We talk through:<br>
    • From Colab to clusters: why scaling isn’t just about training massive models, but serving agents, handling load, and speeding up iteration<br>
    • Zero-to-two GPUs: how to get started without Kubernetes, Slurm, or a PhD in networking<br>
    • Scaling tradeoffs: when to care about interconnects, which infra bottlenecks actually matter, and how to avoid chasing performance ghosts<br>
    • The GPU middle class: strategies for training and serving on a shoestring, with just a few cards or modest credits<br>
    • Local experiments, global impact: why learning distributed systems—even just a little—can set you apart as an engineer</p>

<p>If you’ve ever stared at a Hugging Face training script and wondered how to run it on something more than your laptop: this one’s for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.linkedin.com/in/zachary-mueller-135257118/" rel="nofollow">Zach on LinkedIn</a></li>
<li><a href="https://www.linkedin.com/posts/hugo-bowne-anderson-045939a5_yesterday-i-posted-about-stop-building-ai-activity-7346942036752613376-b8-t/" rel="nofollow">Hugo&#39;s blog post on Stop Building AI Agents</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/stop-building-agents" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
<li><strong>Zach&#39;s course (45% off for VG listeners!):</strong> <a href="https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39" rel="nofollow">Scratch to Scale: Large-Scale Training in the Modern World</a> — <a href="https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39" rel="nofollow">https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtube.com/live/76NAtzWZ25s?feature=share" rel="nofollow">YouTube link</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p><strong>Colab is cozy. But production won’t fit on a single GPU.</strong><br>
Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software.</p>

<p>We talk through:<br>
    • From Colab to clusters: why scaling isn’t just about training massive models, but serving agents, handling load, and speeding up iteration<br>
    • Zero-to-two GPUs: how to get started without Kubernetes, Slurm, or a PhD in networking<br>
    • Scaling tradeoffs: when to care about interconnects, which infra bottlenecks actually matter, and how to avoid chasing performance ghosts<br>
    • The GPU middle class: strategies for training and serving on a shoestring, with just a few cards or modest credits<br>
    • Local experiments, global impact: why learning distributed systems—even just a little—can set you apart as an engineer</p>

<p>If you’ve ever stared at a Hugging Face training script and wondered how to run it on something more than your laptop: this one’s for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.linkedin.com/in/zachary-mueller-135257118/" rel="nofollow">Zach on LinkedIn</a></li>
<li><a href="https://www.linkedin.com/posts/hugo-bowne-anderson-045939a5_yesterday-i-posted-about-stop-building-ai-activity-7346942036752613376-b8-t/" rel="nofollow">Hugo&#39;s blog post on Stop Building AI Agents</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/stop-building-agents" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
<li><strong>Zach&#39;s course (45% off for VG listeners!):</strong> <a href="https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39" rel="nofollow">Scratch to Scale: Large-Scale Training in the Modern World</a> — <a href="https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39" rel="nofollow">https://maven.com/walk-with-code/scratch-to-scale?promoCode=hugo39</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtube.com/live/76NAtzWZ25s?feature=share" rel="nofollow">YouTube link</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 52: Why Most LLM Products Break at Retrieval (And How to Fix Them) </title>
  <link>https://vanishinggradients.fireside.fm/52</link>
  <guid isPermaLink="false">258dd611-e817-4971-a655-f07343b967e4</guid>
  <pubDate>Thu, 03 Jul 2025 02:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/258dd611-e817-4971-a655-f07343b967e4.mp3" length="27489267" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries?

In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs.</itunes:subtitle>
  <itunes:duration>28:38</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries?
In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs.
We cover:
• How to align retrieval with user intent and why cosine similarity is not the answer
• How a dumb YAML-based system outperformed so-called smart retrieval pipelines
• Why vague queries like “what is this all about” expose real weaknesses in most systems
• When vibe checks are enough and when formal evaluation is worth the effort
• How retrieval workflows can evolve alongside your product and user needs
If you are building LLM-powered systems and care about how they work, not just whether they work, this one is for you.
LINKS
Eric's website (https://ericmjl.github.io/)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents)
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — next cohort starts July 8: https://maven.com/s/course/d56067f338
📺 Watch the video version on YouTube: YouTube link (https://youtu.be/d-FaR5Ywd5k)
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs, GenAI</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries?</p>

<p>In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs.</p>

<p>We cover:<br>
• How to align retrieval with user intent and why cosine similarity is not the answer<br>
• How a dumb YAML-based system outperformed so-called smart retrieval pipelines<br>
• Why vague queries like “what is this all about” expose real weaknesses in most systems<br>
• When vibe checks are enough and when formal evaluation is worth the effort<br>
• How retrieval workflows can evolve alongside your product and user needs</p>

<p>If you are building LLM-powered systems and care about how they work, not just whether they work, this one is for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://ericmjl.github.io/" rel="nofollow">Eric&#39;s website</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/stop-building-agents" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/d-FaR5Ywd5k" rel="nofollow">YouTube link</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries?</p>

<p>In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs.</p>

<p>We cover:<br>
• How to align retrieval with user intent and why cosine similarity is not the answer<br>
• How a dumb YAML-based system outperformed so-called smart retrieval pipelines<br>
• Why vague queries like “what is this all about” expose real weaknesses in most systems<br>
• When vibe checks are enough and when formal evaluation is worth the effort<br>
• How retrieval workflows can evolve alongside your product and user needs</p>

<p>If you are building LLM-powered systems and care about how they work, not just whether they work, this one is for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://ericmjl.github.io/" rel="nofollow">Eric&#39;s website</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/stop-building-agents" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/d-FaR5Ywd5k" rel="nofollow">YouTube link</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 51: Why We Built an MCP Server and What Broke First</title>
  <link>https://vanishinggradients.fireside.fm/51</link>
  <guid isPermaLink="false">c45cdd9e-56a6-4b90-8ccf-3acd0c697415</guid>
  <pubDate>Fri, 27 Jun 2025 03:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/c45cdd9e-56a6-4b90-8ccf-3acd0c697415.mp3" length="45788781" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data?

In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. More recently, he and his team built a production-ready MCP server.</itunes:subtitle>
  <itunes:duration>47:41</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data?
In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. More recently, he and his team built a production-ready MCP server.
We cover:
    • How to evaluate LLM systems using human-aligned judges
    • The spreadsheet-driven process behind shipping Honeycomb’s first LLM feature
    • The challenges of tool usage, prompt templates, and flaky model behavior
    • Where MCP shows promise, and where it breaks in the real world
If you’re working on LLMs in production, this one’s for you!
LINKS
So We Shipped an AI Product: Did it Work? by Philip Carter (https://www.honeycomb.io/blog/we-shipped-ai-product)
Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/ai-as-a-civilizational-technology)
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — next cohort starts July 8: https://maven.com/s/course/d56067f338
📺 Watch the video version on YouTube: YouTube link (https://youtu.be/JDMzdaZh9Ig) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data?</p>

<p>In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. More recently, he and his team built a production-ready MCP server.</p>

<p>We cover:<br>
    • How to evaluate LLM systems using human-aligned judges<br>
    • The spreadsheet-driven process behind shipping Honeycomb’s first LLM feature<br>
    • The challenges of tool usage, prompt templates, and flaky model behavior<br>
    • Where MCP shows promise, and where it breaks in the real world</p>

<p>If you’re working on LLMs in production, this one’s for you!</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.honeycomb.io/blog/we-shipped-ai-product" rel="nofollow">So We Shipped an AI Product: Did it Work? by Philip Carter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/JDMzdaZh9Ig" rel="nofollow">YouTube link</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data?</p>

<p>In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. More recently, he and his team built a production-ready MCP server.</p>

<p>We cover:<br>
    • How to evaluate LLM systems using human-aligned judges<br>
    • The spreadsheet-driven process behind shipping Honeycomb’s first LLM feature<br>
    • The challenges of tool usage, prompt templates, and flaky model behavior<br>
    • Where MCP shows promise, and where it breaks in the real world</p>

<p>If you’re working on LLMs in production, this one’s for you!</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.honeycomb.io/blog/we-shipped-ai-product" rel="nofollow">So We Shipped an AI Product: Did it Work? by Philip Carter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/JDMzdaZh9Ig" rel="nofollow">YouTube link</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 50: A Field Guide to Rapidly Improving AI Products -- With Hamel Husain</title>
  <link>https://vanishinggradients.fireside.fm/50</link>
  <guid isPermaLink="false">3851d92b-389c-4690-90c3-8a54ad73b7d8</guid>
  <pubDate>Tue, 17 Jun 2025 18:30:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/3851d92b-389c-4690-90c3-8a54ad73b7d8.mp3" length="54176426" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how to improve AI products through evaluation, error analysis, and iteration. They discuss why most teams overlook debugging LLM systems, how to prioritize what to fix, and why evals are not just metrics—but a full development process.</itunes:subtitle>
  <itunes:duration>27:42</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks.
In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The conversation is based on Hamel’s blog post A Field Guide to Rapidly Improving AI Products, which he joined Hugo’s class to discuss.
They cover:
🔍 Why most teams struggle to measure whether their systems are actually improving  
📊 How error analysis helps you prioritize what to fix (and when to write evals)  
🧮 Why evaluation isn’t just a metric — but a full development process  
⚠️ Common mistakes when debugging LLM and agent systems  
🛠️ How to think about the tradeoffs in adding more evals vs. fixing obvious issues  
👥 Why enabling domain experts — not just engineers — can accelerate iteration
If you’ve ever built an AI system and found yourself unsure how to make it better, this conversation is for you.
LINKS
* A Field Guide to Rapidly Improving AI Products by Hamel Husain (https://hamel.dev/blog/posts/field-guide/)
* Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
* Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
* Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/ai-as-a-civilizational-technology)
🎓 Learn more:
Hugo's course: Building LLM Applications for Data Scientists and Software Engineers (https://maven.com/s/course/d56067f338) — next cohort starts July 8: https://maven.com/s/course/d56067f338
Hamel &amp; Shreya's course: AI Evals For Engineers &amp; PMs (https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME) — use code GOHUGORGOHOME for $800 off
📺 Watch the video version on YouTube: YouTube link (https://youtu.be/rWToRi2_SeY) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs, evals</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks.</p>

<p>In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The conversation is based on Hamel’s blog post <em>A Field Guide to Rapidly Improving AI Products</em>, which he joined Hugo’s class to discuss.</p>

<p>They cover:<br>
🔍 Why most teams struggle to measure whether their systems are actually improving<br>
📊 How error analysis helps you prioritize what to fix (and when to write evals)<br>
🧮 Why evaluation isn’t just a metric — but a full development process<br>
⚠️ Common mistakes when debugging LLM and agent systems<br>
🛠️ How to think about the tradeoffs in adding more evals vs. fixing obvious issues<br>
👥 Why enabling domain experts — not just engineers — can accelerate iteration</p>

<p>If you’ve ever built an AI system and found yourself unsure how to make it better, this conversation is for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://hamel.dev/blog/posts/field-guide/" rel="nofollow">A Field Guide to Rapidly Improving AI Products by Hamel Husain</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<hr>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
<li><strong>Hamel &amp; Shreya&#39;s course:</strong> <a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">AI Evals For Engineers &amp; PMs</a> — use code <code>GOHUGORGOHOME</code> for $800 off</li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/rWToRi2_SeY" rel="nofollow">YouTube link</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks.</p>

<p>In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The conversation is based on Hamel’s blog post <em>A Field Guide to Rapidly Improving AI Products</em>, which he joined Hugo’s class to discuss.</p>

<p>They cover:<br>
🔍 Why most teams struggle to measure whether their systems are actually improving<br>
📊 How error analysis helps you prioritize what to fix (and when to write evals)<br>
🧮 Why evaluation isn’t just a metric — but a full development process<br>
⚠️ Common mistakes when debugging LLM and agent systems<br>
🛠️ How to think about the tradeoffs in adding more evals vs. fixing obvious issues<br>
👥 Why enabling domain experts — not just engineers — can accelerate iteration</p>

<p>If you’ve ever built an AI system and found yourself unsure how to make it better, this conversation is for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://hamel.dev/blog/posts/field-guide/" rel="nofollow">A Field Guide to Rapidly Improving AI Products by Hamel Husain</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
</ul>

<hr>

<p>🎓 Learn more:</p>

<ul>
<li><strong>Hugo&#39;s course:</strong> <a href="https://maven.com/s/course/d56067f338" rel="nofollow">Building LLM Applications for Data Scientists and Software Engineers</a> — next cohort starts July 8: <a href="https://maven.com/s/course/d56067f338" rel="nofollow">https://maven.com/s/course/d56067f338</a></li>
<li><strong>Hamel &amp; Shreya&#39;s course:</strong> <a href="https://maven.com/parlance-labs/evals?promoCode=GOHUGORGOHOME" rel="nofollow">AI Evals For Engineers &amp; PMs</a> — use code <code>GOHUGORGOHOME</code> for $800 off</li>
</ul>

<p>📺 <strong>Watch the video version on YouTube:</strong> <a href="https://youtu.be/rWToRi2_SeY" rel="nofollow">YouTube link</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 49: Why Data and AI Still Break at Scale (and What to Do About It)</title>
  <link>https://vanishinggradients.fireside.fm/49</link>
  <guid isPermaLink="false">309762f9-59cd-4f24-bea5-8e692a0d870f</guid>
  <pubDate>Thu, 05 Jun 2025 14:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/309762f9-59cd-4f24-bea5-8e692a0d870f.mp3" length="117738811" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI systems still break at scale—and what it takes to fix them. They dive into the limits of existing workflows, the importance of reproducibility and reactive execution, and how Marimo reimagines notebooks for modern software development.</itunes:subtitle>
  <itunes:duration>1:21:45</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>If we want AI systems that actually work in production, we need better infrastructure—not just better models.
In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execution.
They discuss:
🔁 Why reactive execution matters—and how current tools fall short
🛠️ The design goals behind Marimo, a new kind of Python notebook
⚙️ The hidden costs of traditional workflows (and what breaks at scale)
📦 What it takes to build modular, maintainable AI apps
🧪 Why debugging LLM systems is so hard—and what better tooling looks like
🌍 What we can learn from decades of tools built for and by data practitioners
Toward the end of the episode, Hugo and Akshay walk through two live demos: Hugo shares how he’s been using Marimo to prototype an app that extracts structured data from world leader bios, and Akshay shows how Marimo handles agentic workflows with memory and tool use—built entirely in a notebook.
This episode is about tools, but it’s also about culture. If you’ve ever hit a wall with your current stack—or felt like your tools were working against you—this one’s for you.
LINKS
* marimo | a next-generation Python notebook (https://marimo.io/)
* SciPy conference, 2025 (https://www.scipy2025.scipy.org/)
* Hugo's Marimo World Leader Face Embedding demo (https://www.youtube.com/watch?v=DO21QEcLOxM)
* Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
* Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
* Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/ai-as-a-civilizational-technology)
* Watch the podcast here on YouTube! (https://youtube.com/live/WVxAz19tgZY?feature=share)
🎓 Want to go deeper?
Check out Hugo's course: Building LLM Applications for Data Scientists and Software Engineers.
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.
Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.
Cohort starts July 8 — Use this link for a 10% discount (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>If we want AI systems that actually work in production, we need better infrastructure—not just better models.</p>

<p>In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execution.</p>

<p>They discuss:<br>
🔁 Why reactive execution matters—and how current tools fall short<br>
🛠️ The design goals behind Marimo, a new kind of Python notebook<br>
⚙️ The hidden costs of traditional workflows (and what breaks at scale)<br>
📦 What it takes to build modular, maintainable AI apps<br>
🧪 Why debugging LLM systems is so hard—and what better tooling looks like<br>
🌍 What we can learn from decades of tools built for and by data practitioners</p>

<p>Toward the end of the episode, Hugo and Akshay walk through two live demos: Hugo shares how he’s been using Marimo to prototype an app that extracts structured data from world leader bios, and Akshay shows how Marimo handles agentic workflows with memory and tool use—built entirely in a notebook.</p>

<p>This episode is about tools, but it’s also about culture. If you’ve ever hit a wall with your current stack—or felt like your tools were working against you—this one’s for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://marimo.io/" rel="nofollow">marimo | a next-generation Python notebook</a></li>
<li><a href="https://www.scipy2025.scipy.org/" rel="nofollow">SciPy conference, 2025</a></li>
<li><a href="https://www.youtube.com/watch?v=DO21QEcLOxM" rel="nofollow">Hugo&#39;s Marimo World Leader Face Embedding demo</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
<li><a href="https://youtube.com/live/WVxAz19tgZY?feature=share" rel="nofollow">Watch the podcast here on YouTube!</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out Hugo&#39;s course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>If we want AI systems that actually work in production, we need better infrastructure—not just better models.</p>

<p>In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execution.</p>

<p>They discuss:<br>
🔁 Why reactive execution matters—and how current tools fall short<br>
🛠️ The design goals behind Marimo, a new kind of Python notebook<br>
⚙️ The hidden costs of traditional workflows (and what breaks at scale)<br>
📦 What it takes to build modular, maintainable AI apps<br>
🧪 Why debugging LLM systems is so hard—and what better tooling looks like<br>
🌍 What we can learn from decades of tools built for and by data practitioners</p>

<p>Toward the end of the episode, Hugo and Akshay walk through two live demos: Hugo shares how he’s been using Marimo to prototype an app that extracts structured data from world leader bios, and Akshay shows how Marimo handles agentic workflows with memory and tool use—built entirely in a notebook.</p>

<p>This episode is about tools, but it’s also about culture. If you’ve ever hit a wall with your current stack—or felt like your tools were working against you—this one’s for you.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://marimo.io/" rel="nofollow">marimo | a next-generation Python notebook</a></li>
<li><a href="https://www.scipy2025.scipy.org/" rel="nofollow">SciPy conference, 2025</a></li>
<li><a href="https://www.youtube.com/watch?v=DO21QEcLOxM" rel="nofollow">Hugo&#39;s Marimo World Leader Face Embedding demo</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
<li><a href="https://youtube.com/live/WVxAz19tgZY?feature=share" rel="nofollow">Watch the podcast here on YouTube!</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out Hugo&#39;s course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 48: HOW TO BENCHMARK AGI WITH GREG KAMRADT</title>
  <link>https://vanishinggradients.fireside.fm/48</link>
  <guid isPermaLink="false">f3c73c48-530c-41aa-acd5-d6efafecd27f</guid>
  <pubDate>Fri, 23 May 2025 23:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/f3c73c48-530c-41aa-acd5-d6efafecd27f.mp3" length="126506397" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on François Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that focus on memorization or task completion, ARC is designed to measure generalization—and expose where today’s top models fall short.</itunes:subtitle>
  <itunes:duration>1:04:25</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>If we want to make progress toward AGI, we need a clear definition of intelligence—and a way to measure it.
In this episode, Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on François Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that focus on memorization or task completion, ARC is designed to measure generalization—and expose where today’s top models fall short.
They discuss:
🧠 Why we still lack a shared definition of intelligence
🧪 How ARC tasks force models to learn novel skills at test time
📉 Why GPT-4-class models still underperform on ARC
🔎 The limits of traditional benchmarks like MMLU and Big-Bench
⚙️ What the OpenAI o3 results reveal—and what they don’t
💡 Why generalization and efficiency, not raw capability, are key to AGI
Greg also shares what he’s seeing in the wild: how startups and independent researchers are using ARC as a North Star, how benchmarks shape the frontier, and why the ARC team believes we’ll know we’ve reached AGI when humans can no longer write tasks that models can’t solve.
This conversation is about evaluation—not hype. If you care about where AI is really headed, this one’s worth your time.
LINKS
* ARC Prize -- What is ARC-AGI? (https://arcprize.org/arc-agi)
* On the Measure of Intelligence by François Chollet (https://arxiv.org/abs/1911.01547)
* Greg Kamradt on Twitter (https://x.com/GregKamradt)
* Hugo's High Signal Podcast with Fei-Fei Li (https://high-signal.delphina.ai/episode/fei-fei-on-how-human-centered-ai-actually-gets-built)
* Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
* Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
* Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/ai-as-a-civilizational-technology)
* Watch the podcast here on YouTube! (https://youtu.be/wU82fz4iRfo)
🎓 Want to go deeper?
Check out Hugo's course: Building LLM Applications for Data Scientists and Software Engineers.
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.
Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.
Cohort starts July 8 — Use this link for a 10% discount (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs, AGI</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>If we want to make progress toward AGI, we need a clear definition of intelligence—and a way to measure it.</p>

<p>In this episode, Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on François Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that focus on memorization or task completion, ARC is designed to measure generalization—and expose where today’s top models fall short.</p>

<p>They discuss:<br>
🧠 Why we still lack a shared definition of intelligence<br>
🧪 How ARC tasks force models to learn novel skills at test time<br>
📉 Why GPT-4-class models still underperform on ARC<br>
🔎 The limits of traditional benchmarks like MMLU and Big-Bench<br>
⚙️ What the OpenAI o3 results reveal—and what they don’t<br>
💡 Why generalization and efficiency, not raw capability, are key to AGI</p>

<p>Greg also shares what he’s seeing in the wild: how startups and independent researchers are using ARC as a North Star, how benchmarks shape the frontier, and why the ARC team believes we’ll know we’ve reached AGI when humans can no longer write tasks that models can’t solve.</p>

<p>This conversation is about evaluation—not hype. If you care about where AI is really headed, this one’s worth your time.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://arcprize.org/arc-agi" rel="nofollow">ARC Prize -- What is ARC-AGI?</a></li>
<li><a href="https://arxiv.org/abs/1911.01547" rel="nofollow">On the Measure of Intelligence by François Chollet</a></li>
<li><a href="https://x.com/GregKamradt" rel="nofollow">Greg Kamradt on Twitter</a></li>
<li><a href="https://high-signal.delphina.ai/episode/fei-fei-on-how-human-centered-ai-actually-gets-built" rel="nofollow">Hugo&#39;s High Signal Podcast with Fei-Fei Li</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
<li><a href="https://youtu.be/wU82fz4iRfo" rel="nofollow">Watch the podcast here on YouTube!</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out Hugo&#39;s course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>If we want to make progress toward AGI, we need a clear definition of intelligence—and a way to measure it.</p>

<p>In this episode, Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on François Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that focus on memorization or task completion, ARC is designed to measure generalization—and expose where today’s top models fall short.</p>

<p>They discuss:<br>
🧠 Why we still lack a shared definition of intelligence<br>
🧪 How ARC tasks force models to learn novel skills at test time<br>
📉 Why GPT-4-class models still underperform on ARC<br>
🔎 The limits of traditional benchmarks like MMLU and BIG-bench<br>
⚙️ What the OpenAI o3 results reveal—and what they don’t<br>
💡 Why generalization and efficiency, not raw capability, are key to AGI</p>

<p>Greg also shares what he’s seeing in the wild: how startups and independent researchers are using ARC as a North Star, how benchmarks shape the frontier, and why the ARC team believes we’ll know we’ve reached AGI when humans can no longer write tasks that models can’t solve.</p>

<p>This conversation is about evaluation—not hype. If you care about where AI is really headed, this one’s worth your time.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://arcprize.org/arc-agi" rel="nofollow">ARC Prize -- What is ARC-AGI?</a></li>
<li><a href="https://arxiv.org/abs/1911.01547" rel="nofollow">On the Measure of Intelligence by François Chollet</a></li>
<li><a href="https://x.com/GregKamradt" rel="nofollow">Greg Kamradt on Twitter</a></li>
<li><a href="https://high-signal.delphina.ai/episode/fei-fei-on-how-human-centered-ai-actually-gets-built" rel="nofollow">Hugo&#39;s High Signal Podcast with Fei-Fei Li</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a><br></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
<li><a href="https://hugobowne.substack.com/p/ai-as-a-civilizational-technology" rel="nofollow">Hugo&#39;s recent newsletter about upcoming events and more!</a></li>
<li><a href="https://youtu.be/wU82fz4iRfo" rel="nofollow">Watch the podcast here on YouTube!</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out Hugo&#39;s course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 47: The Great Pacific Garbage Patch of Code Slop with Joe Reis</title>
  <link>https://vanishinggradients.fireside.fm/47</link>
  <guid isPermaLink="false">decc9c1a-f18a-41e9-947a-e58fa0957f1e</guid>
  <pubDate>Mon, 07 Apr 2025 10:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/decc9c1a-f18a-41e9-947a-e58fa0957f1e.mp3" length="76045085" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed?

In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between.</itunes:subtitle>
  <itunes:duration>1:19:12</itunes:duration>
  <itunes:explicit>yes</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed?
In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between.
Joe is the co-author of Fundamentals of Data Engineering and a longtime voice on the systems side of modern software. He’s also one of the sharpest critics of “vibe coding” — the emerging pattern of writing software by feel, with heavy reliance on LLMs and little regard for structure or quality.
We dive into:
    • Why “vibe coding” is more than a meme — and what it says about how we build today
    • How AI tools expand the surface area of software creation — for better and worse
    • What happens to technical debt, testing, and security when generation outpaces understanding
    • The changing definition of “production” in a world of ephemeral, internal, or just-good-enough tools
    • How AI is flattening the learning curve — and threatening the talent pipeline
    • Joe’s view on what real craftsmanship means in an age of disposable code
This conversation isn’t about doom, and it’s not about hype. It’s about mapping the real, messy terrain of what it means to build software today — and how to do it with care.
LINKS
* Joe's Practical Data Modeling Newsletter on Substack (https://practicaldatamodeling.substack.com/)
* Joe's Practical Data Modeling Server on Discord (https://discord.gg/HhSZVvWDBb)
* Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
* Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
🎓 Want to go deeper?
Check out my course: Building LLM Applications for Data Scientists and Software Engineers.
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.
Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.
Cohort starts July 8 — Use this link for a 10% discount (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10) 
</description>
  <itunes:keywords>AI, LLMs, data science, machine learning, GenAI, vibe coding</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed?</p>

<p>In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between.</p>

<p>Joe is the co-author of Fundamentals of Data Engineering and a longtime voice on the systems side of modern software. He’s also one of the sharpest critics of “vibe coding” — the emerging pattern of writing software by feel, with heavy reliance on LLMs and little regard for structure or quality.</p>

<p>We dive into:<br>
    • Why “vibe coding” is more than a meme — and what it says about how we build today<br>
    • How AI tools expand the surface area of software creation — for better and worse<br>
    • What happens to technical debt, testing, and security when generation outpaces understanding<br>
    • The changing definition of “production” in a world of ephemeral, internal, or just-good-enough tools<br>
    • How AI is flattening the learning curve — and threatening the talent pipeline<br>
    • Joe’s view on what real craftsmanship means in an age of disposable code</p>

<p>This conversation isn’t about doom, and it’s not about hype. It’s about mapping the real, messy terrain of what it means to build software today — and how to do it with care.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://practicaldatamodeling.substack.com/" rel="nofollow">Joe&#39;s Practical Data Modeling Newsletter on Substack</a></li>
<li><a href="https://discord.gg/HhSZVvWDBb" rel="nofollow">Joe&#39;s Practical Data Modeling Server on Discord</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a><br></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out my course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed?</p>

<p>In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between.</p>

<p>Joe is the co-author of Fundamentals of Data Engineering and a longtime voice on the systems side of modern software. He’s also one of the sharpest critics of “vibe coding” — the emerging pattern of writing software by feel, with heavy reliance on LLMs and little regard for structure or quality.</p>

<p>We dive into:<br>
    • Why “vibe coding” is more than a meme — and what it says about how we build today<br>
    • How AI tools expand the surface area of software creation — for better and worse<br>
    • What happens to technical debt, testing, and security when generation outpaces understanding<br>
    • The changing definition of “production” in a world of ephemeral, internal, or just-good-enough tools<br>
    • How AI is flattening the learning curve — and threatening the talent pipeline<br>
    • Joe’s view on what real craftsmanship means in an age of disposable code</p>

<p>This conversation isn’t about doom, and it’s not about hype. It’s about mapping the real, messy terrain of what it means to build software today — and how to do it with care.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://practicaldatamodeling.substack.com/" rel="nofollow">Joe&#39;s Practical Data Modeling Newsletter on Substack</a></li>
<li><a href="https://discord.gg/HhSZVvWDBb" rel="nofollow">Joe&#39;s Practical Data Modeling Server on Discord</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a><br></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out my course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $800 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts July 8 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 46: Software Composition Is the New Vibe Coding</title>
  <link>https://vanishinggradients.fireside.fm/46</link>
  <guid isPermaLink="false">dcb8396f-ece2-4636-951c-8ad44d698d15</guid>
  <pubDate>Thu, 03 Apr 2025 13:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/dcb8396f-ece2-4636-951c-8ad44d698d15.mp3" length="99299288" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>What if building software felt more like composing than coding?

In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reason about systems, and collaborate across roles.

Hugo speaks with Greg Ceccarelli—co-founder of SpecStory, former CPO at Pluralsight, and Director of Data Science at GitHub—about the rise of software composition and how it changes the way individuals and teams create with LLMs.</itunes:subtitle>
  <itunes:duration>1:08:57</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>What if building software felt more like composing than coding?
In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reason about systems, and collaborate across roles.
Hugo speaks with Greg Ceccarelli—co-founder of SpecStory, former CPO at Pluralsight, and Director of Data Science at GitHub—about the rise of software composition and how it changes the way individuals and teams create with LLMs.
We dive into:
- Why software composition is emerging as a serious alternative to traditional coding
- The real difference between vibe coding and production-minded prototyping
- How LLMs are expanding who gets to build software—and how
- What changes when you focus on intent, not just code
- What Greg is building with SpecStory to support collaborative, traceable AI-native workflows
- The challenges (and joys) of debugging and exploring with agentic tools like Cursor and Claude
We’ve removed the visual demos from the audio—but you can catch our live-coded Chrome extension and JFK document explorer on YouTube. Links below.
JFK Docs Vibe Coding Demo (YouTube) (https://youtu.be/JpXCkuV58QE)  
Chrome Extension Vibe Coding Demo (YouTube) (https://youtu.be/ESVKp37jDwc)  
Meditations on Tech (Greg’s Substack) (https://www.meditationsontech.com/)  
Simon Willison on Vibe Coding (https://simonwillison.net/2025/Mar/19/vibe-coding/)  
Johno Whitaker: On Vibe Coding (https://johnowhitaker.dev/essays/vibe_coding.html)  
Tim O’Reilly – The End of Programming (https://www.oreilly.com/radar/the-end-of-programming-as-we-know-it/)  
Vanishing Gradients YouTube Channel (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)  
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)  
Greg Ceccarelli on LinkedIn (https://www.linkedin.com/in/gregceccarelli/)  
Greg’s Hacker News Post on GOOD (https://news.ycombinator.com/item?id=43557698)  
SpecStory: GOOD – Git Companion for AI Workflows (https://github.com/specstoryai/getspecstory/blob/main/GOOD.md)
🎓 Want to go deeper?
Check out my course: Building LLM Applications for Data Scientists and Software Engineers.
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.
Includes over $2,500 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.
Cohort starts April 7 — Use this link for a 10% discount (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10)
🔍 Want to help shape the future of SpecStory?
Greg and the team are looking for design partners for their new SpecStory Teams product—built for collaborative, AI-native software development.
If you're working with LLMs in a team setting and want to influence the next wave of developer tools, you can apply here:  
👉 specstory.com/teams (https://specstory.com/teams) 
</description>
  <itunes:keywords>AI, LLMs, data science, machine learning, GenAI, vibe coding</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>What if building software felt more like composing than coding?</p>

<p>In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reason about systems, and collaborate across roles.</p>

<p>Hugo speaks with Greg Ceccarelli—co-founder of SpecStory, former CPO at Pluralsight, and Director of Data Science at GitHub—about the rise of software composition and how it changes the way individuals and teams create with LLMs.</p>

<p>We dive into:</p>

<ul>
<li>Why software composition is emerging as a serious alternative to traditional coding</li>
<li>The real difference between vibe coding and production-minded prototyping</li>
<li>How LLMs are expanding who gets to build software—and how</li>
<li>What changes when you focus on intent, not just code</li>
<li>What Greg is building with SpecStory to support collaborative, traceable AI-native workflows</li>
<li>The challenges (and joys) of debugging and exploring with agentic tools like Cursor and Claude</li>
</ul>

<p>We’ve removed the visual demos from the audio—but you can catch our live-coded Chrome extension and JFK document explorer on YouTube. Links below.</p>

<ul>
<li><a href="https://youtu.be/JpXCkuV58QE" rel="nofollow">JFK Docs Vibe Coding Demo (YouTube)</a><br></li>
<li><a href="https://youtu.be/ESVKp37jDwc" rel="nofollow">Chrome Extension Vibe Coding Demo (YouTube)</a><br></li>
<li><a href="https://www.meditationsontech.com/" rel="nofollow">Meditations on Tech (Greg’s Substack)</a><br></li>
<li><a href="https://simonwillison.net/2025/Mar/19/vibe-coding/" rel="nofollow">Simon Willison on Vibe Coding</a><br></li>
<li><a href="https://johnowhitaker.dev/essays/vibe_coding.html" rel="nofollow">Johno Whitaker: On Vibe Coding</a><br></li>
<li><a href="https://www.oreilly.com/radar/the-end-of-programming-as-we-know-it/" rel="nofollow">Tim O’Reilly – The End of Programming</a><br></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a><br></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a><br></li>
<li><a href="https://www.linkedin.com/in/gregceccarelli/" rel="nofollow">Greg Ceccarelli on LinkedIn</a><br></li>
<li><a href="https://news.ycombinator.com/item?id=43557698" rel="nofollow">Greg’s Hacker News Post on GOOD</a><br></li>
<li><a href="https://github.com/specstoryai/getspecstory/blob/main/GOOD.md" rel="nofollow">SpecStory: GOOD – Git Companion for AI Workflows</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out my course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $2,500 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts April 7 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>

<h3>🔍 Want to help shape the future of SpecStory?</h3>

<p>Greg and the team are looking for <strong>design partners</strong> for their new SpecStory Teams product—built for collaborative, AI-native software development.</p>

<p>If you&#39;re working with LLMs in a team setting and want to influence the next wave of developer tools, you can apply here:<br><br>
👉 <a href="https://specstory.com/teams" rel="nofollow">specstory.com/teams</a></p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>What if building software felt more like composing than coding?</p>

<p>In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reason about systems, and collaborate across roles.</p>

<p>Hugo speaks with Greg Ceccarelli—co-founder of SpecStory, former CPO at Pluralsight, and Director of Data Science at GitHub—about the rise of software composition and how it changes the way individuals and teams create with LLMs.</p>

<p>We dive into:</p>

<ul>
<li>Why software composition is emerging as a serious alternative to traditional coding</li>
<li>The real difference between vibe coding and production-minded prototyping</li>
<li>How LLMs are expanding who gets to build software—and how</li>
<li>What changes when you focus on intent, not just code</li>
<li>What Greg is building with SpecStory to support collaborative, traceable AI-native workflows</li>
<li>The challenges (and joys) of debugging and exploring with agentic tools like Cursor and Claude</li>
</ul>

<p>We’ve removed the visual demos from the audio—but you can catch our live-coded Chrome extension and JFK document explorer on YouTube. Links below.</p>

<ul>
<li><a href="https://youtu.be/JpXCkuV58QE" rel="nofollow">JFK Docs Vibe Coding Demo (YouTube)</a><br></li>
<li><a href="https://youtu.be/ESVKp37jDwc" rel="nofollow">Chrome Extension Vibe Coding Demo (YouTube)</a><br></li>
<li><a href="https://www.meditationsontech.com/" rel="nofollow">Meditations on Tech (Greg’s Substack)</a><br></li>
<li><a href="https://simonwillison.net/2025/Mar/19/vibe-coding/" rel="nofollow">Simon Willison on Vibe Coding</a><br></li>
<li><a href="https://johnowhitaker.dev/essays/vibe_coding.html" rel="nofollow">Johno Whitaker: On Vibe Coding</a><br></li>
<li><a href="https://www.oreilly.com/radar/the-end-of-programming-as-we-know-it/" rel="nofollow">Tim O’Reilly – The End of Programming</a><br></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients YouTube Channel</a><br></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Upcoming Events on Luma</a><br></li>
<li><a href="https://www.linkedin.com/in/gregceccarelli/" rel="nofollow">Greg Ceccarelli on LinkedIn</a><br></li>
<li><a href="https://news.ycombinator.com/item?id=43557698" rel="nofollow">Greg’s Hacker News Post on GOOD</a><br></li>
<li><a href="https://github.com/specstoryai/getspecstory/blob/main/GOOD.md" rel="nofollow">SpecStory: GOOD – Git Companion for AI Workflows</a></li>
</ul>

<p>🎓 Want to go deeper?<br>
Check out my course: <em>Building LLM Applications for Data Scientists and Software Engineers.</em><br>
Learn how to design, test, and deploy production-grade LLM systems — with observability, feedback loops, and structure built in.<br>
This isn’t about vibes or fragile agents. It’s about making LLMs reliable, testable, and actually useful.</p>

<p>Includes over $2,500 in compute credits and guest lectures from experts at DeepMind, Moderna, and more.<br>
Cohort starts April 7 — <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=LLM10" rel="nofollow">Use this link for a 10% discount</a></p>

<h3>🔍 Want to help shape the future of SpecStory?</h3>

<p>Greg and the team are looking for <strong>design partners</strong> for their new SpecStory Teams product—built for collaborative, AI-native software development.</p>

<p>If you&#39;re working with LLMs in a team setting and want to influence the next wave of developer tools, you can apply here:<br><br>
👉 <a href="https://specstory.com/teams" rel="nofollow">specstory.com/teams</a></p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 44: The Future of AI Coding Assistants: Who’s Really in Control?</title>
  <link>https://vanishinggradients.fireside.fm/44</link>
  <guid isPermaLink="false">78988fdd-0e05-4e24-82dd-c0a406dd12a1</guid>
  <pubDate>Tue, 04 Feb 2025 13:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/78988fdd-0e05-4e24-82dd-c0a406dd12a1.mp3" length="90430405" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with Tyler Dunn, CEO and co-founder of Continue, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows.</itunes:subtitle>
  <itunes:duration>1:34:11</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with Tyler Dunn, CEO and co-founder of Continue, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows.
We dive into:
- The trade-offs between proprietary vs. open-source AI coding assistants—why open-source might be the future.
- How structured workflows, modular AI, and customization help developers maintain control over their tools.
- The evolution of AI-powered coding, from autocomplete to intelligent code suggestions and beyond.
- Why the best developer experiences come from sensible defaults with room for deeper configuration.
- The future of LLM-based software engineering, where fine-tuning models on personal and team-level data could make AI coding assistants even more effective.
With companies increasingly integrating AI into development workflows, this conversation explores the real impact of these tools—and the importance of keeping developers in the driver's seat.
LINKS
The podcast livestream on YouTube (https://youtube.com/live/8QEgVCzm46U?feature=share)
Continue's website (https://www.continue.dev/)
Continue is hiring! (https://www.continue.dev/about-us)
amplified.dev: We believe in a future where developers are amplified, not automated (https://amplified.dev/)
Beyond Prompt and Pray, Building Reliable LLM-Powered Software in an Agentic World (https://www.oreilly.com/radar/beyond-prompt-and-pray/)
LLMOps Lessons Learned: Navigating the Wild West of Production LLMs 🚀 (https://www.zenml.io/blog/llmops-lessons-learned-navigating-the-wild-west-of-production-llms)
Building effective agents by Erik Schluntz and Barry Zhang, Anthropic (https://www.anthropic.com/research/building-effective-agents)
Ty on LinkedIn (https://www.linkedin.com/in/tylerjdunn/)
Hugo on Twitter (https://x.com/hugobowne)
Vanishing Gradients on Twitter (https://x.com/vanishingdata)
Vanishing Gradients on YouTube (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)
Vanishing Gradients on Lu.ma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with <strong>Tyler Dunn</strong>, CEO and co-founder of <strong>Continue</strong>, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows.</p>

<p>We dive into:</p>

<ul>
<li>The trade-offs between <strong>proprietary vs. open-source AI coding assistants</strong>—why open-source might be the future.</li>
<li>How structured workflows, modular AI, and customization help developers maintain <strong>control over their tools</strong>.</li>
<li>The evolution of AI-powered coding, from <strong>autocomplete to intelligent code suggestions</strong> and beyond.</li>
<li>Why the best developer experiences come from <strong>sensible defaults</strong> with room for deeper configuration.</li>
<li>The future of <strong>LLM-based software engineering</strong>, where fine-tuning models on personal and team-level data could make AI coding assistants even more effective.</li>
</ul>

<p>With companies increasingly integrating AI into development workflows, this conversation explores the real impact of these tools—and the importance of keeping developers in the driver&#39;s seat.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://youtube.com/live/8QEgVCzm46U?feature=share" rel="nofollow">The podcast livestream on YouTube</a></li>
<li><a href="https://www.continue.dev/" rel="nofollow">Continue&#39;s website</a></li>
<li><a href="https://www.continue.dev/about-us" rel="nofollow">Continue is hiring!</a></li>
<li><a href="https://amplified.dev/" rel="nofollow">amplified.dev: We believe in a future where developers are amplified, not automated</a></li>
<li><a href="https://www.oreilly.com/radar/beyond-prompt-and-pray/" rel="nofollow">Beyond Prompt and Pray, Building Reliable LLM-Powered Software in an Agentic World</a></li>
<li><a href="https://www.zenml.io/blog/llmops-lessons-learned-navigating-the-wild-west-of-production-llms" rel="nofollow">LLMOps Lessons Learned: Navigating the Wild West of Production LLMs 🚀</a></li>
<li><a href="https://www.anthropic.com/research/building-effective-agents" rel="nofollow">Building effective agents by Erik Schluntz and Barry Zhang, Anthropic</a></li>
<li><a href="https://www.linkedin.com/in/tylerjdunn/" rel="nofollow">Ty on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on Twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on Twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Vanishing Gradients on Lu.ma</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with <strong>Tyler Dunn</strong>, CEO and co-founder of <strong>Continue</strong>, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows.</p>

<p>We dive into:</p>

<ul>
<li>The trade-offs between <strong>proprietary vs. open-source AI coding assistants</strong>—why open-source might be the future.</li>
<li>How structured workflows, modular AI, and customization help developers maintain <strong>control over their tools</strong>.</li>
<li>The evolution of AI-powered coding, from <strong>autocomplete to intelligent code suggestions</strong> and beyond.</li>
<li>Why the best developer experiences come from <strong>sensible defaults</strong> with room for deeper configuration.</li>
<li>The future of <strong>LLM-based software engineering</strong>, where fine-tuning models on personal and team-level data could make AI coding assistants even more effective.</li>
</ul>

<p>With companies increasingly integrating AI into development workflows, this conversation explores the real impact of these tools—and the importance of keeping developers in the driver&#39;s seat.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://youtube.com/live/8QEgVCzm46U?feature=share" rel="nofollow">The podcast livestream on YouTube</a></li>
<li><a href="https://www.continue.dev/" rel="nofollow">Continue&#39;s website</a></li>
<li><a href="https://www.continue.dev/about-us" rel="nofollow">Continue is hiring!</a></li>
<li><a href="https://amplified.dev/" rel="nofollow">amplified.dev: We believe in a future where developers are amplified, not automated</a></li>
<li><a href="https://www.oreilly.com/radar/beyond-prompt-and-pray/" rel="nofollow">Beyond Prompt and Pray: Building Reliable LLM-Powered Software in an Agentic World</a></li>
<li><a href="https://www.zenml.io/blog/llmops-lessons-learned-navigating-the-wild-west-of-production-llms" rel="nofollow">LLMOps Lessons Learned: Navigating the Wild West of Production LLMs 🚀</a></li>
<li><a href="https://www.anthropic.com/research/building-effective-agents" rel="nofollow">Building effective agents by Erik Schluntz and Barry Zhang, Anthropic</a></li>
<li><a href="https://www.linkedin.com/in/tylerjdunn/" rel="nofollow">Ty on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Vanishing Gradients on Lu.ma</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 43: Tales from 400+ LLM Deployments: Building Reliable AI Agents in Production</title>
  <link>https://vanishinggradients.fireside.fm/43</link>
  <guid isPermaLink="false">ff9906ad-8576-40c7-9e0f-26dff301e52c</guid>
  <pubDate>Fri, 17 Jan 2025 08:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/ff9906ad-8576-40c7-9e0f-26dff301e52c.mp3" length="58615769" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex's extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn't—when deploying AI agents in production.</itunes:subtitle>
  <itunes:duration>1:01:03</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex's extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn't—when deploying AI agents in production.
In this episode, we dive into:
- The current state of AI agents in production, from successes to common failure modes
- Practical lessons learned from analyzing hundreds of real-world LLM deployments
- How companies like Anthropic, Klarna, and Dropbox are using patterns like ReAct, RAG, and microservices to build reliable systems
- The evolution of LLM capabilities, from expanding context windows to multimodal applications
- Why most companies still prefer structured workflows over fully autonomous agents
We also explore real-world case studies of production hurdles, including cascading failures, API misfires, and hallucination challenges. Alex shares concrete strategies for integrating LLMs into your pipelines while maintaining reliability and control.
Whether you're scaling agents or building LLM-powered systems, this episode offers practical insights for navigating the complex landscape of LLMOps in 2025.
LINKS
The podcast livestream on YouTube (https://youtube.com/live/-8Gr9fVVX9g?feature=share)
The LLMOps database (https://www.zenml.io/llmops-database)
All blog posts about the database (https://www.zenml.io/category/llmops)
Anthropic's Building effective agents essay (https://www.anthropic.com/research/building-effective-agents)
Alex on LinkedIn (https://www.linkedin.com/in/strickvl/)
Hugo on twitter (https://x.com/hugobowne)
Vanishing Gradients on twitter (https://x.com/vanishingdata)
Vanishing Gradients on YouTube (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)
Vanishing Gradients on Lu.ma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex&#39;s extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn&#39;t—when deploying AI agents in production.</p>

<p>In this episode, we dive into:</p>

<ul>
<li>The current state of AI agents in production, from successes to common failure modes</li>
<li>Practical lessons learned from analyzing hundreds of real-world LLM deployments</li>
<li>How companies like Anthropic, Klarna, and Dropbox are using patterns like ReAct, RAG, and microservices to build reliable systems</li>
<li>The evolution of LLM capabilities, from expanding context windows to multimodal applications</li>
<li>Why most companies still prefer structured workflows over fully autonomous agents</li>
</ul>

<p>We also explore real-world case studies of production hurdles, including cascading failures, API misfires, and hallucination challenges. Alex shares concrete strategies for integrating LLMs into your pipelines while maintaining reliability and control.</p>

<p>Whether you&#39;re scaling agents or building LLM-powered systems, this episode offers practical insights for navigating the complex landscape of LLMOps in 2025.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://youtube.com/live/-8Gr9fVVX9g?feature=share" rel="nofollow">The podcast livestream on YouTube</a></li>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">The LLMOps database</a></li>
<li><a href="https://www.zenml.io/category/llmops" rel="nofollow">All blog posts about the database</a></li>
<li><a href="https://www.anthropic.com/research/building-effective-agents" rel="nofollow">Anthropic&#39;s Building effective agents essay</a></li>
<li><a href="https://www.linkedin.com/in/strickvl/" rel="nofollow">Alex on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Vanishing Gradients on Lu.ma</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex&#39;s extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn&#39;t—when deploying AI agents in production.</p>

<p>In this episode, we dive into:</p>

<ul>
<li>The current state of AI agents in production, from successes to common failure modes</li>
<li>Practical lessons learned from analyzing hundreds of real-world LLM deployments</li>
<li>How companies like Anthropic, Klarna, and Dropbox are using patterns like ReAct, RAG, and microservices to build reliable systems</li>
<li>The evolution of LLM capabilities, from expanding context windows to multimodal applications</li>
<li>Why most companies still prefer structured workflows over fully autonomous agents</li>
</ul>

<p>We also explore real-world case studies of production hurdles, including cascading failures, API misfires, and hallucination challenges. Alex shares concrete strategies for integrating LLMs into your pipelines while maintaining reliability and control.</p>

<p>Whether you&#39;re scaling agents or building LLM-powered systems, this episode offers practical insights for navigating the complex landscape of LLMOps in 2025.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://youtube.com/live/-8Gr9fVVX9g?feature=share" rel="nofollow">The podcast livestream on YouTube</a></li>
<li><a href="https://www.zenml.io/llmops-database" rel="nofollow">The LLMOps database</a></li>
<li><a href="https://www.zenml.io/category/llmops" rel="nofollow">All blog posts about the database</a></li>
<li><a href="https://www.anthropic.com/research/building-effective-agents" rel="nofollow">Anthropic&#39;s Building effective agents essay</a></li>
<li><a href="https://www.linkedin.com/in/strickvl/" rel="nofollow">Alex on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk" rel="nofollow">Vanishing Gradients on Lu.ma</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 42: Learning, Teaching, and Building in the Age of AI</title>
  <link>https://vanishinggradients.fireside.fm/42</link>
  <guid isPermaLink="false">6af2e172-b72b-418b-baa6-369299f37b8b</guid>
  <pubDate>Sat, 04 Jan 2025 14:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/6af2e172-b72b-418b-baa6-369299f37b8b.mp3" length="76860106" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>The tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications.</itunes:subtitle>
  <itunes:duration>1:20:03</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>In this episode of Vanishing Gradients, the tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications.
They dive into the realities of deploying LLM applications, overcoming “proof-of-concept purgatory,” and why first principles and iteration are critical for success in AI. Whether you’re an educator, software engineer, or data scientist, this episode offers valuable insights into the intersection of AI, product development, and real-world deployment.
LINKS
The podcast on YouTube (https://www.youtube.com/watch?v=BRIYytbqtP0)
The original podcast episode (https://learnbayesstats.com/episode/122-learning-and-teaching-in-the-age-of-ai-hugo-bowne-anderson)
Alex Andorra on LinkedIn (https://www.linkedin.com/in/alex-andorra/)
Hugo on LinkedIn (https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/)
Hugo on twitter (https://x.com/hugobowne)
Vanishing Gradients on twitter (https://x.com/vanishingdata)
Hugo's "Building LLM Applications for Data Scientists and Software Engineers" course (https://maven.com/s/course/d56067f338) 
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>In this episode of Vanishing Gradients, the tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications.</p>

<p>They dive into the realities of deploying LLM applications, overcoming “proof-of-concept purgatory,” and why first principles and iteration are critical for success in AI. Whether you’re an educator, software engineer, or data scientist, this episode offers valuable insights into the intersection of AI, product development, and real-world deployment.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/watch?v=BRIYytbqtP0" rel="nofollow">The podcast on YouTube</a></li>
<li><a href="https://learnbayesstats.com/episode/122-learning-and-teaching-in-the-age-of-ai-hugo-bowne-anderson" rel="nofollow">The original podcast episode</a></li>
<li><a href="https://www.linkedin.com/in/alex-andorra/" rel="nofollow">Alex Andorra on LinkedIn</a></li>
<li><a href="https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/" rel="nofollow">Hugo on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
<li><a href="https://maven.com/s/course/d56067f338" rel="nofollow">Hugo&#39;s &quot;Building LLM Applications for Data Scientists and Software Engineers&quot; course</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>In this episode of Vanishing Gradients, the tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications.</p>

<p>They dive into the realities of deploying LLM applications, overcoming “proof-of-concept purgatory,” and why first principles and iteration are critical for success in AI. Whether you’re an educator, software engineer, or data scientist, this episode offers valuable insights into the intersection of AI, product development, and real-world deployment.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/watch?v=BRIYytbqtP0" rel="nofollow">The podcast on YouTube</a></li>
<li><a href="https://learnbayesstats.com/episode/122-learning-and-teaching-in-the-age-of-ai-hugo-bowne-anderson" rel="nofollow">The original podcast episode</a></li>
<li><a href="https://www.linkedin.com/in/alex-andorra/" rel="nofollow">Alex Andorra on LinkedIn</a></li>
<li><a href="https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/" rel="nofollow">Hugo on LinkedIn</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
<li><a href="https://maven.com/s/course/d56067f338" rel="nofollow">Hugo&#39;s &quot;Building LLM Applications for Data Scientists and Software Engineers&quot; course</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 40: What Every LLM Developer Needs to Know About GPUs</title>
  <link>https://vanishinggradients.fireside.fm/40</link>
  <guid isPermaLink="false">b1b66484-5fd0-4bcb-91cb-8bf7201a5ded</guid>
  <pubDate>Tue, 24 Dec 2024 15:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/b1b66484-5fd0-4bcb-91cb-8bf7201a5ded.mp3" length="99441605" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you. Charles and Hugo dive into the practical side of GPUs, from running inference on large models to fine-tuning and even training from scratch.</itunes:subtitle>
  <itunes:duration>1:43:34</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you.  
Charles and Hugo dive into the practical side of GPUs—from running inference on large models, to fine-tuning and even training from scratch. They unpack the real pain points developers face, like figuring out:  
- How much VRAM you actually need.  
- Why memory—not compute—ends up being the bottleneck.  
- How to make quick, back-of-the-envelope calculations to size up hardware for your tasks.  
- And where things like fine-tuning, quantization, and retrieval-augmented generation (RAG) fit into the mix.  
One thing Hugo really appreciates is that Charles and the Modal team recently put together the GPU Glossary—a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.
🔧 Charles also does a demo during the episode—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.
This is the "Building LLM Applications for Data Scientists and Software Engineers" course that Hugo is teaching with Stefan Krawczyk (ex-StitchFix) in January (https://maven.com/s/course/d56067f338). Charles is giving a guest lecture on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).
LINKS
The livestream on YouTube (https://www.youtube.com/live/INryb8Hjk3c?si=0cbb0-Nxem1P987d)
The GPU Glossary (https://modal.com/gpu-glossary) by the Modal team
What We’ve Learned From A Year of Building with LLMs (https://applied-llms.org/) by Charles and friends
Charles on twitter (https://x.com/charles_irl)
Hugo on twitter (https://x.com/hugobowne)
Vanishing Gradients on twitter (https://x.com/vanishingdata)
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Hugo speaks with <strong>Charles Frye</strong>, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of <strong>hardware for LLMs and AI workflows</strong>, this episode is for you.  </p>

<p>Charles and Hugo dive into the <strong>practical side of GPUs</strong>—from <strong>running inference</strong> on large models, to <strong>fine-tuning</strong> and even <strong>training from scratch.</strong> They unpack the <strong>real pain points</strong> developers face, like figuring out:  </p>

<ul>
<li>How much VRAM you actually need.<br></li>
<li>Why memory—not compute—ends up being the bottleneck.<br></li>
<li>How to make quick, <strong>back-of-the-envelope calculations</strong> to size up hardware for your tasks.<br></li>
<li>And where things like <strong>fine-tuning, quantization, and retrieval-augmented generation (RAG)</strong> fit into the mix.<br></li>
</ul>

<p>One thing Hugo really appreciates is that Charles and the Modal team recently put together the <strong>GPU Glossary</strong>—a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.</p>

<p>🔧 <strong>Charles also does a demo during the episode</strong>—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.</p>

<p><a href="https://maven.com/s/course/d56067f338" rel="nofollow">This is the &quot;Building LLM Applications for Data Scientists and Software Engineers&quot; course that Hugo is teaching with Stefan Krawczyk (ex-StitchFix) in January</a>. Charles is giving a guest lecture on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/live/INryb8Hjk3c?si=0cbb0-Nxem1P987d" rel="nofollow">The livestream on YouTube</a></li>
<li><a href="https://modal.com/gpu-glossary" rel="nofollow">The GPU Glossary</a> by the Modal team</li>
<li><a href="https://applied-llms.org/" rel="nofollow">What We’ve Learned From A Year of Building with LLMs</a> by Charles and friends</li>
<li><a href="https://x.com/charles_irl" rel="nofollow">Charles on twitter</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Hugo speaks with <strong>Charles Frye</strong>, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of <strong>hardware for LLMs and AI workflows</strong>, this episode is for you.  </p>

<p>Charles and Hugo dive into the <strong>practical side of GPUs</strong>—from <strong>running inference</strong> on large models, to <strong>fine-tuning</strong> and even <strong>training from scratch.</strong> They unpack the <strong>real pain points</strong> developers face, like figuring out:  </p>

<ul>
<li>How much VRAM you actually need.<br></li>
<li>Why memory—not compute—ends up being the bottleneck.<br></li>
<li>How to make quick, <strong>back-of-the-envelope calculations</strong> to size up hardware for your tasks.<br></li>
<li>And where things like <strong>fine-tuning, quantization, and retrieval-augmented generation (RAG)</strong> fit into the mix.<br></li>
</ul>

<p>One thing Hugo really appreciates is that Charles and the Modal team recently put together the <strong>GPU Glossary</strong>—a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.</p>

<p>🔧 <strong>Charles also does a demo during the episode</strong>—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.</p>

<p><a href="https://maven.com/s/course/d56067f338" rel="nofollow">This is the &quot;Building LLM Applications for Data Scientists and Software Engineers&quot; course that Hugo is teaching with Stefan Krawczyk (ex-StitchFix) in January</a>. Charles is giving a guest lecture on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/live/INryb8Hjk3c?si=0cbb0-Nxem1P987d" rel="nofollow">The livestream on YouTube</a></li>
<li><a href="https://modal.com/gpu-glossary" rel="nofollow">The GPU Glossary</a> by the Modal team</li>
<li><a href="https://applied-llms.org/" rel="nofollow">What We’ve Learned From A Year of Building with LLMs</a> by Charles and friends</li>
<li><a href="https://x.com/charles_irl" rel="nofollow">Charles on twitter</a></li>
<li><a href="https://x.com/hugobowne" rel="nofollow">Hugo on twitter</a></li>
<li><a href="https://x.com/vanishingdata" rel="nofollow">Vanishing Gradients on twitter</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 39: From Models to Products: Bridging Research and Practice in Generative AI at Google Labs</title>
  <link>https://vanishinggradients.fireside.fm/39</link>
  <guid isPermaLink="false">bf5453c0-4aa2-4abb-b323-20334f787512</guid>
  <pubDate>Tue, 26 Nov 2024 03:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/bf5453c0-4aa2-4abb-b323-20334f787512.mp3" length="99346310" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>From building rockets at SpaceX to advancing generative AI at Google Labs, Ravin Kumar has carved a unique path through the world of technology. In this episode, we explore how to build scalable, reliable AI systems, the skills needed to work across the AI/ML pipeline, and the real-world impact of tools like open-weight models such as Gemma. Ravin also shares insights into designing AI tools like NotebookLM with the user journey at the forefront.</itunes:subtitle>
  <itunes:duration>1:43:28</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
<description>Hugo speaks with Ravin Kumar, Senior Research Data Scientist at Google Labs. Ravin’s career has taken him from building rockets at SpaceX to driving data science and technology at Sweetgreen, and now to advancing generative AI research and applications at Google Labs and DeepMind. His multidisciplinary experience gives him a rare perspective on building AI systems that combine technical rigor with practical utility.
In this episode, we dive into:
- Ravin’s fascinating career path, including the skills and mindsets needed to work effectively with AI and machine learning models at different stages of the pipeline.
- How to build generative AI systems that are scalable, reliable, and aligned with user needs.
- Real-world applications of generative AI, such as using open-weight models like Gemma to help a bakery streamline operations—an example of delivering tangible business value through AI.
- The critical role of UX in AI adoption, and how Ravin approaches designing tools like NotebookLM with the user journey in mind.
We also include a live demo where Ravin uses NotebookLM to analyze Hugo’s website, extract insights, and even generate a podcast-style conversation about him. While some of the demo is visual, much can be appreciated through audio, and we’ve added a link to the video in the show notes for those who want to see it in action. We’ve also included the generated segment at the end of the episode for you to enjoy.
LINKS
The livestream on YouTube (https://www.youtube.com/live/ffS6NWqoo_k)
Google Labs (https://labs.google/)
Ravin's GenAI Handbook (https://ravinkumar.com/GenAiGuidebook/book_intro.html)
Breadboard: A library for prototyping generative AI applications (https://breadboard-ai.github.io/breadboard/)
As mentioned in the episode, Hugo is teaching a four-week course, Building LLM Applications for Data Scientists and SWEs, co-led with Stefan Krawczyk (Dagworks, ex-StitchFix). The course focuses on building scalable, production-grade generative AI systems, with hands-on sessions, $1,000+ in cloud credits, live Q&amp;As, and guest lectures from industry experts.
Listeners of Vanishing Gradients can get 25% off the course using this special link (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=VG25) or by applying the code VG25 at checkout.
</description>
  <itunes:keywords>data science, machine learning, AI, LLMs</itunes:keywords>
  <content:encoded>
<![CDATA[<p>Hugo speaks with Ravin Kumar, Senior Research Data Scientist at Google Labs. Ravin’s career has taken him from building rockets at SpaceX to driving data science and technology at Sweetgreen, and now to advancing generative AI research and applications at Google Labs and DeepMind. His multidisciplinary experience gives him a rare perspective on building AI systems that combine technical rigor with practical utility.</p>

<p>In this episode, we dive into:</p>

<ul>
<li>Ravin’s fascinating career path, including the skills and mindsets needed to work effectively with AI and machine learning models at different stages of the pipeline.</li>
<li>How to build generative AI systems that are scalable, reliable, and aligned with user needs.</li>
<li>Real-world applications of generative AI, such as using open-weight models like Gemma to help a bakery streamline operations—an example of delivering tangible business value through AI.</li>
<li>The critical role of UX in AI adoption, and how Ravin approaches designing tools like NotebookLM with the user journey in mind.</li>
</ul>

<p>We also include a live demo where Ravin uses NotebookLM to analyze Hugo’s website, extract insights, and even generate a podcast-style conversation about him. While some of the demo is visual, much can be appreciated through audio, and we’ve added a link to the video in the show notes for those who want to see it in action. We’ve also included the generated segment at the end of the episode for you to enjoy.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/live/ffS6NWqoo_k" rel="nofollow">The livestream on YouTube</a></li>
<li><a href="https://labs.google/" rel="nofollow">Google Labs</a></li>
<li><a href="https://ravinkumar.com/GenAiGuidebook/book_intro.html" rel="nofollow">Ravin&#39;s GenAI Handbook</a></li>
<li><a href="https://breadboard-ai.github.io/breadboard/" rel="nofollow">Breadboard: A library for prototyping generative AI applications</a></li>
</ul>

<p>As mentioned in the episode, Hugo is teaching a four-week course, <strong>Building LLM Applications for Data Scientists and SWEs</strong>, co-led with Stefan Krawczyk (Dagworks, ex-StitchFix). The course focuses on building scalable, production-grade generative AI systems, with hands-on sessions, $1,000+ in cloud credits, live Q&amp;As, and guest lectures from industry experts.</p>

<p>Listeners of Vanishing Gradients can get 25% off the course using <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=VG25" rel="nofollow">this special link</a> or by applying the code VG25 at checkout.</p>]]>
  </content:encoded>
  <itunes:summary>
<![CDATA[<p>Hugo speaks with Ravin Kumar, Senior Research Data Scientist at Google Labs. Ravin’s career has taken him from building rockets at SpaceX to driving data science and technology at Sweetgreen, and now to advancing generative AI research and applications at Google Labs and DeepMind. His multidisciplinary experience gives him a rare perspective on building AI systems that combine technical rigor with practical utility.</p>

<p>In this episode, we dive into:<br>
    • Ravin’s fascinating career path, including the skills and mindsets needed to work effectively with AI and machine learning models at different stages of the pipeline.<br>
    • How to build generative AI systems that are scalable, reliable, and aligned with user needs.<br>
    • Real-world applications of generative AI, such as using open weight models such as Gemma to help a bakery streamline operations—an example of delivering tangible business value through AI.<br>
    • The critical role of UX in AI adoption, and how Ravin approaches designing tools like Notebook LM with the user journey in mind.</p>

<p>We also include a live demo where Ravin uses Notebook LM to analyze my website, extract insights, and even generate a podcast-style conversation about me. While some of the demo is visual, much can be appreciated through audio, and we’ve added a link to the video in the show notes for those who want to see it in action. We’ve also included the generated segment at the end of the episode for you to enjoy.</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://www.youtube.com/live/ffS6NWqoo_k" rel="nofollow">The livestream on YouTube</a></li>
<li><a href="https://labs.google/" rel="nofollow">Google Labs</a></li>
<li><a href="https://ravinkumar.com/GenAiGuidebook/book_intro.html" rel="nofollow">Ravin&#39;s GenAI Handbook</a></li>
<li><a href="https://breadboard-ai.github.io/breadboard/" rel="nofollow">Breadboard: A library for prototyping generative AI applications</a></li>
</ul>

<p>As mentioned in the episode, Hugo is teaching a four-week course, <strong>Building LLM Applications for Data Scientists and SWEs</strong>, co-led with Stefan Krawczyk (Dagworks, ex-StitchFix). The course focuses on building scalable, production-grade generative AI systems, with hands-on sessions, $1,000+ in cloud credits, live Q&amp;As, and guest lectures from industry experts.</p>

<p>Listeners of Vanishing Gradients can get 25% off the course using <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?promoCode=VG25" rel="nofollow">this special link</a> or by applying the code VG25 at checkout.</p>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 18: Research Data Science in Biotech</title>
  <link>https://vanishinggradients.fireside.fm/18</link>
  <guid isPermaLink="false">83afeb64-21ec-4828-bf96-75a08c710391</guid>
  <pubDate>Thu, 25 May 2023 08:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/83afeb64-21ec-4828-bf96-75a08c710391.mp3" length="69807439" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Machine learning, deep learning, Bayesian inference for drug discovery, OSS, and accelerating discovery science to the speed of thought!</itunes:subtitle>
  <itunes:duration>1:12:42</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Hugo speaks with Eric Ma about Research Data Science in Biotech. Eric leads the Research team in the Data Science and Artificial Intelligence group at Moderna Therapeutics. Prior to that, he was part of a special ops data science team at the Novartis Institutes for Biomedical Research's Informatics department.
In this episode, Hugo and Eric talk about
  What tools and techniques they use for drug discovery (such as mRNA vaccines and medicines);
  The importance of machine learning, deep learning, and Bayesian inference;
  How to think more generally about such high-dimensional, multi-objective optimization problems;
  The importance of open-source software and Python;
  Institutional and cultural questions, including hiring and the trade-offs between being an individual contributor and a manager;
  How they’re approaching accelerating discovery science to the speed of thought using computation, data science, statistics, and ML.
And as always, much, much more!
LINKS
Eric's website (https://ericmjl.github.io/)
Eric on twitter (https://twitter.com/ericmjl)
Vanishing Gradients on YouTube (https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA)
Cell Biology by the Numbers by Ron Milo and Rob Phillips (http://book.bionumbers.org/)
Eric's JAX tutorials at PyCon (https://youtu.be/ztthQJQFe20) and SciPy (https://youtu.be/DmR36wtel4Y)
Eric's blog post on Hiring data scientists at Moderna! (https://ericmjl.github.io/blog/2021/8/26/hiring-data-scientists-at-moderna-2021/) 
</description>
  <itunes:keywords>machine learning, AI, data science, open source, python, biotech</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Hugo speaks with Eric Ma about Research Data Science in Biotech. Eric leads the Research team in the Data Science and Artificial Intelligence group at Moderna Therapeutics. Prior to that, he was part of a special ops data science team at the Novartis Institutes for Biomedical Research&#39;s Informatics department.</p>

<p>In this episode, Hugo and Eric talk about</p>

<ul>
<li>  What tools and techniques they use for drug discovery (such as mRNA vaccines and medicines);</li>
<li>  The importance of machine learning, deep learning, and Bayesian inference;</li>
<li>  How to think more generally about such high-dimensional, multi-objective optimization problems;</li>
<li>  The importance of open-source software and Python;</li>
<li>  Institutional and cultural questions, including hiring and the trade-offs between being an individual contributor and a manager;</li>
<li>  How they’re approaching accelerating discovery science to the speed of thought using computation, data science, statistics, and ML.</li>
</ul>

<p>And as always, much, much more!</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://ericmjl.github.io/" rel="nofollow">Eric&#39;s website</a></li>
<li><a href="https://twitter.com/ericmjl" rel="nofollow">Eric on twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="http://book.bionumbers.org/" rel="nofollow">Cell Biology by the Numbers by Ron Milo and Rob Phillips</a></li>
<li>Eric&#39;s JAX tutorials at <a href="https://youtu.be/ztthQJQFe20" rel="nofollow">PyCon</a> and <a href="https://youtu.be/DmR36wtel4Y" rel="nofollow">SciPy</a></li>
<li>Eric&#39;s blog post on <a href="https://ericmjl.github.io/blog/2021/8/26/hiring-data-scientists-at-moderna-2021/" rel="nofollow">Hiring data scientists at Moderna!</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Hugo speaks with Eric Ma about Research Data Science in Biotech. Eric leads the Research team in the Data Science and Artificial Intelligence group at Moderna Therapeutics. Prior to that, he was part of a special ops data science team at the Novartis Institutes for Biomedical Research&#39;s Informatics department.</p>

<p>In this episode, Hugo and Eric talk about</p>

<ul>
<li>  What tools and techniques they use for drug discovery (such as mRNA vaccines and medicines);</li>
<li>  The importance of machine learning, deep learning, and Bayesian inference;</li>
<li>  How to think more generally about such high-dimensional, multi-objective optimization problems;</li>
<li>  The importance of open-source software and Python;</li>
<li>  Institutional and cultural questions, including hiring and the trade-offs between being an individual contributor and a manager;</li>
<li>  How they’re approaching accelerating discovery science to the speed of thought using computation, data science, statistics, and ML.</li>
</ul>

<p>And as always, much, much more!</p>

<p><strong>LINKS</strong></p>

<ul>
<li><a href="https://ericmjl.github.io/" rel="nofollow">Eric&#39;s website</a></li>
<li><a href="https://twitter.com/ericmjl" rel="nofollow">Eric on twitter</a></li>
<li><a href="https://www.youtube.com/channel/UC_NafIo-Ku2loOLrzm45ABA" rel="nofollow">Vanishing Gradients on YouTube</a></li>
<li><a href="http://book.bionumbers.org/" rel="nofollow">Cell Biology by the Numbers by Ron Milo and Rob Phillips</a></li>
<li>Eric&#39;s JAX tutorials at <a href="https://youtu.be/ztthQJQFe20" rel="nofollow">PyCon</a> and <a href="https://youtu.be/DmR36wtel4Y" rel="nofollow">SciPy</a></li>
<li>Eric&#39;s blog post on <a href="https://ericmjl.github.io/blog/2021/8/26/hiring-data-scientists-at-moderna-2021/" rel="nofollow">Hiring data scientists at Moderna!</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 7: The Evolution of Python for Data Science</title>
  <link>https://vanishinggradients.fireside.fm/7</link>
  <guid isPermaLink="false">da4fab18-c5fa-460d-9ddf-0c8f1e60f3f8</guid>
  <pubDate>Mon, 02 May 2022 06:00:00 +1000</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/da4fab18-c5fa-460d-9ddf-0c8f1e60f3f8.mp3" length="60022178" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo speaks with Peter Wang, CEO of Anaconda, about how Python became so big in data science, machine learning, and AI. They jump into many of the technical and sociological beginnings of Python being used for data science, a history of PyData, the conda distribution, and NUMFOCUS.
</itunes:subtitle>
  <itunes:duration>1:02:31</itunes:duration>
  <itunes:explicit>yes</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Hugo speaks with Peter Wang, CEO of Anaconda, about how Python became so big in data science, machine learning, and AI. They jump into many of the technical and sociological beginnings of Python being used for data science, a history of PyData, the conda distribution, and NUMFOCUS.
They also talk about the emergence of online collaborative environments, particularly with respect to open source, and attempt to figure out the moving parts of PyData and why it has had the impact it has, including the fact that many core developers were not computer scientists or software engineers, but rather scientists and researchers building tools that they needed on an as-needed basis.
They also discuss the challenges in getting adoption for Python and the things that the PyData stack solves, those that it doesn’t and what progress is being made there.
People who have listened to Hugo's podcast for some time may have recognized that he's interested in the sociology of the data science space, and he considered speaking with Peter a fascinating opportunity to delve into how the Pythonic data science space evolved, particularly with respect to tooling, not only because Peter had a front-row seat for much of it, but also because he was one of several key actors at various points. On top of this, Hugo wanted to allow Peter’s inner sociologist room to breathe and evolve in this conversation.
What happens then is slightly experimental – Peter is a deep, broad, and occasionally hallucinatory thinker, and Hugo wanted to explore new spaces with him, so we hope you enjoy the experiments they play as they begin to discuss open-source software in the broader context of finite and infinite games, and how OSS is a paradigm of humanity’s ability to create generative, nourishing, and anti-rivalrous systems where, by anti-rivalrous, we mean things that become more valuable for everyone the more people use them! But we need to be mindful of finite-game dynamics (for example, those driven by corporate incentives) co-opting and parasitizing the generative systems that we build.
These are all considerations they delve far deeper into in Part 2 of this interview, which will be the next episode of VG, where we also dive into the relationship between OSS, tools, and venture capital, among many other things.
Links
Peter on twitter (https://twitter.com/pwang)
Anaconda Nucleus (https://anaconda.cloud/)
Calling out SciPy on diversity (even though it hurts) (https://ilovesymposia.com/2015/04/03/calling-out-scipy-on-diversity/) by Juan Nunez-Iglesias
Here Comes Everybody: The Power of Organizing Without Organizations (https://en.wikipedia.org/wiki/Here_Comes_Everybody_(book)) by Clay Shirky
Finite and Infinite Games (https://en.wikipedia.org/wiki/Finite_and_Infinite_Games) by James Carse
Governing the Commons: The Evolution of Institutions for Collective Action (https://www.cambridge.org/core/books/governing-the-commons/7AB7AE11BADA84409C34815CC288CD79) by Elinor Ostrom
Elinor Ostrom's 8 Principles for Managing A Commons (https://www.onthecommons.org/magazine/elinor-ostroms-8-principles-managing-commmons)
</description>
  <itunes:keywords>oss, data science, machine learning, python</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Hugo speaks with Peter Wang, CEO of Anaconda, about how Python became so big in data science, machine learning, and AI. They jump into many of the technical and sociological beginnings of Python being used for data science, a history of PyData, the conda distribution, and NUMFOCUS.</p>

<p>They also talk about the emergence of online collaborative environments, particularly with respect to open source, and attempt to figure out the moving parts of PyData and why it has had the impact it has, including the fact that many core developers were not computer scientists or software engineers, but rather scientists and researchers building tools that they needed on an as-needed basis.</p>

<p>They also discuss the challenges in getting adoption for Python and the things that the PyData stack solves, those that it doesn’t and what progress is being made there.</p>

<p>People who have listened to Hugo&#39;s podcast for some time may have recognized that he&#39;s interested in the sociology of the data science space, and he considered speaking with Peter a fascinating opportunity to delve into how the Pythonic data science space evolved, particularly with respect to tooling, not only because Peter had a front-row seat for much of it, but also because he was one of several key actors at various points. On top of this, Hugo wanted to allow Peter’s inner sociologist room to breathe and evolve in this conversation.</p>

<p>What happens then is slightly experimental – Peter is a deep, broad, and occasionally hallucinatory thinker, and Hugo wanted to explore new spaces with him, so we hope you enjoy the experiments they play as they begin to discuss open-source software in the broader context of finite and infinite games, and how OSS is a paradigm of humanity’s ability to create generative, nourishing, and anti-rivalrous systems where, by anti-rivalrous, we mean things that become more valuable for everyone the more people use them! But we need to be mindful of finite-game dynamics (for example, those driven by corporate incentives) co-opting and parasitizing the generative systems that we build.</p>

<p>These are all considerations they delve far deeper into in Part 2 of this interview, which will be the next episode of VG, where we also dive into the relationship between OSS, tools, and venture capital, among many other things.</p>

<p><strong>Links</strong></p>

<ul>
<li><a href="https://twitter.com/pwang" rel="nofollow">Peter on twitter</a></li>
<li><a href="https://anaconda.cloud/" rel="nofollow">Anaconda Nucleus</a></li>
<li><a href="https://ilovesymposia.com/2015/04/03/calling-out-scipy-on-diversity/" rel="nofollow">Calling out SciPy on diversity (even though it hurts)</a> by Juan Nunez-Iglesias</li>
<li><a href="https://en.wikipedia.org/wiki/Here_Comes_Everybody_(book)" rel="nofollow">Here Comes Everybody: The Power of Organizing Without Organizations</a> by Clay Shirky</li>
<li><a href="https://en.wikipedia.org/wiki/Finite_and_Infinite_Games" rel="nofollow">Finite and Infinite Games</a> by James Carse</li>
<li><a href="https://www.cambridge.org/core/books/governing-the-commons/7AB7AE11BADA84409C34815CC288CD79" rel="nofollow">Governing the Commons: The Evolution of Institutions for Collective Action</a> by Elinor Ostrom</li>
<li><a href="https://www.onthecommons.org/magazine/elinor-ostroms-8-principles-managing-commmons" rel="nofollow">Elinor Ostrom&#39;s 8 Principles for Managing A Commons</a></li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Hugo speaks with Peter Wang, CEO of Anaconda, about how Python became so big in data science, machine learning, and AI. They jump into many of the technical and sociological beginnings of Python being used for data science, a history of PyData, the conda distribution, and NUMFOCUS.</p>

<p>They also talk about the emergence of online collaborative environments, particularly with respect to open source, and attempt to figure out the moving parts of PyData and why it has had the impact it has, including the fact that many core developers were not computer scientists or software engineers, but rather scientists and researchers building tools that they needed on an as-needed basis.</p>

<p>They also discuss the challenges in getting adoption for Python and the things that the PyData stack solves, those that it doesn’t and what progress is being made there.</p>

<p>People who have listened to Hugo&#39;s podcast for some time may have recognized that he&#39;s interested in the sociology of the data science space, and he considered speaking with Peter a fascinating opportunity to delve into how the Pythonic data science space evolved, particularly with respect to tooling, not only because Peter had a front-row seat for much of it, but also because he was one of several key actors at various points. On top of this, Hugo wanted to allow Peter’s inner sociologist room to breathe and evolve in this conversation.</p>

<p>What happens then is slightly experimental – Peter is a deep, broad, and occasionally hallucinatory thinker, and Hugo wanted to explore new spaces with him, so we hope you enjoy the experiments they play as they begin to discuss open-source software in the broader context of finite and infinite games, and how OSS is a paradigm of humanity’s ability to create generative, nourishing, and anti-rivalrous systems where, by anti-rivalrous, we mean things that become more valuable for everyone the more people use them! But we need to be mindful of finite-game dynamics (for example, those driven by corporate incentives) co-opting and parasitizing the generative systems that we build.</p>

<p>These are all considerations they delve far deeper into in Part 2 of this interview, which will be the next episode of VG, where we also dive into the relationship between OSS, tools, and venture capital, among many other things.</p>

<p><strong>Links</strong></p>

<ul>
<li><a href="https://twitter.com/pwang" rel="nofollow">Peter on twitter</a></li>
<li><a href="https://anaconda.cloud/" rel="nofollow">Anaconda Nucleus</a></li>
<li><a href="https://ilovesymposia.com/2015/04/03/calling-out-scipy-on-diversity/" rel="nofollow">Calling out SciPy on diversity (even though it hurts)</a> by Juan Nunez-Iglesias</li>
<li><a href="https://en.wikipedia.org/wiki/Here_Comes_Everybody_(book)" rel="nofollow">Here Comes Everybody: The Power of Organizing Without Organizations</a> by Clay Shirky</li>
<li><a href="https://en.wikipedia.org/wiki/Finite_and_Infinite_Games" rel="nofollow">Finite and Infinite Games</a> by James Carse</li>
<li><a href="https://www.cambridge.org/core/books/governing-the-commons/7AB7AE11BADA84409C34815CC288CD79" rel="nofollow">Governing the Commons: The Evolution of Institutions for Collective Action</a> by Elinor Ostrom</li>
<li><a href="https://www.onthecommons.org/magazine/elinor-ostroms-8-principles-managing-commmons" rel="nofollow">Elinor Ostrom&#39;s 8 Principles for Managing A Commons</a></li>
</ul>]]>
  </itunes:summary>
</item>
<item>
  <title>Episode 5: Executive Data Science</title>
  <link>https://vanishinggradients.fireside.fm/5</link>
  <guid isPermaLink="false">9078010f-454b-4bcf-bafc-f54f44e04868</guid>
  <pubDate>Wed, 23 Mar 2022 16:00:00 +1100</pubDate>
  <author>Hugo Bowne-Anderson</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/140c3904-8258-4c39-a698-a112b7077bd7/9078010f-454b-4bcf-bafc-f54f44e04868.mp3" length="103917601" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>1</itunes:season>
  <itunes:author>Hugo Bowne-Anderson</itunes:author>
  <itunes:subtitle>Hugo speaks with Jim Savage, the Director of Data Science at Schmidt Futures, about the need for data science in executive training and decision-making, what data scientists can learn from economists, the perils of "data for good", and why you should always be integrating your loss function over your posterior.</itunes:subtitle>
  <itunes:duration>1:48:14</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/1/140c3904-8258-4c39-a698-a112b7077bd7/cover.jpg?v=1"/>
  <description>Hugo speaks with Jim Savage, the Director of Data Science at Schmidt Futures, about the need for data science in executive training and decision-making, what data scientists can learn from economists, the perils of "data for good", and why you should always be integrating your loss function over your posterior.
Jim and Hugo talk about what data science is and isn’t capable of, what can actually deliver value, and what people really enjoy doing: the intersection in this Venn diagram is where we need to focus energy and it may not be quite what you think it is!
They then dive into Jim's thoughts on what he dubs Executive Data Science. You may be aware of the slicing of the data science and machine learning spaces into descriptive analytics, predictive analytics, and prescriptive analytics but, being the thought surgeon that he is, Jim proposes a different slicing into 
(1) tool building OR data science as a product, 
(2) tools to automate and augment parts of us, and 
(3) what Jim calls Executive Data Science.
Jim and Hugo also talk about decision theory, the woeful state of causal inference techniques in contemporary data science, and what techniques it would behoove us all to import from econometrics and economics more generally. If that’s not enough, they talk about the importance of thinking through the data generating process and things that can go wrong if you don’t. In terms of allowing your data work to inform your decision making, they also discuss Jim’s maxim “ALWAYS BE INTEGRATING YOUR LOSS FUNCTION OVER YOUR POSTERIOR”.
Last but definitely not least, as Jim has worked in the data for good space for much of his career, they talk about what this actually means, with particular reference to fast.ai founder &amp; QUT professor of practice Rachel Thomas’ blog post called “Doing Data Science for Social Good, Responsibly” (https://www.fast.ai/2021/11/23/data-for-good/). Rachel’s post takes as its starting point the following words of Sara Hooker, a researcher at Google Brain:
"Data for good" is an imprecise term that says little about who we serve, the tools used, or the goals. Being more precise can help us be more accountable &amp; have a greater positive impact.
And Jim and I discuss his work in the light of these foundational considerations.
Links
Jim on twitter (https://twitter.com/abiylfoyp/)
What Is Causal Inference? An Introduction for Data Scientists (https://www.oreilly.com/radar/what-is-causal-inference/) by Hugo Bowne-Anderson and Mike Loukides
 Jim's must-watch Data Council talk on Productizing Structural Models (https://www.datacouncil.ai/talks/productizing-structural-models)
Mastering Metrics (https://www.masteringmetrics.com/) by Angrist and Pischke
 Mostly Harmless Econometrics: An Empiricist's Companion (https://press.princeton.edu/books/paperback/9780691120355/mostly-harmless-econometrics) by Angrist and Pischke
 The Book of Why (https://en.wikipedia.org/wiki/The_Book_of_Why) by Judea Pearl
Decision-Making in a Time of Crisis (https://www.oreilly.com/radar/decision-making-in-a-time-of-crisis/) by Hugo Bowne-Anderson
Doing Data Science for Social Good, Responsibly (https://www.fast.ai/2021/11/23/data-for-good/) by Rachel Thomas
</description>
  <itunes:keywords>data science, executive, machine learning, economics, AI</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>Hugo speaks with Jim Savage, the Director of Data Science at Schmidt Futures, about the need for data science in executive training and decision-making, what data scientists can learn from economists, the perils of &quot;data for good&quot;, and why you should always be integrating your loss function over your posterior.</p>

<p>Jim and Hugo talk about what data science is and isn’t capable of, what can actually deliver value, and what people really enjoy doing: the intersection in this Venn diagram is where we need to focus energy and it may not be quite what you think it is!</p>

<p>They then dive into Jim&#39;s thoughts on what he dubs Executive Data Science. You may be aware of the slicing of the data science and machine learning spaces into descriptive analytics, predictive analytics, and prescriptive analytics but, being the thought surgeon that he is, Jim proposes a different slicing into </p>

<p>(1) tool building OR data science as a product, </p>

<p>(2) tools to automate and augment parts of us, and </p>

<p>(3) what Jim calls Executive Data Science.</p>

<p>Jim and Hugo also talk about decision theory, the woeful state of causal inference techniques in contemporary data science, and what techniques it would behoove us all to import from econometrics and economics more generally. If that’s not enough, they talk about the importance of thinking through the data generating process and things that can go wrong if you don’t. In terms of allowing your data work to inform your decision making, they also discuss Jim’s maxim “ALWAYS BE INTEGRATING YOUR LOSS FUNCTION OVER YOUR POSTERIOR”.</p>

<p>Last but definitely not least, as Jim has worked in the data for good space for much of his career, they talk about what this actually means, with particular reference to fast.ai founder &amp; QUT professor of practice Rachel Thomas’ blog post called <a href="https://www.fast.ai/2021/11/23/data-for-good/" rel="nofollow">“Doing Data Science for Social Good, Responsibly”</a>. Rachel’s post takes as its starting point the following words of Sara Hooker, a researcher at Google Brain:</p>

<blockquote>
<p>&quot;Data for good&quot; is an imprecise term that says little about who we serve, the tools used, or the goals. Being more precise can help us be more accountable &amp; have a greater positive impact.</p>
</blockquote>

<p>And Jim and I discuss his work in the light of these foundational considerations.</p>

<p><strong>Links</strong></p>

<ul>
<li><a href="https://twitter.com/abiylfoyp/" rel="nofollow">Jim on twitter</a></li>
<li><a href="https://www.oreilly.com/radar/what-is-causal-inference/" rel="nofollow">What Is Causal Inference? An Introduction for Data Scientists</a> by Hugo Bowne-Anderson and Mike Loukides</li>
<li> Jim&#39;s must-watch Data Council talk on <a href="https://www.datacouncil.ai/talks/productizing-structural-models" rel="nofollow">Productizing Structural Models</a></li>
<li><a href="https://www.masteringmetrics.com/" rel="nofollow">Mastering Metrics</a> by Angrist and Pischke</li>
<li> <a href="https://press.princeton.edu/books/paperback/9780691120355/mostly-harmless-econometrics" rel="nofollow">Mostly Harmless Econometrics: An Empiricist&#39;s Companion</a> by Angrist and Pischke</li>
<li> <a href="https://en.wikipedia.org/wiki/The_Book_of_Why" rel="nofollow">The Book of Why</a> by Judea Pearl</li>
<li><a href="https://www.oreilly.com/radar/decision-making-in-a-time-of-crisis/" rel="nofollow">Decision-Making in a Time of Crisis</a> by Hugo Bowne-Anderson</li>
<li><a href="https://www.fast.ai/2021/11/23/data-for-good/" rel="nofollow">Doing Data Science for Social Good, Responsibly</a> by Rachel Thomas</li>
</ul>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>Hugo speaks with Jim Savage, the Director of Data Science at Schmidt Futures, about the need for data science in executive training and decision-making, what data scientists can learn from economists, the perils of &quot;data for good&quot;, and why you should always be integrating your loss function over your posterior.</p>

<p>Jim and Hugo talk about what data science is and isn’t capable of, what can actually deliver value, and what people really enjoy doing: the intersection in this Venn diagram is where we need to focus energy and it may not be quite what you think it is!</p>

<p>They then dive into Jim&#39;s thoughts on what he dubs Executive Data Science. You may be aware of the slicing of the data science and machine learning spaces into descriptive analytics, predictive analytics, and prescriptive analytics but, being the thought surgeon that he is, Jim proposes a different slicing into </p>

<p>(1) tool building OR data science as a product, </p>

<p>(2) tools to automate and augment parts of us, and </p>

<p>(3) what Jim calls Executive Data Science.</p>

<p>Jim and Hugo also talk about decision theory, the woeful state of causal inference techniques in contemporary data science, and what techniques it would behoove us all to import from econometrics and economics more generally. If that’s not enough, they talk about the importance of thinking through the data generating process and things that can go wrong if you don’t. In terms of allowing your data work to inform your decision making, they also discuss Jim’s maxim “ALWAYS BE INTEGRATING YOUR LOSS FUNCTION OVER YOUR POSTERIOR”.</p>

<p>Last but definitely not least, as Jim has worked in the data for good space for much of his career, they talk about what this actually means, with particular reference to fast.ai founder &amp; QUT professor of practice Rachel Thomas’ blog post called <a href="https://www.fast.ai/2021/11/23/data-for-good/" rel="nofollow">“Doing Data Science for Social Good, Responsibly”</a>. Rachel’s post takes as its starting point the following words of Sara Hooker, a researcher at Google Brain:</p>

<blockquote>
<p>&quot;Data for good&quot; is an imprecise term that says little about who we serve, the tools used, or the goals. Being more precise can help us be more accountable &amp; have a greater positive impact.</p>
</blockquote>

<p>And Jim and I discuss his work in the light of these foundational considerations.</p>

<p><strong>Links</strong></p>

<ul>
<li><a href="https://twitter.com/abiylfoyp/" rel="nofollow">Jim on twitter</a></li>
<li><a href="https://www.oreilly.com/radar/what-is-causal-inference/" rel="nofollow">What Is Causal Inference? An Introduction for Data Scientists</a> by Hugo Bowne-Anderson and Mike Loukides</li>
<li> Jim&#39;s must-watch Data Council talk on <a href="https://www.datacouncil.ai/talks/productizing-structural-models" rel="nofollow">Productizing Structural Models</a></li>
<li><a href="https://www.masteringmetrics.com/" rel="nofollow">Mastering Metrics</a> by Angrist and Pischke</li>
<li> <a href="https://press.princeton.edu/books/paperback/9780691120355/mostly-harmless-econometrics" rel="nofollow">Mostly Harmless Econometrics: An Empiricist&#39;s Companion</a> by Angrist and Pischke</li>
<li> <a href="https://en.wikipedia.org/wiki/The_Book_of_Why" rel="nofollow">The Book of Why</a> by Judea Pearl</li>
<li><a href="https://www.oreilly.com/radar/decision-making-in-a-time-of-crisis/" rel="nofollow">Decision-Making in a Time of Crisis</a> by Hugo Bowne-Anderson</li>
<li><a href="https://www.fast.ai/2021/11/23/data-for-good/" rel="nofollow">Doing Data Science for Social Good, Responsibly</a> by Rachel Thomas</li>
</ul>]]>
  </itunes:summary>
</item>
  </channel>
</rss>
