The MAD Podcast with Matt Turck
Matt Turck
103 episodes
3 days ago
The MAD Podcast with Matt Turck is a series of conversations with leaders from across the Machine Learning, AI & Data landscape, hosted by Matt Turck, a leading AI & data investor and Partner at FirstMark Capital.
Technology
DeepMind Gemini 3 Lead: What Comes After "Infinite Data"
The MAD Podcast with Matt Turck
54 minutes 56 seconds
2 weeks ago

Gemini 3 was a landmark frontier-model launch in AI this year, but the story behind its performance isn't just about adding more compute. In this episode, I sit down with Sebastian Borgeaud, a pre-training lead for Gemini 3 at Google DeepMind and co-author of the seminal RETRO paper. In his first-ever podcast interview, Sebastian takes us inside the lab mindset behind Google's most powerful model: what actually changed, and why the real work today is no longer "training a model" but building a full system.


We unpack the “secret recipe” idea — the notion that big leaps come from better pre-training and better post-training — and use it to explore a deeper shift in the industry: moving from an “infinite data” era to a data-limited regime, where curation, proxies, and measurement matter as much as web-scale volume. Sebastian explains why scaling laws aren’t dead, but evolving, why evals have become one of the hardest and most underrated problems (including benchmark contamination), and why frontier research is increasingly a full-stack discipline that spans data, infrastructure, and engineering as much as algorithms.
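
For a concrete feel of that "data-limited regime" point, here is a toy numerical sketch (ours, not anything from the episode or from DeepMind). It plugs numbers into the parametric loss fit published in the Chinchilla paper (Hoffmann et al., 2022), which the conversation references later; the token budgets chosen below are illustrative assumptions only.

```python
# Toy illustration of a data-limited regime, using the parametric loss
# L(N, D) = E + A / N^alpha + B / D^beta from the Chinchilla paper
# (Hoffmann et al., 2022). The constants are the published fit; the
# token budgets are made-up numbers for illustration.

def loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7        # fitted constants from the paper
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

for n in [1e9, 1e10, 1e11, 1e12]:        # model sizes (parameters)
    scaled = loss(n, n_tokens=20 * n)    # "infinite data": tokens grow with N
    capped = loss(n, n_tokens=1e12)      # data-limited: tokens fixed at 1T
    print(f"N={n:.0e}  scaled-data loss={scaled:.3f}  capped-data loss={capped:.3f}")
```

Once the token count stops growing, the data term floors the loss and scaling parameters alone buys less and less, which is exactly why curation and measurement start to matter as much as volume.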


From the intuition behind Deep Think, to the rise (and risks) of synthetic data loops, to the future of long-context and retrieval, this is a technical deep dive into the physics of frontier AI. We also get into continual learning — what it would take for models to keep updating with new knowledge over time, whether via tools, expanding context, or new training paradigms — and what that implies for where foundation models are headed next. If you want a grounded view of pre-training in late 2025 beyond the marketing layer, this conversation is a blueprint.
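
As a companion to the retrieval discussion, here is a minimal, self-contained sketch of the retrieval step in a RETRO-style setup: chunk a corpus, score each chunk against the current context, and hand the nearest neighbours to the model to condition on. It is an illustration, not DeepMind's implementation; real systems embed chunks with a frozen neural encoder and query an approximate nearest-neighbour index, and the bag-of-words similarity and tiny corpus below are stand-ins.

```python
# Minimal sketch of RETRO-style retrieval: fetch the nearest-neighbour
# chunks for a query context. A real system would use a frozen neural
# encoder and an ANN index; the term-count "embedding" here is a toy.
from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())          # toy embedding: term counts

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

corpus = [
    "Chinchilla showed compute-optimal training balances params and tokens.",
    "RETRO conditions a language model on retrieved neighbour chunks.",
    "Mixture-of-experts routes each token to a subset of expert MLPs.",
]
# In RETRO, the retrieved chunks are cross-attended to during decoding.
print(top_k("how does retrieval help a language model", corpus))
```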


Google DeepMind

Website - https://deepmind.google

X/Twitter - https://x.com/GoogleDeepMind


Sebastian Borgeaud

LinkedIn - https://www.linkedin.com/in/sebastian-borgeaud-8648a5aa/

X/Twitter - https://x.com/borgeaud_s


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) – Cold intro: “We’re ahead of schedule” + AI is now a system

(00:58) – Oriol’s “secret recipe”: better pre- + post-training

(02:09) – Why AI progress still isn’t slowing down

(03:04) – Are models actually getting smarter?

(04:36) – Two–three years out: what changes first?

(06:34) – AI doing AI research: faster, not automated

(07:45) – Frontier labs: same playbook or different bets?

(10:19) – Post-transformers: will a disruption happen?

(10:51) – DeepMind’s advantage: research × engineering × infra

(12:26) – What a Gemini 3 pre-training lead actually does

(13:59) – From Europe to Cambridge to DeepMind

(18:06) – Why he left RL for real-world data

(20:05) – From Gopher to Chinchilla to RETRO (and why it matters)

(20:28) – “Research taste”: integrate or slow everyone down

(23:00) – Fixes vs moonshots: how they balance the pipeline

(24:37) – Research vs product pressure (and org structure)

(26:24) – Gemini 3 under the hood: MoE in plain English

(28:30) – Native multimodality: the hidden costs

(30:03) – Scaling laws aren’t dead (but scale isn’t everything)

(33:07) – Synthetic data: powerful, dangerous

(35:00) – Reasoning traces: what he can’t say (and why)

(37:18) – Long context + attention: what’s next

(38:40) – Retrieval vs RAG vs long context

(41:49) – The real boss fight: evals (and contamination)

(42:28) – Alignment: pre-training vs post-training

(43:32) – Deep Think + agents + “vibe coding”

(46:34) – Continual learning: updating models over time

(49:35) – Advice for researchers + founders

(53:35) – “No end in sight” for progress + closing
