The Information Bottleneck
Ravid Shwartz-Ziv & Allen Roush
20 episodes
5 days ago
Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.
Technology
Science
Episodes (20/20)
The Information Bottleneck
EP20: Yann LeCun

Yann LeCun – Why LLMs Will Never Get Us to AGI

"The path to superintelligence - just train up the LLMs, train on more synthetic data, hire thousands of people to school your system in post-training, invent new tweaks on RL - I think is complete bullshit. It's just never going to work."

After 12 years at Meta, Turing Award winner Yann LeCun is betting his legacy on a radically different vision of AI. In this conversation, he explains why Silicon Valley's obsession with scaling language models is a dead end, why the hardest problem in AI is reaching dog-level intelligence (not human-level), and why his new company AMI is building world models that predict in abstract representation space rather than generating pixels.


Timestamps

(00:00:14) – Intro and welcome

(00:01:12) – AMI: Why start a company now?

(00:04:46) – Will AMI do research in the open?

(00:06:44) – World models vs LLMs

(00:09:44) – History of self-supervised learning

(00:16:55) – Siamese networks and contrastive learning

(00:25:14) – JEPA and learning in representation space

(00:30:14) – Abstraction hierarchies in physics and AI

(00:34:01) – World models as abstract simulators

(00:38:14) – Object permanence and learning basic physics

(00:40:35) – Game AI: Why NetHack is still impossible

(00:44:22) – Moravec's Paradox and chess

(00:55:14) – AI safety by construction, not fine-tuning

(01:02:52) – Constrained generation techniques

(01:04:20) – Meta's reorganization and FAIR's future

(01:07:31) – SSI, Physical Intelligence, and Wayve

(01:10:14) – Silicon Valley's "LLM-pilled" monoculture

(01:15:56) – China vs US: The open source paradox

(01:18:14) – Why start a company at 65?

(01:25:14) – The AGI hype cycle has happened 6 times before

(01:33:18) – Family and personal background

(01:36:13) – Career advice: Learn things with a long shelf life

(01:40:14) – Neuroscience and machine learning connections

(01:48:17) – Continual learning: Is catastrophic forgetting solved?


Music:

"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed


About

The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.


2 weeks ago
1 hour 50 minutes 6 seconds

The Information Bottleneck
EP19: AI in Finance and Symbolic AI with Atlas Wang

Atlas Wang (UT Austin faculty, XTX Research Director) joins us to explore two fascinating frontiers: the foundations of symbolic AI and the practical challenges of building AI systems for quantitative finance.

On the symbolic AI side, Atlas shares his recent work proving that neural networks can learn symbolic equations through gradient descent, a surprising result given that gradient descent is continuous while symbolic structures are discrete. We talk about why neural nets learn clean, compositional mathematical structures at all, the mathematical tools involved, and the broader implications for understanding reasoning in AI systems.

The conversation then turns to neuro-symbolic approaches in practice: agents that discover rules through continued learning, propose them symbolically, verify them against domain-specific checkers, and refine their understanding.

On the finance side, Atlas pulls back the curtain on what AI research looks like at a high-frequency trading firm. The core problem sounds simple (predict future prices from past data), but the challenge is extreme: markets are dominated by noise, predictions hover near zero correlation, and success means eking out tiny margins across astronomical numbers of trades. He explains why synthetic data techniques that work elsewhere don't translate easily, and why XTX is building time series foundation models rather than adapting language models.

We also discuss the convergence of hiring between frontier AI labs and quantitative finance, and why this is an exceptional moment for ML researchers to consider the finance industry.

Links:

  • Why Neural Networks Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning - arxiv.org/abs/2506.21797
  • Atlas's website - https://www.vita-group.space/

Guest: Atlas Wang (UT Austin / XTX)

Hosts: Ravid Shwartz-Ziv & Allen Roush

Music: “Kid Kodi” — Blue Dot Sessions. Source: Free Music Archive. Licensed CC BY-NC 4.0.

3 weeks ago
1 hour 10 minutes 34 seconds

The Information Bottleneck
EP18: AI Robotics

In this episode, we hosted Judah Goldfeder, a PhD candidate at Columbia University and student researcher at Google, to discuss robotics, reproducibility in ML, and smart buildings.

Key topics covered:

Robotics challenges: We discussed why robotics remains harder than many expected, compared to LLMs. The real world is unpredictable and unforgiving, and mistakes have physical consequences. Sim-to-real transfer remains a major bottleneck because simulators are tedious to configure accurately for each robot and environment. Unlike text, robotics lacks foundation models, partly due to limited clean, annotated datasets and the difficulty of collecting diverse real-world data.

Reproducibility crisis: We discussed how self-reported benchmarks can lead to p-hacking and irreproducible results. Centralized evaluation systems (such as Kaggle or ImageNet challenges), where researchers submit algorithms for testing on hidden test sets, seem to drive faster progress.

Smart buildings: Judah's work at Google focuses on using ML to optimize HVAC systems, potentially reducing energy costs and carbon emissions significantly. The challenge is that every building is different, which makes simulation configuration extremely labor-intensive. Generative AI could help by automating the process of converting floor plans or images into accurate building simulations.

Links:

  • Judah's website - https://judahgoldfeder.com/

Music:

"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

1 month ago
1 hour 45 minutes 16 seconds

The Information Bottleneck
EP17: RL with Will Brown

In this episode, we talk with Will Brown, a research lead at Prime Intellect, about his journey into reinforcement learning (RL) and multi-agent systems, exploring their theoretical foundations and practical applications. We discuss the importance of RL in the current LLM pipeline and the challenges it faces, as well as applying agentic workflows to real-world applications and the ongoing evolution of AI development.

Chapters

00:00 Introduction to Reinforcement Learning and Will's Journey

03:10 Theoretical Foundations of Multi-Agent Systems

06:09 Transitioning from Theory to Practical Applications

09:01 The Role of Game Theory in AI

11:55 Exploring the Complexity of Games and AI

14:56 Optimization Techniques in Reinforcement Learning

17:58 The Evolution of RL in LLMs

21:04 Challenges and Opportunities in RL for LLMs

23:56 Key Components for Successful RL Implementation

27:00 Future Directions in Reinforcement Learning

36:29 Exploring Agentic Reinforcement Learning Paradigms

38:45 The Role of Intermediate Results in RL

41:16 Multi-Agent Systems: Challenges and Opportunities

45:08 Distributed Environments and Decentralized RL

49:31 Prompt Optimization Techniques in RL

52:25 Statistical Rigor in Evaluations

55:49 Future Directions in Reinforcement Learning

59:50 Task-Specific Models vs. General Models

01:02:04 Insights on Random Verifiers and Learning Dynamics

01:04:39 Real-World Applications of RL and Evaluation Challenges

01:05:58 Prime RL Framework: Goals and Trade-offs

01:10:38 Open Source vs. Closed Source Models

01:13:08 Continuous Learning and Knowledge Improvement

Music:

"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

1 month ago
1 hour 5 minutes 43 seconds

The Information Bottleneck
EP16: AI News and Papers

In this episode, we discuss various topics in AI, including the challenges of the conference review process, the capabilities of Kimi K2 Thinking, advancements in TPU technology, the significance of real-world data in robotics, and recent innovations in AI research. We also talk about the cool "Chain-of-Thought Hijacking" paper, how to use simple ideas to scale RL, and the implications of the Kosmos project, which aims to enable autonomous scientific discovery through AI.

Papers and links:

  • Chain-of-Thought Hijacking - https://arxiv.org/pdf/2510.26418
  • Kosmos: An AI Scientist for Autonomous Discovery - https://t.co/9pCr6AUXAe
  • JustRL: Scaling a 1.5B LLM with a Simple RL Recipe - https://relieved-cafe-fe1.notion.site/JustRL-Scaling-a-1-5B-LLM-with-a-Simple-RL-Recipe-24f6198b0b6b80e48e74f519bfdaf0a8

Chapters

00:00 Navigating the Peer Review Process

04:17 Kimi K2 Thinking: A New Era in AI

12:27 The Future of Tool Calls in AI

17:12 Exploring Google's New TPUs

22:04 The Importance of Real-World Data in Robotics

28:10 World Models: The Next Frontier in AI

31:36 Nvidia's Dominance in AI Partnerships

32:08 Exploring Recent AI Research Papers

37:46 Chain of Thought Hijacking: A New Threat

43:05 Simplifying Reinforcement Learning Training

54:03 Cosmos: AI for Autonomous Scientific Discovery

Music:

"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

1 month ago
59 minutes 20 seconds

The Information Bottleneck
EP15: The Information Bottleneck and Scaling Laws with Alex Alemi

In this episode, we sit down with Alex Alemi, an AI researcher at Anthropic (previously at Google Brain and Disney), to explore the powerful framework of the information bottleneck and its profound implications for modern machine learning.

We break down what the information bottleneck really means: a principled approach to retaining only the most informative parts of the data while compressing away the irrelevant. We discuss why compression is still important in our era of big data, how it prevents overfitting, and why it's essential for building models that generalize well.
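For reference, the classic information bottleneck objective (Tishby et al.'s standard formulation, not anything specific to this episode) seeks a representation Z of the input X that keeps information about the target Y while compressing away the rest:

```latex
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)
```

where I(·;·) denotes mutual information and β controls the trade-off between compressing X and retaining predictive information about Y.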

We also dive into scaling laws: why they matter, what we can learn from them, and what they tell us about the future of AI research.
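As a reference point (the standard empirical form, not a result from the episode), scaling laws express test loss as a power law in quantities like parameter count N; in the Kaplan et al. parameterization, for example:

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}
```

with fitted constants N_c and α_N; analogous laws hold for dataset size and training compute.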

Papers and links:

  • Alex's website - https://www.alexalemi.com/
  • Scaling exponents across parameterizations and optimizers - https://arxiv.org/abs/2407.05872
  • Deep Variational Information Bottleneck - https://arxiv.org/abs/1612.00410
  • Layer by Layer: Uncovering Hidden Representations in Language Models - https://arxiv.org/abs/2502.02013
  • Information in Infinite Ensembles of Infinitely-Wide Neural Networks - https://proceedings.mlr.press/v118/shwartz-ziv20a.html

Music:

“Kid Kodi” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

“Palms Down” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

1 month ago
1 hour 22 minutes 50 seconds

The Information Bottleneck
EP14: AI News and Papers

In this episode, we talked about AI news and recent papers. We explored the complexities of using AI models in healthcare (the Nature Medicine paper on GPT-5's fragile intelligence in medical contexts). We discussed the delicate balance between leveraging LLMs as powerful research tools and the risks of over-reliance, touching on issues such as hallucinations, medical disagreements among practitioners, and the need for better education on responsible AI use in healthcare.

We also talked about Stanford's "Cartridges" paper, which presents an innovative approach to long-context language models. The paper tackles the expensive computational costs of billion-token context windows by compressing KV caches through a clever "self-study" method using synthetic question-answer pairs and context distillation. We discussed the implications for personalization, composability, and making long-context models more practical.
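In rough notation (our paraphrase of the description above, not the paper's exact formulation), the self-study objective trains cartridge parameters Z so that the model conditioned on Z imitates the same model conditioned on the full corpus C, over synthetic questions q:

```latex
\min_{Z} \; \mathbb{E}_{q \,\sim\, \text{self-study}} \; \mathrm{KL}\big(\, p_\theta(\cdot \mid C, q) \;\big\|\; p_\theta(\cdot \mid Z, q) \,\big)
```

The compressed Z then replaces the expensive long-context KV cache at inference time.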

Additionally, we explored the "Continuous Autoregressive Language Models" paper and touched on insights from the Smol Training Playbook.

Papers discussed:

  • The fragile intelligence of GPT-5 in medicine: https://www.nature.com/articles/s41591-025-04008-8
  • Cartridges: Lightweight and general-purpose long context representations via self-study: https://arxiv.org/abs/2506.06266
  • Continuous Autoregressive Language Models: https://arxiv.org/abs/2510.27688
  • The Smol Training Playbook: https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook

Music:

“Kid Kodi” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

“Palms Down” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

This is an experimental format for us, just news and papers without a guest interview. Let us know what you think!

1 month ago
57 minutes 20 seconds

The Information Bottleneck
EP13: Recurrent-Depth Models and Latent Reasoning with Jonas Geiping

In this episode, we host Jonas Geiping from the ELLIS Institute and the Max Planck Institute for Intelligent Systems (Tübingen AI Center, Germany). We talk about his broad research on recurrent-depth models and latent reasoning in large language models (LLMs): what these models can and can't do, the challenges and next breakthroughs in the field, world models, and the future of developing better models. We also discuss safety and interpretability, and the role of scaling laws in AI development.
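To make the "recurrent depth" idea concrete, here is a toy numpy sketch (our illustration, not the paper's actual architecture): a single weight-tied block is iterated a variable number of times, so test-time compute can be scaled by running more iterations rather than by stacking distinct layers.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# One weight-tied "core" block, reused at every depth step.
W = rng.standard_normal((d, d)) / np.sqrt(d)

def core_step(state, embedded_input):
    # Residual update conditioned on the input embedding; tanh keeps each
    # step's contribution bounded (a toy stand-in for a transformer block).
    return state + np.tanh(state @ W + embedded_input)

def recurrent_depth_forward(x, num_iterations):
    # "Depth" is chosen at inference time: iterate the same block longer
    # to spend more compute on the same input.
    state = np.zeros_like(x)
    for _ in range(num_iterations):
        state = core_step(state, x)
    return state

x = rng.standard_normal(d)
shallow = recurrent_depth_forward(x, num_iterations=2)
deep = recurrent_depth_forward(x, num_iterations=32)
print(shallow.shape, deep.shape)
```

The appeal is that the parameter count stays fixed while the latent "reasoning" trajectory can be as long as the problem demands.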

Chapters

00:00 Introduction and Guest Introduction

01:03 Peer Review in Preprint Servers

06:57 New Developments in Coding Models

09:34 Open Source Models in Europe

11:00 Dynamic Layers in LLMs

26:05 Training Playbook Insights

30:05 Recurrent Depth Models and Reasoning Tasks

43:59 Exploring Recursive Reasoning Models

46:46 The Role of World Models in AI

48:41 Innovations in AI Training and Simulation

50:39 The Promise of Recurrent Depth Models

52:34 Navigating the Future of AI Algorithms

54:44 The Bitter Lesson of AI Development

59:11 Advising the Next Generation of Researchers

01:06:42 Safety and Interpretability in AI Models

01:10:46 Scaling Laws and Their Implications

01:16:19 The Role of PhDs in AI Research

Links and paper:

  • Jonas' website - https://jonasgeiping.github.io/
  • Scaling up test-time compute with latent reasoning: A recurrent depth approach - https://arxiv.org/abs/2502.05171
  • The Smol Training Playbook: The Secrets to Building World-Class LLMs - https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook
  • VaultGemma: A Differentially Private Gemma Model - https://arxiv.org/abs/2510.15001

Music:

“Kid Kodi” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

“Palms Down” — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Changes: trimmed

1 month ago
1 hour 21 minutes 15 seconds

The Information Bottleneck
EP12: Adversarial attacks and compression with Jack Morris

In this episode of the Information Bottleneck Podcast, we host Jack Morris, a PhD student at Cornell, to discuss adversarial examples (Jack created TextAttack, the first software package for LLM jailbreaking), the Platonic representation hypothesis, the implications of inversion techniques, and the role of compression in language models.

Links:

  • Jack's website - https://jxmo.io/
  • TextAttack - https://arxiv.org/abs/2005.05909
  • How much do language models memorize? - https://arxiv.org/abs/2505.24832
  • DeepSeek OCR - https://www.arxiv.org/abs/2510.18234

Chapters:

00:00 Introduction and AI News Highlights

04:53 The Importance of Fine-Tuning Models

10:01 Challenges in Open Source AI Models

14:34 The Future of Model Scaling and Sparsity

19:39 Exploring Model Routing and User Experience

24:34 Jack's Research: Text Attack and Adversarial Examples

29:33 The Platonic Representation Hypothesis

34:23 Implications of Inversion and Security in AI

39:20 The Role of Compression in Language Models

44:10 Future Directions in AI Research and Personalization

2 months ago
58 minutes 7 seconds

The Information Bottleneck
EP11: JEPA with Randall Balestriero

In this episode we talk with Randall Balestriero, an assistant professor at Brown University. We discuss the potential and challenges of Joint Embedding Predictive Architectures (JEPA). We explore the concept of JEPA, which aims to learn good data representations without reconstruction-based learning. We talk about the importance of understanding and compressing irrelevant details, the role of prediction tasks, and the challenges of preventing collapse.
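To make this concrete, here is a toy numpy sketch of a JEPA-style objective (our illustration under simplifying assumptions, not Balestriero's actual models): a context encoder embeds one view, a predictor predicts the embedding of a second view produced by a separate (EMA/frozen) target encoder, and the loss is computed in representation space rather than by reconstructing the input.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_in, dim_emb = 16, 4

# Toy linear "encoders" and predictor; stand-ins for deep networks.
W_context = rng.standard_normal((dim_in, dim_emb)) * 0.1
W_target = W_context.copy()  # EMA copy of the context encoder (kept frozen here)
W_pred = np.eye(dim_emb)     # predictor: context embedding -> target embedding

def jepa_loss(x_context, x_target):
    """Predict the *embedding* of the target view from the context view.
    No pixel/token reconstruction: the loss lives in representation space.
    A frozen/EMA target encoder is one common way to prevent collapse."""
    z_context = x_context @ W_context
    z_target = x_target @ W_target   # treated as a constant (no gradient)
    z_pred = z_context @ W_pred
    return float(np.mean((z_pred - z_target) ** 2))

# Two "views" of the same underlying sample (e.g. two crops or masked parts).
x = rng.standard_normal(dim_in)
view_a = x + 0.01 * rng.standard_normal(dim_in)
view_b = x + 0.01 * rng.standard_normal(dim_in)

loss = jepa_loss(view_a, view_b)
print(loss)
```

Because the loss compares embeddings, irrelevant details the encoder discards never need to be predicted; the collapse risk (both encoders mapping everything to a constant) is exactly what tricks like the EMA target address.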

2 months ago
1 hour 18 minutes 4 seconds

The Information Bottleneck
EP10: Geometric Deep Learning with Michael Bronstein

In this episode, we talked with Michael Bronstein, a professor of AI at the University of Oxford and a scientific director at AITHYRA, about the fascinating world of geometric deep learning. We explored how understanding the geometric structures in data can enhance the efficiency and accuracy of AI models. Michael shared insights on the limitations of small neural networks and the ongoing debate about the role of scaling in AI. We also talked about the future in scientific discovery, and the potential impact on fields like drug design and mathematics

2 months ago
1 hour 17 minutes 49 seconds

The Information Bottleneck
EP9: AI in Natural Sciences with Tal Kachman

In this episode we host Tal Kachman, an assistant professor at Radboud University, to explore the fascinating intersection of artificial intelligence and natural sciences. Prof. Kachman's research focuses on multiagent interaction, complex systems, and reinforcement learning. We dive deep into how AI is revolutionizing materials discovery, chemical dynamics modeling, and experimental design through self-driving laboratories. Prof. Kachman shares insights on the challenges of integrating physics and chemistry with AI systems, the critical role of high-throughput experimentation in accelerating scientific discovery, and the transformative potential of generative models to unlock new materials and functionalities.

2 months ago
1 hour 7 minutes 42 seconds

The Information Bottleneck
EP8: RL with Ahmad Beirami

In this episode, we talked with Ahmad Beirami, a former researcher at Google, about various topics in AI. We explored the complexities of reinforcement learning, its applications in LLMs, and the evaluation challenges in AI research. We also discussed the dynamics of academic conferences and the broken review system. Finally, we discussed how to integrate theory and practice in AI research and why the community should prioritize a deeper understanding over surface-level improvements.

2 months ago
1 hour 7 minutes 9 seconds

The Information Bottleneck
EP7: AI and Neuroscience with Aran Nayebi

In this episode of the "Information Bottleneck" podcast, we hosted Aran Nayebi, an assistant professor at Carnegie Mellon University, to discuss the intersection of computational neuroscience and machine learning. We talked about the challenges and opportunities in understanding intelligence through the lens of both biological and artificial systems. We covered topics such as the evolution of neural networks, the role of intrinsic motivation in AI, and the future of brain-machine interfaces.

3 months ago
1 hour 9 minutes 12 seconds

The Information Bottleneck
EP6: Urban Design Meets AI with Ariel Noyman

We talked with Ariel Noyman, an urban scientist working at the intersection of cities and technology. Ariel is a research scientist at the MIT Media Lab, exploring novel methods of urban modeling and simulation using AI. We discussed the potential of virtual environments to enhance urban design processes, the challenges associated with them, and the future of utilizing AI.

Links:

  • TravelAgent: Generative agents in the built environment - https://journals.sagepub.com/doi/10.1177/23998083251360458
  • Ariel Noyman's websites -
    • https://www.arielnoyman.com/
    • https://www.media.mit.edu/people/noyman/overview/
3 months ago
1 hour 7 minutes 5 seconds

The Information Bottleneck
EP5: Speculative Decoding with Nadav Timor

We discussed the inference optimization technique known as Speculative Decoding with a world-class researcher, expert, and former coworker of the podcast hosts: Nadav Timor.

Papers and links:

  • Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies, Timor et al., ICML 2025 - https://arxiv.org/abs/2502.05202
  • Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference, Timor et al., ICLR 2025 - https://arxiv.org/abs/2405.14105
  • Fast Inference from Transformers via Speculative Decoding, Leviathan et al., 2022 - https://arxiv.org/abs/2211.17192
  • FinePDFs - https://huggingface.co/datasets/HuggingFaceFW/finepdfs
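For readers new to the topic, the core accept/reject loop of speculative decoding (Leviathan et al.) can be sketched in a few lines. This is a toy illustration, not Timor et al.'s algorithms: fixed next-token distributions stand in for the draft and target LLMs, and the "bonus token" sampled when every draft is accepted is omitted.

```python
import random

random.seed(0)

VOCAB = ["a", "b", "c"]

# Toy stand-ins for a cheap draft LLM and an expensive target LLM: each maps
# a prefix to a next-token distribution. Real models condition on the prefix;
# these fixed distributions are purely for illustration.
def draft_model(prefix):
    return {"a": 0.6, "b": 0.3, "c": 0.1}

def target_model(prefix):
    return {"a": 0.5, "b": 0.4, "c": 0.1}

def sample(dist):
    return random.choices(VOCAB, weights=[dist[t] for t in VOCAB])[0]

def speculative_step(prefix, k=4):
    """Draft k tokens with the cheap model, then accept/reject so the
    accepted tokens are distributed exactly as if sampled from the target."""
    drafted = []
    for _ in range(k):
        drafted.append(sample(draft_model(prefix + drafted)))
    accepted = []
    for token in drafted:
        q = draft_model(prefix + accepted)   # draft dist at this position
        p = target_model(prefix + accepted)  # target dist at this position
        if random.random() < min(1.0, p[token] / q[token]):
            accepted.append(token)           # keep the drafted token
        else:
            # Resample from the residual max(0, p - q), renormalized,
            # and discard the remaining drafted tokens.
            residual = {t: max(0.0, p[t] - q[t]) for t in VOCAB}
            total = sum(residual.values())
            accepted.append(sample({t: residual[t] / total for t in VOCAB}))
            break
    return accepted

print(speculative_step([], k=4))
```

The draft model proposes tokens cheaply, the target model only scores them in parallel, and the accept/resample rule guarantees the output distribution matches sampling from the target model alone.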

3 months ago
1 hour 2 minutes 22 seconds

The Information Bottleneck
EP4: AI Coding

In this episode, Ravid and Allen discuss the evolving landscape of AI coding. They explore the rise of AI-assisted development tools, the challenges faced in software engineering, and the potential future of AI in creative fields. The conversation highlights both the benefits and limitations of AI in coding, emphasizing the need for careful consideration of its impact on the industry and society.

Chapters

00:00 Introduction to AI Coding and Recent Developments

03:10 OpenAI's Paper on Hallucinations in LLMs

06:03 Critique of OpenAI's Research Approach

08:50 Copyright Issues in AI Training Data

12:00 The Value of Data in AI Training

14:50 Watermarking AI Generated Content

17:54 The Future of AI Investment and Market Dynamics

20:49 AI Coding and Its Impact on Software Development

31:36 The Evolution of AI in Software Development

33:54 Vibe Coding: The Future or a Fad?

38:24 Navigating AI Tools: Personal Experiences and Challenges

41:53 The Limitations of AI in Complex Coding Tasks

46:52 Security Vulnerabilities in AI-Generated Code

50:28 The Role of Human Intuition in AI-Assisted Coding

53:28 The Impact of AI on Developer Productivity

56:53 The Future of AI in Creative Fields

3 months ago
1 hour 3 minutes 1 second

The Information Bottleneck
EP3: GPU Cloud

Allen and Ravid discuss the market dynamics driven by AI researchers' extreme need for GPUs.

4 months ago
1 hour 6 minutes 43 seconds

The Information Bottleneck
EP2: PeFT

Allen and Ravid sit down and talk about Parameter-Efficient Fine-Tuning (PeFT) along with the latest updates in AI/ML news.

4 months ago
1 hour 12 minutes 37 seconds

The Information Bottleneck
EP1: Sampling

Allen and Ravid discuss a topic near and dear to their hearts, LLM Sampling!

In this episode of the Information Bottleneck Podcast, Ravid Shwartz-Ziv and Allen Roush discuss the latest developments in AI, focusing on the controversial release of GPT-5 and its implications for users. They explore the future of large language models and the importance of sampling techniques in AI.

Chapters

00:00 Introduction to the Information Bottleneck Podcast

01:42 The GPT-5 Debacle: Expectations vs. Reality

05:48 Shifting Paradigms in AI Research

09:46 The Future of Large Language Models

12:56 OpenAI's New Model: A Mixed Bag

17:55 Corporate Dynamics in AI: Mergers and Acquisitions

21:39 The GPU Monopoly: Challenges and Opportunities

25:31 Deep Dive into Samplers in AI

35:38 Innovations in Sampling Techniques

42:31 Dynamic Sampling Methods and Their Implications

51:50 Learning Samplers: A New Frontier

59:51 Recent Papers and Their Impact on AI Research

4 months ago
1 hour 10 minutes 26 seconds
