Machine Learning Street Talk (MLST)
243 episodes
5 days ago
Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).
Technology
Episodes (20/243)
Machine Learning Street Talk (MLST)
Bayesian Brain, Scientific Method, and Models [Dr. Jeff Beck]

Dr. Jeff Beck, mathematician turned computational neuroscientist, joins us for a fascinating deep dive into why the future of AI might look less like ChatGPT and more like your own brain.


**SPONSOR MESSAGES START**

—

Prolific - Quality data. From real people. For faster breakthroughs.

https://www.prolific.com/?utm_source=mlst

—

**END**


*What if the key to building truly intelligent machines isn't bigger models, but smarter ones?*


In this conversation, Jeff makes a compelling case that we've been building AI backwards. While the tech industry races to scale up transformers and language models, Jeff argues we're missing something fundamental: the brain doesn't work like a giant prediction engine. It works like a scientist, constantly testing hypotheses about a world made of *objects* that interact through *forces* — not pixels and tokens.


*The Bayesian Brain* — Jeff explains how your brain is essentially running the scientific method on autopilot. When you combine what you see with what you hear, you're doing optimal Bayesian inference without even knowing it. This isn't just philosophy — it's backed by decades of behavioral experiments showing humans are surprisingly efficient at handling uncertainty.
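
To make the cue-combination point concrete, here is a minimal sketch (ours, not Jeff's; all numbers invented) of optimal Bayesian fusion of two noisy Gaussian cues, where each cue is weighted by its precision:

```python
import numpy as np

def combine_cues(mu_a, var_a, mu_b, var_b):
    """Optimal (Bayesian) fusion of two noisy Gaussian cues.

    Each cue is weighted by its precision (1/variance), so the
    more reliable signal dominates -- the behaviour decades of
    psychophysics experiments find in humans.
    """
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    mu = (w_a * mu_a + w_b * mu_b) / (w_a + w_b)
    var = 1.0 / (w_a + w_b)
    return mu, var

# Hypothetical example: vision locates a clap at 10 deg (reliable),
# audition says 20 deg (noisy). The fused estimate sits close to
# the visual cue, with lower variance than either cue alone.
print(combine_cues(10.0, 1.0, 20.0, 4.0))  # -> (12.0, 0.8)
```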


*AutoGrad Changed Everything* — Forget transformers for a moment. Jeff argues the real hero of the AI boom was automatic differentiation, which turned AI from a math problem into an engineering problem. But in the process, we lost sight of what actually makes intelligence work.
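
For readers who haven't met automatic differentiation: it propagates exact derivatives through ordinary code, which is what turned model training into an engineering exercise. A toy forward-mode version using dual numbers (illustrative only, nothing like a production autograd):

```python
from dataclasses import dataclass

@dataclass
class Dual:
    """Dual number: carries a value and its derivative together."""
    val: float
    dot: float  # derivative w.r.t. the chosen input

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def f(x):
    return x * x * x + x  # f(x) = x^3 + x

# Seed dot=1 to differentiate w.r.t. x; f'(2) = 3*4 + 1 = 13
x = Dual(2.0, 1.0)
print(f(x))  # Dual(val=10.0, dot=13.0)
```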


*The Cat in the Warehouse Problem* — Here's where it gets practical. Imagine a warehouse robot that's never seen a cat. Current AI would either crash or make something up. Jeff's approach? Build models that *know what they don't know*, can phone a friend to download new object models on the fly, and keep learning continuously. It's like giving robots the ability to say "wait, what IS that?" instead of confidently being wrong.
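
A minimal sketch of that "know what you don't know" policy, assuming a classifier that exposes a posterior over known objects; the entropy threshold and the phone-a-friend fallback are our hypothetical stand-ins, not Jeff's actual system:

```python
import numpy as np

def entropy(probs):
    """Shannon entropy of a categorical distribution (nats)."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def classify_or_ask(probs, known_labels, threshold=1.0):
    """Return a label only when the posterior is concentrated;
    otherwise admit ignorance instead of guessing."""
    if entropy(probs) > threshold:
        return "unknown -- phone a friend for a new object model"
    return known_labels[int(np.argmax(probs))]

# A near-uniform posterior over known objects = high entropy:
print(classify_or_ask(np.array([0.26, 0.25, 0.25, 0.24]),
                      ["box", "pallet", "forklift", "person"]))
```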


*Why Language is a Terrible Model for Thought* — In a provocative twist, Jeff argues that grounding AI in language (like we do with LLMs) is fundamentally misguided. Self-report is the least reliable data in psychology — people routinely explain their own behavior incorrectly. We should be grounding AI in physics, not words.


*The Future is Lots of Little Models* — Instead of one massive neural network, Jeff envisions AI systems built like video game engines: thousands of small, modular object models that can be combined, swapped, and updated independently. It's more efficient, more flexible, and much closer to how we actually think.


Rescript: https://app.rescript.info/public/share/D-b494t8DIV-KRGYONJghvg-aelMmxSDjKthjGdYqsE


---

TIMESTAMPS:

00:00:00 Introduction & The Bayesian Brain

00:01:25 Bayesian Inference & Information Processing

00:05:17 The Brain Metaphor: From Levers to Computers

00:10:13 Micro vs. Macro Causation & Instrumentalism

00:16:59 The Active Inference Community & AutoGrad

00:22:54 Object-Centered Models & The Grounding Problem

00:35:50 Scaling Bayesian Inference & Architecture Design

00:48:05 The Cat in the Warehouse: Solving Generalization

00:58:17 Alignment via Belief Exchange

01:05:24 Deception, Emergence & Cellular Automata


---

REFERENCES:

Paper:

[00:00:24] Zoubin Ghahramani (Google DeepMind)

https://pmc.ncbi.nlm.nih.gov/articles/PMC3538441/pdf/rsta201

[00:19:20] Mamba: Linear-Time Sequence Modeling

https://arxiv.org/abs/2312.00752

[00:27:36] xLSTM: Extended Long Short-Term Memory

https://arxiv.org/abs/2405.04517

[00:41:12] 3D Gaussian Splatting

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

[01:07:09] Lenia: Biology of Artificial Life

https://arxiv.org/abs/1812.05433

[01:08:20] Growing Neural Cellular Automata

https://distill.pub/2020/growing-ca/

[01:14:05] DreamCoder

https://arxiv.org/abs/2006.08381

[01:14:58] The Genomic Bottleneck

https://www.nature.com/articles/s41467-019-11786-6

Person:

[00:16:42] Karl Friston (UCL)

https://www.youtube.com/watch?v=PNYWi996Beg

1 week ago
1 hour 16 minutes 37 seconds

Machine Learning Street Talk (MLST)
Your Brain is Running a Simulation Right Now [Max Bennett]

Tim sits down with Max Bennett to explore how our brains evolved over 600 million years—and what that means for understanding both human intelligence and AI.


Max isn't a neuroscientist by training. He's a tech entrepreneur who got curious, started reading, and ended up weaving together three fields that rarely talk to each other: comparative psychology (what different animals can actually do), evolutionary neuroscience (how brains changed over time), and AI (what actually works in practice).


*Your Brain Is a Guessing Machine*

You don't actually "see" the world. Your brain builds a simulation of what it *thinks* is out there and just uses your eyes to check if it's right. That's why optical illusions work—your brain is filling in a triangle that isn't there, or can't decide if it's looking at a duck or a rabbit.


*Rats Have Regrets*

*Chimps Are Machiavellian*

*Language Is the Human Superpower*

*Does ChatGPT Think?*


(truncated description, more on rescript)


Understanding how the brain evolved isn't just about the past. It gives us clues about:

- What's actually different between human intelligence and AI

- Why we're so easily fooled by status games and tribal thinking

- What features we might want to build into—or leave out of—future AI systems


Get Max's book:

https://www.amazon.com/Brief-History-Intelligence-Humans-Breakthroughs/dp/0063286343


Rescript: https://app.rescript.info/public/share/R234b7AXyDXZusqQ_43KMGsUSvJ2TpSz2I3emnI6j9A


---

TIMESTAMPS:

00:00:00 Introduction: Outsider's Advantage & Neocortex Theories

00:11:34 Perception as Inference: The Filling-In Machine

00:19:11 Understanding, Recognition & Generative Models

00:36:39 How Mice Plan: Vicarious Trial & Error

00:46:15 Evolution of Self: The Layer 4 Mystery

00:58:31 Ancient Minds & The Social Brain: Machiavellian Apes

01:19:36 AI Alignment, Instrumental Convergence & Status Games

01:33:07 Metacognition & The IQ Paradox

01:48:40 Does GPT Have Theory of Mind?

02:00:40 Memes, Language Singularity & Brain Size Myths

02:16:44 Communication, Language & The Cyborg Future

02:44:25 Shared Fictions, World Models & The Reality Gap


---

REFERENCES:

Person:

[00:00:05] Karl Friston (UCL)

https://www.youtube.com/watch?v=PNYWi996Beg

[00:00:06] Jeff Hawkins

https://www.youtube.com/watch?v=6VQILbDqaI4

[00:12:19] Hermann von Helmholtz

https://plato.stanford.edu/entries/hermann-helmholtz/

[00:38:34] David Redish (U. Minnesota)

https://redishlab.umn.edu/

[01:10:19] Robin Dunbar

https://www.psy.ox.ac.uk/people/robin-dunbar

[01:15:04] Emil Menzel

https://www.sciencedirect.com/bookseries/behavior-of-nonhuman-primates/vol/5/suppl/C

[01:19:49] Nick Bostrom

https://nickbostrom.com/

[02:28:25] Noam Chomsky

https://linguistics.mit.edu/user/chomsky/

[03:01:22] Judea Pearl

https://samueli.ucla.edu/people/judea-pearl/

Concept/Framework:

[00:05:04] Active Inference

https://www.youtube.com/watch?v=KkR24ieh5Ow

Paper:

[00:35:59] Predictions not commands [Rick A Adams]

https://pubmed.ncbi.nlm.nih.gov/23129312/

Book:

[01:25:42] The Elephant in the Brain

https://www.amazon.com/Elephant-Brain-Hidden-Motives-Everyday/dp/0190495995

[01:28:27] The Status Game

https://www.goodreads.com/book/show/58642436-the-status-game

[02:00:40] The Selfish Gene

https://amazon.com/dp/0198788606

[02:14:25] The Language Game

https://www.amazon.com/Language-Game-Improvisation-Created-Changed/dp/1541674987

[02:54:40] The Evolution of Language

https://www.amazon.com/Evolution-Language-Approaches/dp/052167736X

[03:09:37] The Three-Body Problem

https://amazon.com/dp/0765377063

1 week ago
3 hours 17 minutes 9 seconds

Machine Learning Street Talk (MLST)
The 3 Laws of Knowledge [César Hidalgo]

César Hidalgo has spent years trying to answer a deceptively simple question: What is knowledge, and why is it so hard to move around?


We all have this intuition that knowledge is just... information. Write it down in a book, upload it to GitHub, train an AI on it—done. But César argues that's completely wrong. Knowledge isn't a thing you can copy and paste. It's more like a living organism that needs the right environment, the right people, and constant exercise to survive.


Guest: César Hidalgo, Director of the Center for Collective Learning


1. Knowledge Follows Laws (Like Physics)

2. You Can't Download Expertise

3. Why Big Companies Fail to Adapt

4. The "Infinite Alphabet" of Economies


If you think AI can just "copy" human knowledge, or that development is just about throwing money at poor countries, or that writing things down preserves them forever—this conversation will change your mind. Knowledge is fragile, specific, and collective. It decays fast if you don't use it.


The Infinite Alphabet [César A. Hidalgo]

https://www.penguin.co.uk/books/458054/the-infinite-alphabet-by-hidalgo-cesar-a/9780241655672

https://x.com/cesifoti


Rescript link:

https://app.rescript.info/public/share/eaBHbEo9xamwbwpxzcVVm4NQjMh7lsOQKeWwNxmw0JQ


---

TIMESTAMPS:

00:00:00 The Three Laws of Knowledge

00:02:28 Rival vs. Non-Rival: The Economics of Ideas

00:05:43 Why You Can't Just 'Download' Knowledge

00:08:11 The Detective Novel Analogy

00:11:54 Collective Learning & Organizational Networks

00:16:27 Architectural Innovation: Amazon vs. Barnes & Noble

00:19:15 The First Law: Learning Curves

00:23:05 The Samuel Slater Story: Treason & Memory

00:28:31 Physics of Knowledge: Joule's Cannon

00:32:33 Extensive vs. Intensive Properties

00:35:45 Knowledge Decay: Ise Temple & Polaroid

00:41:20 Absorptive Capacity: Sony & Donetsk

00:47:08 Disruptive Innovation & S-Curves

00:51:23 Team Size & The Cost of Innovation

00:57:13 Geography of Knowledge: Vespa's Origin

01:04:34 Migration, Diversity & 'Planet China'

01:12:02 Institutions vs. Knowledge: The China Story

01:21:27 Economic Complexity & The Infinite Alphabet

01:32:27 Do LLMs Have Knowledge?


---

REFERENCES:

Book:

[00:47:45] The Innovator's Dilemma (Christensen)

https://www.amazon.com/Innovators-Dilemma-Revolutionary-Change-Business/dp/0062060244

[00:55:15] Why Greatness Cannot Be Planned

https://amazon.com/dp/3319155237

[01:35:00] Why Information Grows

https://amazon.com/dp/0465048994

Paper:

[00:03:15] Endogenous Technological Change (Romer, 1990)

https://web.stanford.edu/~klenow/Romer_1990.pdf

[00:03:30] A Model of Growth Through Creative Destruction (Aghion & Howitt, 1992)

https://dash.harvard.edu/server/api/core/bitstreams/7312037d-2b2d-6bd4-e053-0100007fdf3b/content

[00:14:55] Organizational Learning: From Experience to Knowledge (Argote & Miron-Spektor, 2011)

https://www.researchgate.net/publication/228754233_Organizational_Learning_From_Experience_to_Knowledge

[00:17:05] Architectural Innovation (Henderson & Clark, 1990)

https://www.researchgate.net/publication/200465578_Architectural_Innovation_The_Reconfiguration_of_Existing_Product_Technologies_and_the_Failure_of_Established_Firms

[00:19:45] The Learning Curve Equation (Thurstone, 1916)

https://dn790007.ca.archive.org/0/items/learningcurveequ00thurrich/learningcurveequ00thurrich.pdf

[00:21:30] Factors Affecting the Cost of Airplanes (Wright, 1936)

https://pdodds.w3.uvm.edu/research/papers/others/1936/wright1936a.pdf

[00:52:45] Are Ideas Getting Harder to Find? (Bloom et al.)

https://web.stanford.edu/~chadj/IdeaPF.pdf

[01:33:00] LLMs/ Emergence

https://arxiv.org/abs/2506.11135

Person:

[00:25:30] Samuel Slater

https://en.wikipedia.org/wiki/Samuel_Slater

[00:42:05] Masaru Ibuka (Sony)

https://www.sony.com/en/SonyInfo/CorporateInfo/History/SonyHistory/1-02.html


1 week ago
1 hour 37 minutes 5 seconds

Machine Learning Street Talk (MLST)
"I Desperately Want To Live In The Matrix" - Dr. Mike Israetel

This is a lively, no-holds-barred debate about whether AI can truly be intelligent, conscious, or understand anything at all — and what happens when (or if) machines become smarter than us.


Dr. Mike Israetel is a sports scientist, entrepreneur, and co-founder of RP Strength (a fitness company). He describes himself as a "dilettante" in AI but brings a fascinating outsider's perspective.


He is joined by Jared Feather (IFBB Pro bodybuilder and exercise physiologist).


The Big Questions:


1. When is superintelligence coming?

2. Does AI actually understand anything?

3. The Simulation Debate (The Spiciest Part)

4. Will AI kill us all? (The Doomer Debate)

5. What happens to human jobs and purpose?

6. Do we need suffering?


Mike's channel: https://www.youtube.com/channel/UCfQgsKhHjSyRLOp9mnffqVg


RESCRIPT INTERACTIVE PLAYER: https://app.rescript.info/public/share/GVMUXHCqctPkXH8WcYtufFG7FQcdJew_RL_MLgMKU1U


---

TIMESTAMPS:

00:00:00 Introduction & Workout Demo

00:04:15 ASI Timelines & Definitions

00:10:24 The Embodiment Debate

00:18:28 Neutrinos & Abstract Knowledge

00:25:56 Can AI Learn From YouTube?

00:31:25 Diversity of Intelligence

00:36:00 AI Slop & Understanding

00:45:18 The Simulation Argument: Fire & Water

00:58:36 Consciousness & Zombies

01:04:30 Do Reasoning Models Actually Reason?

01:12:00 The Live Learning Problem

01:19:15 Superintelligence & Benevolence

01:28:59 What is True Agency?

01:37:20 Game Theory & The "Kill All Humans" Fallacy

01:48:05 Regulation & The China Factor

01:55:52 Mind Uploading & The Future of Love

02:04:41 Economics of ASI: Will We Be Useless?

02:13:35 The Matrix & The Value of Suffering

02:17:30 Transhumanism & Inequality

02:21:28 Debrief: AI Medical Advice & Final Thoughts


---

REFERENCES:

Paper:

[00:10:45] Alchemy and Artificial Intelligence (Dreyfus)

https://www.rand.org/content/dam/rand/pubs/papers/2006/P3244.pdf

[00:10:55] The Chinese Room Argument (John Searle)

https://home.csulb.edu/~cwallis/382/readings/482/searle.minds.brains.programs.bbs.1980.pdf

[00:11:05] The Symbol Grounding Problem (Stephen Harnad)

https://arxiv.org/html/cs/9906002

[00:23:00] Attention Is All You Need

https://arxiv.org/abs/1706.03762

[00:45:00] GPT-4 Technical Report

https://arxiv.org/abs/2303.08774

[01:45:00] Anthropic Agentic Misalignment Paper

https://www.anthropic.com/research/agentic-misalignment

[02:17:45] Retatrutide

https://pubmed.ncbi.nlm.nih.gov/37366315/

Organization:

[00:15:50] CERN

https://home.cern/

[01:05:00] METR Long Horizon Evaluations

https://evaluations.metr.org/

MLST Episode:

[00:23:10] MLST: Llion Jones - Inventors' Remorse

https://www.youtube.com/watch?v=DtePicx_kFY

[00:50:30] MLST: Blaise Agüera y Arcas Interview

https://www.youtube.com/watch?v=rMSEqJ_4EBk

[01:10:00] MLST: David Krakauer

https://www.youtube.com/watch?v=dY46YsGWMIc

Event:

[00:23:40] ARC Prize/Challenge

https://arcprize.org/

Book:

[00:24:45] The Brain Abstracted

https://www.amazon.com/Brain-Abstracted-Simplification-Philosophy-Neuroscience/dp/0262548046

[00:47:55] Pamela McCorduck

https://www.amazon.com/Machines-Who-Think-Artificial-Intelligence/dp/1568812051

[01:23:15] The Singularity Is Nearer (Ray Kurzweil)

https://www.amazon.com/Singularity-Nearer-Ray-Kurzweil-ebook/dp/B08Y6FYJVY

[01:27:35] A Fire Upon The Deep (Vernor Vinge)

https://www.amazon.com/Fire-Upon-Deep-S-F-MASTERWORKS-ebook/dp/B00AVUMIZE/

[02:04:50] Deep Utopia (Nick Bostrom)

https://www.amazon.com/Deep-Utopia-Meaning-Solved-World/dp/1646871642

[02:05:00] Technofeudalism (Yanis Varoufakis)

https://www.amazon.com/Technofeudalism-Killed-Capitalism-Yanis-Varoufakis/dp/1685891241

Visual Context Needed:

[00:29:40] AT-AT Walker (Star Wars)

https://starwars.fandom.com/wiki/All_Terrain_Armored_Transport

Person:

[00:33:15] Andrej Karpathy

https://karpathy.ai/

Video:

[01:40:00] Mike Israetel vs Liron Shapira AI Doom Debate

https://www.youtube.com/watch?v=RaDWSPMdM4o

Company:

[02:26:30] Examine.com

https://examine.com/

2 weeks ago
2 hours 55 minutes 46 seconds

Machine Learning Street Talk (MLST)
Making deep learning perform real algorithms with Category Theory (Andrew Dudzik, Petar Veličković, Taco Cohen, Bruno Gavranović, Paul Lessard)

We often think of Large Language Models (LLMs) as all-knowing, but as the team reveals, they still struggle with the logic of a second-grader. Why can’t ChatGPT reliably add large numbers? Why does it "hallucinate" the laws of physics? The answer lies in the architecture. This episode explores how *Category Theory* —an ultra-abstract branch of mathematics—could provide the "Periodic Table" for neural networks, turning the "alchemy" of modern AI into a rigorous science.


In this deep-dive exploration, *Andrew Dudzik*, *Petar Veličković*, *Taco Cohen*, *Bruno Gavranović*, and *Paul Lessard* join host *Tim Scarfe* to discuss the fundamental limitations of today’s AI and the radical mathematical framework that might fix them.


TRANSCRIPT:

https://app.rescript.info/public/share/LMreunA-BUpgP-2AkuEvxA7BAFuA-VJNAp2Ut4MkMWk


---


Key Insights in This Episode:


* *The "Addition" Problem:* *Andrew Dudzik* explains why LLMs don't actually "know" math—they just recognize patterns. When you change a single digit in a long string of numbers, the pattern breaks because the model lacks the internal "machinery" to perform a simple carry operation.

* *Beyond Alchemy:* deep learning is currently in its "alchemy" phase—we have powerful results, but we lack a unifying theory. Category Theory is proposed as the framework to move AI from trial-and-error to principled engineering. [00:13:49]

* *Algebra with Colors:* To make Category Theory accessible, the guests use brilliant analogies—like thinking of matrices as *magnets with colors* that only snap together when the types match. This "partial compositionality" is the secret to building more complex internal reasoning. [00:09:17]

* *Synthetic vs. Analytic Math:* *Paul Lessard* breaks down the philosophical shift needed in AI research: moving from "Analytic" math (what things are made of) to "Synthetic" math [00:23:41]
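
As promised above, the carry machinery in a dozen lines. The point is that correct addition is a tiny state machine, not a lookup: changing one digit can flip every output digit via carry propagation, which is exactly where pattern matching breaks:

```python
def add_digits(a, b):
    """Grade-school addition with an explicit carry register.

    The answer to 99999999 + 1 depends on propagating one carry
    all the way through, not on having seen a similar string.
    """
    a, b = a[::-1], b[::-1]           # least-significant digit first
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = carry
        s += int(a[i]) if i < len(a) else 0
        s += int(b[i]) if i < len(b) else 0
        out.append(str(s % 10))
        carry = s // 10
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

print(add_digits("99999999", "1"))  # -> 100000000
```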


---


Why This Matters for AGI

If we want AI to solve the world's hardest scientific problems, it can't just be a "stochastic parrot." It needs to internalize the rules of logic and computation. By imbuing neural networks with categorical priors, researchers are attempting to build a future where AI doesn't just predict the next word—it understands the underlying structure of the universe.


---

TIMESTAMPS:

00:00:00 The Failure of LLM Addition & Physics

00:01:26 Tool Use vs Intrinsic Model Quality

00:03:07 Efficiency Gains via Internalization

00:04:28 Geometric Deep Learning & Equivariance

00:07:05 Limitations of Group Theory

00:09:17 Category Theory: Algebra with Colors

00:11:25 The Systematic Guide of Lego-like Math

00:13:49 The Alchemy Analogy & Unifying Theory

00:15:33 Information Destruction & Reasoning

00:18:00 Pathfinding & Monoids in Computation

00:20:15 System 2 Reasoning & Error Awareness

00:23:31 Analytic vs Synthetic Mathematics

00:25:52 Morphisms & Weight Tying Basics

00:26:48 2-Categories & Weight Sharing Theory

00:28:55 Higher Categories & Emergence

00:31:41 Compositionality & Recursive Folds

00:34:05 Syntax vs Semantics in Network Design

00:36:14 Homomorphisms & Multi-Sorted Syntax

00:39:30 The Carrying Problem & Hopf Fibrations


Petar Veličković (GDM)

https://petar-v.com/

Paul Lessard

https://www.linkedin.com/in/paul-roy-lessard/

Bruno Gavranović

https://www.brunogavranovic.com/

Andrew Dudzik (GDM)

https://www.linkedin.com/in/andrew-dudzik-222789142/


---

REFERENCES:


Model:

[00:01:05] Veo

https://deepmind.google/models/veo/

[00:01:10] Genie

https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/

Paper:

[00:04:30] Geometric Deep Learning Blueprint

https://arxiv.org/abs/2104.13478

https://www.youtube.com/watch?v=bIZB1hIJ4u8

[00:16:45] AlphaGeometry

https://arxiv.org/abs/2401.08312

[00:16:55] AlphaCode

https://arxiv.org/abs/2203.07814

[00:17:05] FunSearch

https://www.nature.com/articles/s41586-023-06924-6

[00:37:00] Attention Is All You Need

https://arxiv.org/abs/1706.03762

[00:43:00] Categorical Deep Learning

https://arxiv.org/abs/2402.15332

2 weeks ago
43 minutes 57 seconds

Machine Learning Street Talk (MLST)
Are AI Benchmarks Telling The Full Story? [SPONSORED] (Andrew Gordon and Nora Petrova - Prolific)

Is a car that wins a Formula 1 race the best choice for your morning commute? Probably not. In this sponsored deep dive with Prolific, we explore why the same logic applies to Artificial Intelligence. While models are currently shattering records on technical exams, they often fail the most important test of all: **the human experience.**


Why High Benchmark Scores Don’t Mean Better AI


Joining us are **Andrew Gordon** (Staff Researcher in Behavioral Science) and **Nora Petrova** (AI Researcher) from **Prolific**. They reveal the hidden flaws in how we currently rank AI and introduce a more rigorous, "humane" way to measure whether these models are actually helpful, safe, and relatable for real people.


---


Key Insights in This Episode:


* *The F1 Car Analogy:* Andrew explains why a model that excels at "Humanity's Last Exam" might be a nightmare for daily use. Technical benchmarks often ignore the nuances of human communication and adaptability.

* *The "Wild West" of AI Safety:* As users turn to AI for sensitive topics like mental health, Nora highlights the alarming lack of oversight and the "thin veneer" of safety training—citing recent controversial incidents like Grok-3’s "Mecha Hitler."

* *Fixing the "Leaderboard Illusion":* The team critiques current popular rankings like Chatbot Arena, discussing how anonymous, unstratified voting can lead to biased results and how companies can "game" the system.

* *The Xbox Secret to AI Ranking:* Discover how Prolific uses *TrueSkill*—the same algorithm Microsoft developed for Xbox Live matchmaking—to create a fairer, more statistically sound leaderboard for LLMs (see the sketch after this list).

* *The Personality Gap:* Early data from the **HUMAINE Leaderboard** suggests that while AI is getting smarter, it is actually performing *worse* on metrics like personality, culture, and "sycophancy" (the tendency for models to become annoying "people-pleasers").
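
Here is that TrueSkill idea in miniature, using the open-source `trueskill` Python package (a plausible stand-in; we are not claiming this is Prolific's actual pipeline). Each model is a "player" and each human A-vs-B preference is a match result:

```python
import trueskill  # pip install trueskill

ratings = {"model_a": trueskill.Rating(),
           "model_b": trueskill.Rating()}

def record_vote(winner, loser):
    # Update both players' skill beliefs from one pairwise vote.
    ratings[winner], ratings[loser] = trueskill.rate_1vs1(
        ratings[winner], ratings[loser])

for _ in range(10):                 # ten voters prefer model A
    record_vote("model_a", "model_b")

# Conservative leaderboard score: mu - 3*sigma, the usual
# TrueSkill convention (penalises uncertain ratings).
for name, r in ratings.items():
    print(name, round(r.mu - 3 * r.sigma, 2))
```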


---


About the HUMAINE Leaderboard

Moving beyond simple "A vs. B" testing, the researchers discuss their new framework that samples participants based on *census data* (Age, Ethnicity, Political Alignment). By using a representative sample of the general public rather than just tech enthusiasts, they are building a standard that reflects the values of the real world.


*Are we building models for benchmarks, or are we building them for humans? It’s time to change the scoreboard.*


Rescript link:

https://app.rescript.info/public/share/IDqwjY9Q43S22qSgL5EkWGFymJwZ3SVxvrfpgHZLXQc


---

TIMESTAMPS:

00:00:00 Introduction & The Benchmarking Problem

00:01:58 The Fractured State of AI Evaluation

00:03:54 AI Safety & Interpretability

00:05:45 Bias in Chatbot Arena

00:06:45 Prolific's Three Pillars Approach

00:09:01 TrueSkill Ranking & Efficient Sampling

00:12:04 Census-Based Representative Sampling

00:13:00 Key Findings: Culture, Personality & Sycophancy


---

REFERENCES:

Paper:

[00:00:15] MMLU

https://arxiv.org/abs/2009.03300

[00:05:10] Constitutional AI

https://arxiv.org/abs/2212.08073

[00:06:45] The Leaderboard Illusion

https://arxiv.org/abs/2504.20879

[00:09:41] HUMAINE Framework Paper

https://huggingface.co/blog/ProlificAI/humaine-framework

Company:

[00:00:30] Prolific

https://www.prolific.com

[00:01:45] Chatbot Arena

https://lmarena.ai/

Person:

[00:00:35] Andrew Gordon

https://www.linkedin.com/in/andrew-gordon-03879919a/

[00:00:45] Nora Petrova

https://www.linkedin.com/in/nora-petrova/

Algorithm:

[00:09:01] Microsoft TrueSkill

https://www.microsoft.com/en-us/research/project/trueskill-ranking-system/

Leaderboard:

[00:09:21] Prolific HUMAINE Leaderboard

https://www.prolific.com/humaine

[00:09:31] HUMAINE HuggingFace Space

https://huggingface.co/spaces/ProlificAI/humaine-leaderboard

[00:10:21] Prolific AI Leaderboard Portal

https://www.prolific.com/leaderboard

Dataset:

[00:09:51] Prolific Social Reasoning RLHF Dataset

https://huggingface.co/datasets/ProlificAI/social-reasoning-rlhf

Organization:

[00:10:31] MLCommons

https://mlcommons.org/

2 weeks ago
16 minutes 4 seconds

Machine Learning Street Talk (MLST)
The Mathematical Foundations of Intelligence [Professor Yi Ma]

What if everything we think we know about AI understanding is wrong? Is compression the key to intelligence? Or is there something more—a leap from memorization to true abstraction?


In this fascinating conversation, we sit down with **Professor Yi Ma**—world-renowned expert in deep learning, IEEE/ACM Fellow, and author of the groundbreaking new book *Learning Deep Representations of Data Distributions*. Professor Ma challenges our assumptions about what large language models actually do, reveals why 3D reconstruction isn't the same as understanding, and presents a unified mathematical theory of intelligence built on just two principles: **parsimony** and **self-consistency**.


**SPONSOR MESSAGES START**

—

Prolific - Quality data. From real people. For faster breakthroughs.

https://www.prolific.com/?utm_source=mlst

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—

**END**


Key Insights:


**LLMs Don't Understand—They Memorize**

Language models process text (*already* compressed human knowledge) using the same mechanism we use to learn from raw data.


**The Illusion of 3D Vision**

Sora and NeRFs etc that can reconstruct 3D scenes still fail miserably at basic spatial reasoning


**"All Roads Lead to Rome"**

Why adding noise is *necessary* for discovering structure.


**Why Gradient Descent Actually Works**

Natural optimization landscapes are surprisingly smooth—a "blessing of dimensionality"


**Transformers from First Principles**

Transformer architectures can be mathematically derived from compression principles


—


INTERACTIVE AI TRANSCRIPT PLAYER w/REFS (ReScript):

https://app.rescript.info/public/share/Z-dMPiUhXaeMEcdeU6Bz84GOVsvdcfxU_8Ptu6CTKMQ


About Professor Yi Ma


Yi Ma is the inaugural director of the School of Computing and Data Science at the University of Hong Kong and a visiting professor at UC Berkeley.


https://people.eecs.berkeley.edu/~yima/

https://scholar.google.com/citations?user=XqLiBQMAAAAJ&hl=en

https://x.com/YiMaTweets


**Slides from this conversation:**

https://www.dropbox.com/scl/fi/sbhbyievw7idup8j06mlr/slides.pdf?rlkey=7ptovemezo8bj8tkhfi393fh9&dl=0


**Related Talks by Professor Ma:**

- Pursuing the Nature of Intelligence (ICLR): https://www.youtube.com/watch?v=LT-F0xSNSjo

- Earlier talk at Berkeley: https://www.youtube.com/watch?v=TihaCUjyRLM


TIMESTAMPS:

00:00:00 Introduction

00:02:08 The First Principles Book & Research Vision

00:05:21 Two Pillars: Parsimony & Consistency

00:09:50 Evolution vs. Learning: The Compression Mechanism

00:14:36 LLMs: Memorization Masquerading as Understanding

00:19:55 The Leap to Abstraction: Empirical vs. Scientific

00:27:30 Platonism, Deduction & The ARC Challenge

00:35:57 Specialization & The Cybernetic Legacy

00:41:23 Deriving Maximum Rate Reduction

00:48:21 The Illusion of 3D Understanding: Sora & NeRF

00:54:26 All Roads Lead to Rome: The Role of Noise

01:00:14 Benign Non-Convexity: Why Optimization Works

01:06:35 Double Descent & The Myth of Overfitting

01:14:26 Self-Consistency: Closed-Loop Learning

01:21:03 Deriving Transformers from First Principles

01:30:11 Verification & The Kevin Murphy Question

01:34:11 CRATE vs. ViT: White-Box AI & Conclusion


REFERENCES:

Book:

[00:03:04] Learning Deep Representations of Data Distributions

https://ma-lab-berkeley.github.io/deep-representation-learning-book/

[00:18:38] A Brief History of Intelligence

https://www.amazon.co.uk/BRIEF-HISTORY-INTELLIGEN-HB-Evolution/dp/0008560099

[00:38:14] Cybernetics

https://mitpress.mit.edu/9780262730099/cybernetics/

Book (Yi Ma):

[00:03:14] 3-D Vision book

https://link.springer.com/book/10.1007/978-0-387-21779-6

<TRUNC> refs on ReScript link/YT

3 weeks ago
1 hour 39 minutes 14 seconds

Machine Learning Street Talk (MLST)
Pedro Domingos: Tensor Logic Unifies AI Paradigms

Pedro Domingos, author of the bestselling book "The Master Algorithm," introduces his latest work: Tensor Logic - a new programming language he believes could become the fundamental language for artificial intelligence.


Think of it like this: Physics found its language in calculus. Circuit design found its language in Boolean logic. Pedro argues that AI has been missing its language - until now.


**SPONSOR MESSAGES START**

—

Build your ideas with AI Studio from Google - http://ai.studio/build

—

Prolific - Quality data. From real people. For faster breakthroughs.

https://www.prolific.com/?utm_source=mlst

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—

**END**


Current AI is split between two worlds that don't play well together:


Deep Learning (neural networks, transformers, ChatGPT) - great at learning from data, terrible at logical reasoning

Symbolic AI (logic programming, expert systems) - great at logical reasoning, terrible at learning from messy real-world data


Tensor Logic unifies both. It's a single language where you can:

Write logical rules that the system can actually learn and modify

Do transparent, verifiable reasoning (no hallucinations)

Mix "fuzzy" analogical thinking with rock-solid deduction


INTERACTIVE TRANSCRIPT:

https://app.rescript.info/public/share/NP4vZQ-GTETeN_roB2vg64vbEcN7isjJtz4C86WSOhw


TOC:

00:00:00 - Introduction

00:04:41 - What is Tensor Logic?

00:09:59 - Tensor Logic vs PyTorch & Einsum

00:17:50 - The Master Algorithm Connection

00:20:41 - Predicate Invention & Learning New Concepts

00:31:22 - Symmetries in AI & Physics

00:35:30 - Computational Reducibility & The Universe

00:43:34 - Technical Details: RNN Implementation

00:45:35 - Turing Completeness Debate

00:56:45 - Transformers vs Turing Machines

01:02:32 - Reasoning in Embedding Space

01:11:46 - Solving Hallucination with Deductive Modes

01:16:17 - Adoption Strategy & Migration Path

01:21:50 - AI Education & Abstraction

01:24:50 - The Trillion-Dollar Waste


REFS

Tensor Logic: The Language of AI [Pedro Domingos]

https://arxiv.org/abs/2510.12269

The Master Algorithm [Pedro Domingos]

https://www.amazon.co.uk/Master-Algorithm-Ultimate-Learning-Machine/dp/0241004543

Einsum is All you Need (TIM ROCKTÄSCHEL)

https://rockt.ai/2018/04/30/einsum

https://www.youtube.com/watch?v=6DrCq8Ry2cw

Autoregressive Large Language Models are Computationally Universal (Dale Schuurmans et al - GDM)

https://arxiv.org/abs/2410.03170

Memory Augmented Large Language Models are Computationally Universal [Dale Schuurmans]

https://arxiv.org/pdf/2301.04589

On the computational power of NNs [95/Siegelmann]

https://binds.cs.umass.edu/papers/1995_Siegelmann_JComSysSci.pdf

Sébastien Bubeck

https://www.reddit.com/r/OpenAI/comments/1oacp38/openai_researcher_sebastian_bubeck_falsely_claims/

I am a strange loop - Hofstadter

https://www.amazon.co.uk/Am-Strange-Loop-Douglas-Hofstadter/dp/0465030793

Stephen Wolfram

https://www.youtube.com/watch?v=dkpDjd2nHgo

The Complex World: An Introduction to the Foundations of Complexity Science [David C. Krakauer]

https://www.amazon.co.uk/Complex-World-Introduction-Foundations-Complexity/dp/1947864629

Geometric Deep Learning

https://www.youtube.com/watch?v=bIZB1hIJ4u8

Andrew Wilson (NYU)

https://www.youtube.com/watch?v=M-jTeBCEGHc

Yi Ma

https://www.patreon.com/posts/yi-ma-scientific-141953348

Roger Penrose - road to reality

https://www.amazon.co.uk/Road-Reality-Complete-Guide-Universe/dp/0099440687

Artificial Intelligence: A Modern Approach [Russell and Norvig]

https://www.amazon.co.uk/Artificial-Intelligence-Modern-Approach-Global/dp/1292153962

1 month ago
1 hour 27 minutes 48 seconds

Machine Learning Street Talk (MLST)
He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]

The Transformer architecture (which powers ChatGPT and nearly all modern AI) might be trapping the industry in a localized rut, preventing us from finding true intelligent reasoning, according to the person who co-invented it. Llion Jones and Luke Darlow, key figures at the research lab Sakana AI, join the show to make this provocative argument, and also introduce new research which might lead the way forwards.


**SPONSOR MESSAGES START**

—

Build your ideas with AI Studio from Google - http://ai.studio/build

—

Tufa AI Labs is hiring ML Research Engineers https://tufalabs.ai/

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—

**END**


The "Spiral" Problem – Llion uses a striking visual analogy to explain what current AI is missing. If you ask a standard neural network to understand a spiral shape, it solves it by drawing tiny straight lines that just happen to look like a spiral. It "fakes" the shape without understanding the concept of spiraling.


Introducing the Continuous Thought Machine (CTM) Luke Darlow deep dives into their solution: a biology-inspired model that fundamentally changes how AI processes information.


The Maze Analogy: Luke explains that standard AI tries to solve a maze by staring at the whole image and guessing the entire path instantly. Their new machine "walks" through the maze step-by-step.

Thinking Time: This allows the AI to "ponder." If a problem is hard, the model can naturally spend more time thinking about it before answering, effectively allowing it to correct its own mistakes and backtrack—something current Language Models struggle to do genuinely.
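
A toy of that variable thinking time, in the spirit of adaptive computation (cf. the Graves paper in the references below) rather than the actual CTM internals: keep applying an internal update until a confidence test passes, so harder inputs get more ticks:

```python
def ponder(state, step, confidence, max_ticks=50, threshold=0.9):
    """Spend variable 'thinking time' on one input: keep applying
    an internal update until the model is confident enough."""
    for tick in range(1, max_ticks + 1):
        state = step(state)
        if confidence(state) >= threshold:
            break
    return state, tick

# Hypothetical dynamics: state drifts toward a fixed point at 1.0,
# and confidence is simply closeness to that fixed point.
final, ticks = ponder(
    state=0.0,
    step=lambda s: s + 0.3 * (1.0 - s),
    confidence=lambda s: s,
)
print(f"answered after {ticks} internal ticks, state={final:.3f}")
```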


https://sakana.ai/

https://x.com/YesThisIsLion

https://x.com/LearningLukeD


TRANSCRIPT:

https://app.rescript.info/public/share/crjzQ-Jo2FQsJc97xsBdfzfOIeMONpg0TFBuCgV2Fu8


TOC:

00:00:00 - Stepping Back from Transformers

00:00:43 - Introduction to Continuous Thought Machines (CTM)

00:01:09 - The Changing Atmosphere of AI Research

00:04:13 - Sakana’s Philosophy: Research Freedom

00:07:45 - The Local Minimum of Large Language Models

00:18:30 - Representation Problems: The Spiral Example

00:29:12 - Technical Deep Dive: CTM Architecture

00:36:00 - Adaptive Computation & Maze Solving

00:47:15 - Model Calibration & Uncertainty

01:00:43 - Sudoku Bench: Measuring True Reasoning



REFS:

Why Greatness Cannot be planned [Kenneth Stanley]

https://www.amazon.co.uk/Why-Greatness-Cannot-Planned-Objective/dp/3319155237

https://www.youtube.com/watch?v=lhYGXYeMq_E


The Hardware Lottery [Sara Hooker]

https://arxiv.org/abs/2009.06489

https://www.youtube.com/watch?v=sQFxbQ7ade0


Continuous Thought Machines [Luke Darlow et al / Sakana]

https://arxiv.org/abs/2505.05522

https://sakana.ai/ctm/


LSTM: The Comeback Story? [Prof. Sepp Hochreiter]

https://www.youtube.com/watch?v=8u2pW2zZLCs


Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [Kumar/Stanley]

https://arxiv.org/pdf/2505.11581


A Spline Theory of Deep Networks [Randall Balestriero]

https://proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf

https://www.youtube.com/watch?v=86ib0sfdFtw

https://www.youtube.com/watch?v=l3O2J3LMxqI


On the Biology of a Large Language Model [Anthropic, Jack Lindsey et al]

https://transformer-circuits.pub/2025/attribution-graphs/biology.html


The ARC Prize 2024 Winning Algorithm [Daniel Franzen and Jan Disselhoff] “The ARChitects”

https://www.youtube.com/watch?v=mTX_sAq--zY


Neural Turing Machine [Graves]

https://arxiv.org/pdf/1410.5401


Adaptive Computation Time for Recurrent Neural Networks [Graves]

https://arxiv.org/abs/1603.08983


Sudoku Bench [Sakana]

https://pub.sakana.ai/sudoku/

1 month ago
1 hour 12 minutes 39 seconds

Machine Learning Street Talk (MLST)
Why Humans Are Still Powering AI [Sponsored]

Ever wonder where AI models actually get their "intelligence"? We reveal the dirty secret of Silicon Valley: behind every impressive AI system are thousands of real humans providing crucial data, feedback, and expertise.


Guest: Phelim Bradley, CEO and Co-founder of Prolific


Phelim Bradley runs Prolific, a platform that connects AI companies with verified human experts who help train and evaluate their models. Think of it as a sophisticated marketplace matching the right human expertise to the right AI task - whether that's doctors evaluating medical chatbots or coders reviewing AI-generated software.


Prolific: https://prolific.com/?utm_source=mlst

https://uk.linkedin.com/in/phelim-bradley-84300826


The discussion dives into:

**The human data pipeline**: How AI companies rely on human intelligence to train, refine, and validate their models - something rarely discussed openly

**Quality over quantity**: Why paying humans well and treating them as partners (not commodities) produces better AI training data

**The matching challenge**: How Prolific solves the complex problem of finding the right expert for each specific task, similar to matching Uber drivers to riders but with deep expertise requirements

**Future of work**: What it means when human expertise becomes an on-demand service, and why this might actually create more opportunities rather than fewer

**Geopolitical implications**: Why the centralization of AI development in US tech companies should concern Europe and the UK

2 months ago
24 minutes 19 seconds

Machine Learning Street Talk (MLST)
The Universal Hierarchy of Life - Prof. Chris Kempes [SFI]

"What is life?" - asks Chris Kempes, a professor at the Santa Fe Institute.


Chris explains that scientists are moving beyond a purely Earth-based, biological view and are searching for a universal theory of life that could apply to anything, anywhere in the universe. He proposes that things we don't normally consider "alive"—like human culture, language, or even artificial intelligence—could be seen as life forms existing on different "substrates".


To understand this, Chris presents a fascinating three-level framework:


- Materials: The physical stuff life is made of. He argues this could be incredibly diverse across the universe, and we shouldn't expect alien life to share our biochemistry.


- Constraints: The universal laws of physics (like gravity or diffusion) that all life must obey, regardless of what it's made of. This is where different life forms start to look more similar.


- Principles: At the highest level are abstract principles like evolution and learning. Chris suggests these computational or "optimization" rules are what truly define a living system.


A key idea is "convergence" – using the example of the eye. It's such a complex organ that you'd think it evolved only once. However, eyes evolved many separate times across different species. This is because the physics of light provides a clear "target", and evolution found similar solutions to the problem of seeing, even with different starting materials.



**SPONSOR MESSAGES**

—

Prolific - Quality data. From real people. For faster breakthroughs.

https://www.prolific.com/?utm_source=mlst

—

Check out NotebookLM from Google here - https://notebooklm.google.com/ - it’s really good for doing research directly from authoritative source material, minimising hallucinations.

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—


Prof. Chris Kempes:

https://www.santafe.edu/people/profile/chris-kempes


TRANSCRIPT:

https://app.rescript.info/public/share/Y2cI1i0nX_-iuZitvlguHvaVLQTwPX1Y_E1EHxV0i9I


TOC:

00:00:00 - Introduction to Chris Kempes and the Santa Fe Institute

00:02:28 - The Three Cultures of Science

00:05:08 - What Makes a Good Scientific Theory?

00:06:50 - The Universal Theory of Life

00:09:40 - The Role of Material in Life

00:12:50 - A Hierarchy for Understanding Life

00:13:55 - How Life Diversifies and Converges

00:17:53 - Adaptive Processes and Defining Life

00:19:28 - Functionalism, Memes, and Phylogenies

00:22:58 - Convergence at Multiple Levels

00:25:45 - The Possibility of Simulating Life

00:28:16 - Intelligence, Parasitism, and Spectrums of Life

00:32:39 - Phase Changes in Evolution

00:36:16 - The Separation of Matter and Logic

00:37:21 - Assembly Theory and Quantifying Complexity


REFS:

Developing a predictive science of the biosphere requires the integration of scientific cultures [Kempes et al]

https://www.pnas.org/doi/10.1073/pnas.2209196121


Seeing with an extra sense (“Dangerous prediction”) [Rob Phillips]

https://www.sciencedirect.com/science/article/pii/S0960982224009035


The Multiple Paths to Multiple Life [Christopher P. Kempes & David C. Krakauer]

https://link.springer.com/article/10.1007/s00239-021-10016-2


The Information Theory of Individuality [David Krakauer et al]

https://arxiv.org/abs/1412.2447


Minds, Brains and Programs [Searle]

https://home.csulb.edu/~cwallis/382/readings/482/searle.minds.brains.programs.bbs.1980.pdf


The error threshold

https://www.sciencedirect.com/science/article/abs/pii/S0168170204003843


Assembly theory and its relationship with computational complexity [Kempes et al]

https://arxiv.org/abs/2406.12176

2 months ago
40 minutes 59 seconds

Machine Learning Street Talk (MLST)
Google Researcher Shows Life "Emerges From Code" - Blaise Agüera y Arcas

Blaise Agüera y Arcas explores some mind-bending ideas about what intelligence and life really are—and why they might be more similar than we think (filmed at ALIFE conference, 2025 - https://2025.alife.org/).


Life and intelligence are both fundamentally computational (he says). From the very beginning, living things have been running programs. Your DNA? It's literally a computer program, and the ribosomes in your cells are tiny universal computers building you according to those instructions.


**SPONSOR MESSAGES**

—

Prolific - Quality data. From real people. For faster breakthroughs.

https://www.prolific.com/?utm_source=mlst

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—


Blaise argues that there is more to evolution than the random mutation most people picture. The secret to increasing complexity is *merging*, i.e. when different organisms or systems come together and combine their histories and capabilities.


Blaise describes his "BFF" experiment where random computer code spontaneously evolved into self-replicating programs, showing how purpose and complexity can emerge from pure randomness through computational processes.


https://en.wikipedia.org/wiki/Blaise_Ag%C3%BCera_y_Arcas

https://x.com/blaiseaguera?lang=en


TRANSCRIPT:

https://app.rescript.info/public/share/VX7Gktfr3_wIn4Bj7cl9StPBO1MN4R5lcJ11NE99hLg


TOC:

00:00:00 Introduction - New book "What is Intelligence?"

00:01:45 Life as computation - Von Neumann's insights

00:12:00 BFF experiment - How purpose emerges

00:26:00 Symbiogenesis and evolutionary complexity

00:40:00 Functionalism and consciousness

00:49:45 AI as part of collective human intelligence

00:57:00 Comparing AI and human cognition


REFS:

What is intelligence [Blaise Agüera y Arcas]

https://whatisintelligence.antikythera.org/ [Read free online, interactive rich media]

https://mitpress.mit.edu/9780262049955/what-is-intelligence/ [MIT Press]


Large Language Models and Emergence: A Complex Systems Perspective

https://arxiv.org/abs/2506.11135


Our first Noam Chomsky MLST interview

https://www.youtube.com/watch?v=axuGfh4UR9Q


Chance and Necessity [Jacques Monod]

https://monoskop.org/images/9/99/Monod_Jacques_Chance_and_Necessity.pdf


Wonderful Life: The Burgess Shale and the History of Nature [Stephen Jay Gould]

https://www.amazon.co.uk/Wonderful-Life-Burgess-Nature-History/dp/0099273454


The major evolutionary transitions [E Szathmáry, J M Smith]

https://wiki.santafe.edu/images/0/0e/Szathmary.MaynardSmith_1995_Nature.pdf


Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle [Dan Everett]

https://www.amazon.com/Dont-Sleep-There-Are-Snakes/dp/0307386120


The Nature of Technology: What It Is and How It Evolves [W. Brian Arthur]

https://www.amazon.com/Nature-Technology-What-How-Evolves-ebook/dp/B002RI9W16/


The MANIAC [Benjamin Labatut]

https://www.amazon.com/MANIAC-Benjam%C3%ADn-Labatut/dp/1782279814


When We Cease to Understand the World [Benjamin Labatut]

https://www.amazon.com/When-We-Cease-Understand-World/dp/1681375664/


The Boys in the Boat [Daniel James Brown]

https://www.amazon.com/Boys-Boat-Americans-Berlin-Olympics/dp/0143125478


[Petter Johansson] (Split brain)

https://www.lucs.lu.se/fileadmin/user_upload/lucs/2011/01/Johansson-et-al.-2006-How-Something-Can-Be-Said-About-Telling-More-Than-We-Can-Know.pdf


If Anyone Builds It, Everyone Dies [Eliezer Yudkowsky, Nate Soares]

https://www.amazon.com/Anyone-Builds-Everyone-Dies-Superhuman/dp/0316595640


The science of cycology

https://link.springer.com/content/pdf/10.3758/bf03195929.pdf


<trunc, see YT desc for more>


2 months ago
59 minutes 53 seconds

Machine Learning Street Talk (MLST)
The Secret Engine of AI - Prolific [Sponsored] (Sara Saab, Enzo Blindow)

We sat down with Sara Saab (VP of Product at Prolific) and Enzo Blindow (VP of Data and AI at Prolific) to explore the critical role of human evaluation in AI development and the challenges of aligning AI systems with human values. Prolific is a human annotation and orchestration platform for AI used by many of the major AI labs. This is a sponsored show in partnership with Prolific.


**SPONSOR MESSAGES**

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—


While technologists want to remove humans from the loop for speed and efficiency, these non-deterministic AI systems actually require more human oversight than ever before. Prolific's approach is to put "well-treated, verified, diversely demographic humans behind an API" - making human feedback as accessible as any other infrastructure service.


When AI models like Grok 4 achieve top scores on technical benchmarks but feel awkward or problematic to use in practice, it exposes the limitations of our current evaluation methods. The guests argue that optimizing for benchmarks may actually weaken model performance in other crucial areas, like cultural sensitivity or natural conversation.


We also discuss Anthropic's research showing that frontier AI models, when given goals and access to information, independently arrived at solutions involving blackmail - without any prompting toward unethical behavior. Even more concerning, the more sophisticated the model, the more susceptible it was to this "agentic misalignment."


Enzo and Sara present Prolific's "HUMAINE" leaderboard as an alternative to existing benchmarking systems. By stratifying evaluations across diverse demographic groups, they reveal that different populations have vastly different experiences with the same AI models.


Looking ahead, the guests imagine a world where humans take on coaching and teaching roles for AI systems - similar to how we might correct a child or review code. This also raises important questions about working conditions and the evolution of labor in an AI-augmented world. Rather than replacing humans entirely, we may be moving toward more sophisticated forms of human-AI collaboration.


As AI tech becomes more powerful and general-purpose, the quality of human evaluation becomes more critical, not less. We need more representative evaluation frameworks that capture the messy reality of human values and cultural diversity.


Visit Prolific:

https://www.prolific.com/

Sara Saab (VP Product):

https://uk.linkedin.com/in/sarasaab


Enzo Blindow (VP Data & AI):

https://uk.linkedin.com/in/enzoblindow


TRANSCRIPT:

https://app.rescript.info/public/share/xZ31-0kJJ_xp4zFSC-bunC8-hJNkHpbm7Lg88RFcuLE


TOC:

[00:00:00] Intro & Background

[00:03:16] Human-in-the-Loop Challenges

[00:17:19] Can AIs Understand?

[00:32:02] Benchmarking & Vibes

[00:51:00] Agentic Misalignment Study

[01:03:00] Data Quality vs Quantity

[01:16:00] Future of AI Oversight


REFS:

Anthropic Agentic Misalignment

https://www.anthropic.com/research/agentic-misalignment


Value Compass

https://arxiv.org/pdf/2409.09586


Reasoning Models Don’t Always Say What They Think (Anthropic)

https://www.anthropic.com/research/reasoning-models-dont-say-think

https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf


Apollo research - science of evals blog post

https://www.apolloresearch.ai/blog/we-need-a-science-of-evals


Leaderboard Illusion

https://www.youtube.com/watch?v=9W_OhS38rIE MLST video


The Leaderboard Illusion [2025]

Shivalika Singh et al

https://arxiv.org/abs/2504.20879


(Truncated, full list on YT)



2 months ago
1 hour 19 minutes 39 seconds

Machine Learning Street Talk (MLST)
AI Agents Can Code 10,000 Lines of Hacking Tools In Seconds - Dr. Ilia Shumailov (ex-GDM)

Dr. Ilia Shumailov - Former DeepMind AI Security Researcher, now building security tools for AI agents


Ever wondered what happens when AI agents start talking to each other—or worse, when they start breaking things? Ilia Shumailov spent years at DeepMind thinking about exactly these problems, and he's here to explain why securing AI is way harder than you think.


**SPONSOR MESSAGES**

—

Check out NotebookLM for your research project, it's really powerful: https://notebooklm.google.com/

—

Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—


We're racing toward a world where AI agents will handle our emails, manage our finances, and interact with sensitive data 24/7. But there is a problem. These agents are nothing like human employees. They never sleep, they can touch every endpoint in your system simultaneously, and they can generate sophisticated hacking tools in seconds. Traditional security measures designed for humans simply won't work.


Dr. Ilia Shumailov

https://x.com/iliaishacked

https://iliaishacked.github.io/

https://sequrity.ai/


TRANSCRIPT:

https://app.rescript.info/public/share/dVGsk8dz9_V0J7xMlwguByBq1HXRD6i4uC5z5r7EVGM


TOC:

00:00:00 - Introduction & Trusted Third Parties via ML

00:03:45 - Background & Career Journey

00:06:42 - Safety vs Security Distinction

00:09:45 - Prompt Injection & Model Capability

00:13:00 - Agents as Worst-Case Adversaries

00:15:45 - Personal AI & CAML System Defense

00:19:30 - Agents vs Humans: Threat Modeling

00:22:30 - Calculator Analogy & Agent Behavior

00:25:00 - IMO Math Solutions & Agent Thinking

00:28:15 - Diffusion of Responsibility & Insider Threats

00:31:00 - Open Source Security Concerns

00:34:45 - Supply Chain Attacks & Trust Issues

00:39:45 - Architectural Backdoors

00:44:00 - Academic Incentives & Defense Work

00:48:30 - Semantic Censorship & Halting Problem

00:52:00 - Model Collapse: Theory & Criticism

00:59:30 - Career Advice & Ross Anderson Tribute


REFS:

Lessons from Defending Gemini Against Indirect Prompt Injections

https://arxiv.org/abs/2505.14534


Defeating Prompt Injections by Design.

Debenedetti, E., Shumailov, I., Fan, T., Hayes, J., Carlini, N., Fabian, D., Kern, C., Shi, C., Terzis, A., & Tramèr, F.

https://arxiv.org/pdf/2503.18813


Agentic Misalignment: How LLMs could be insider threats

https://www.anthropic.com/research/agentic-misalignment


STOP ANTHROPOMORPHIZING INTERMEDIATE TOKENS AS REASONING/THINKING TRACES!

Subbarao Kambhampati et al

https://arxiv.org/pdf/2504.09762


Meiklejohn, S., Blauzvern, H., Maruseac, M., Schrock, S., Simon, L., & Shumailov, I. (2025).

Machine learning models have a supply chain problem.

https://arxiv.org/abs/2505.22778


Gao, Y., Shumailov, I., & Fawaz, K. (2025).

Supply-chain attacks in machine learning frameworks.

https://openreview.net/pdf?id=EH5PZW6aCr


Apache Log4j Vulnerability Guidance

https://www.cisa.gov/news-events/news/apache-log4j-vulnerability-guidance


Bober-Irizar, M., Shumailov, I., Zhao, Y., Mullins, R., & Papernot, N. (2022).

Architectural backdoors in neural networks.

https://arxiv.org/pdf/2206.07840


Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches

David Glukhov, Ilia Shumailov, ...

https://proceedings.mlr.press/v235/glukhov24a.html


AlphaEvolve MLST interview [Matej Balog, Alexander Novikov]

https://www.youtube.com/watch?v=vC9nAosXrJw

3 months ago
1 hour 1 minute 7 seconds

Machine Learning Street Talk (MLST)
New top score on ARC-AGI-2-pub (29.4%) - Jeremy Berman

We need AI systems to synthesise new knowledge, not just compress the data they see. Jeremy Berman is a research scientist at Reflection AI and recent winner of the ARC-AGI v2 public leaderboard.

**SPONSOR MESSAGES**

—

Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—

Imagine trying to teach an AI to think like a human, i.e. solving puzzles that are easy for us but stump even the smartest models. Jeremy's evolutionary approach—evolving natural language descriptions instead of Python code like his last version—landed him at the top with about 30% accuracy on ARC v2 (see the toy sketch after the references below).

We discuss why current AIs are like "stochastic parrots" that memorize but struggle to truly reason or innovate, as well as big ideas like building "knowledge trees" for real understanding, the limits of neural networks versus symbolic systems, and whether we can train models to synthesize new ideas without forgetting everything else.

Jeremy Berman:

https://x.com/jerber888

TRANSCRIPT:

https://app.rescript.info/public/share/qvCioZeZJ4Q_NlR66m-hNUZnh-qWlUJcS15Wc2OGwD0

TOC:

Introduction and Overview [00:00:00]

ARC v1 Solution [00:07:20]

Evolutionary Python Approach [00:08:00]

Trade-offs in Depth vs. Breadth [00:10:33]

ARC v2 Improvements [00:11:45]

Natural Language Shift [00:12:35]

Model Thinking Enhancements [00:13:05]

Neural Networks vs. Symbolism Debate [00:14:24]

Turing Completeness Discussion [00:15:24]

Continual Learning Challenges [00:19:12]

Reasoning and Intelligence [00:29:33]

Knowledge Trees and Synthesis [00:50:15]

Creativity and Invention [00:56:41]

Future Directions and Closing [01:02:30]

REFS:

Jeremy's 2024 article on winning ARC-AGI-1-pub
https://jeremyberman.substack.com/p/how-i-got-a-record-536-on-arc-agi

Getting 50% (SoTA) on ARC-AGI with GPT-4o [Greenblatt]
https://blog.redwoodresearch.org/p/getting-50-sota-on-arc-agi-with-gpt
https://www.youtube.com/watch?v=z9j3wB1RRGA [his MLST interview]

A Thousand Brains: A New Theory of Intelligence [Hawkins]
https://www.amazon.com/Thousand-Brains-New-Theory-Intelligence/dp/1541675819
https://www.youtube.com/watch?v=6VQILbDqaI4 [MLST interview]

Francois Chollet + Mike Knoop's lab
https://ndea.com/

On the Measure of Intelligence [Chollet]
https://arxiv.org/abs/1911.01547

On the Biology of a Large Language Model [Anthropic]
https://transformer-circuits.pub/2025/attribution-graphs/biology.html

The ARChitects [won 2024 ARC-AGI-1-private]
https://www.youtube.com/watch?v=mTX_sAq--zY

Connectionism critique 1988 [Fodor/Pylyshyn]
https://uh.edu/~garson/F&P1.PDF

Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [Kumar/Stanley]
https://arxiv.org/pdf/2505.11581

AlphaEvolve interview (also program synthesis)
https://www.youtube.com/watch?v=vC9nAosXrJw

ShinkaEvolve: Evolving New Algorithms with LLMs, Orders of Magnitude More Efficiently [Lange et al]
https://sakana.ai/shinka-evolve/

Deep Learning with Python, Rev 3 [Chollet] - READ CHAPTER 19 NOW!
https://deeplearningwithpython.io/
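To make the select-and-mutate loop concrete, here is a toy, runnable analogue (our illustration, not Jeremy's actual code): candidates are plain strings and fitness is similarity to a hidden target. In the real system the candidates are natural-language transformation descriptions, the mutation step is an LLM revising a description, and fitness is how many ARC training pairs the description reproduces.

```python
# Toy analogue of an evolutionary search over candidate solutions.
# Candidates here are strings; in the real system they would be
# natural-language rule descriptions scored on ARC training pairs.
import random
import string

TARGET = "rotate the grid and recolor the largest shape"
ALPHABET = string.ascii_lowercase + " "
POP, KEEP, GENS = 50, 10, 400

def fitness(cand):
    # Fraction of positions matching the target (stands in for the
    # fraction of training pairs a description solves).
    return sum(a == b for a, b in zip(cand, TARGET)) / len(TARGET)

def mutate(parent):
    # Random single-character edit (stands in for an LLM revising a description).
    i = random.randrange(len(parent))
    return parent[:i] + random.choice(ALPHABET) + parent[i + 1:]

population = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(POP)]
for gen in range(GENS):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == 1.0:
        break  # perfect score on "training pairs"; stop evolving
    parents = population[:KEEP]  # keep the fittest candidates
    population = parents + [mutate(random.choice(parents)) for _ in range(POP - KEEP)]

best = max(population, key=fitness)
print(f"gen {gen}: fitness={fitness(best):.2f} best='{best}'")
```

The design point the episode emphasizes is the representation: evolving readable descriptions rather than Python programs lets the LLM do semantically meaningful mutations instead of random syntactic ones.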

3 months ago
1 hour 8 minutes 27 seconds

Machine Learning Street Talk (MLST)
Deep Learning is Not So Mysterious or Different - Prof. Andrew Gordon Wilson (NYU)

Professor Andrew Wilson from NYU explains why many common-sense ideas in artificial intelligence might be wrong. For decades, the rule of thumb in machine learning has been to fear complexity. The thinking goes: if your model has too many parameters (is "too complex") for the amount of data you have, it will "overfit" by essentially memorizing the data instead of learning the underlying patterns. This leads to poor performance on new, unseen data. This is the classic "bias-variance trade-off": a balancing act between a model that's too simple and one that's too complex.


**SPONSOR MESSAGES**

—

Tufa AI Labs is an AI research lab based in Zurich. **They are hiring ML research engineers!**

This is a once-in-a-lifetime opportunity to work with one of the best labs in Europe.

Contact Benjamin Crouzier - https://tufalabs.ai/

—

Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!

—

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

—


Description Continued:


Professor Wilson challenges this fundamental belief (fearing complexity). He makes a few surprising points:


**Bigger Can Be Better**: massive models don't just get more flexible; they also develop a stronger "simplicity bias". So, if your model is overfitting, the solution might paradoxically be to make it even bigger.


**The "Bias-Variance Trade-off" is a Misnomer**: Wilson claims you don't actually have to trade one for the other. You can have a model that is incredibly expressive and flexible while also being strongly biased toward simple solutions. He points to the "double descent" phenomenon, where performance first gets worse as models get more complex, but then surprisingly starts getting better again.
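A self-contained way to see double descent for yourself (our illustration, not code from the paper): fit minimum-norm least squares on random ReLU features and watch test error typically rise toward the interpolation threshold, then fall again as the model grows past it.

```python
# Double-descent sketch: minimum-norm least squares on random ReLU features.
# Test error usually spikes as the feature count approaches the number of
# training points, then improves again in the overparameterized regime.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 100, 2000, 5
w_true = rng.normal(size=d)  # hidden "teacher" direction

def make_data(n):
    X = rng.normal(size=(n, d))
    y = np.sin(X @ w_true) + 0.1 * rng.normal(size=n)
    return X, y

X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

for n_feat in [10, 50, 90, 100, 110, 150, 500, 2000]:
    W = rng.normal(size=(d, n_feat))      # fixed random first layer
    Phi_tr = np.maximum(X_tr @ W, 0.0)    # ReLU random features
    Phi_te = np.maximum(X_te @ W, 0.0)
    beta = np.linalg.pinv(Phi_tr) @ y_tr  # minimum-norm (interpolating) solution
    mse = np.mean((Phi_te @ beta - y_te) ** 2)
    print(f"{n_feat:5d} features -> test MSE {mse:.3f}")
```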


**Honest Beliefs and Bayesian Thinking**: His core philosophy is that we should build models that honestly represent our beliefs about the world. We believe the world is complex, so our models should be expressive. But we also believe in Occam's razor—that the simplest explanation is often the best. He champions Bayesian methods, which naturally balance these two ideas through a process called marginalization, which he describes as an automatic Occam's razor.
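A worked form of that "automatic Occam's razor" (a standard statement, not a formula quoted in the episode): the quantity computed by marginalization is the model evidence, and because the evidence must normalize over all conceivable datasets, raw flexibility is automatically taxed.

```latex
\begin{equation}
  p(\mathcal{D}\mid\mathcal{M})
    = \int p(\mathcal{D}\mid\theta,\mathcal{M})\,
           p(\theta\mid\mathcal{M})\,\mathrm{d}\theta ,
  \qquad
  \sum_{\mathcal{D}'} p(\mathcal{D}'\mid\mathcal{M}) = 1 .
\end{equation}
```

A model flexible enough to fit every dataset must spread this unit of probability mass thinly, so it cannot score highly on the particular dataset observed; a rigid model concentrates its mass but may miss the data entirely. The evidence therefore favors the simplest model that still explains the data.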


TOC:


[00:00:00] Introduction and Thesis

[00:04:19] Challenging Conventional Wisdom

[00:11:17] The Philosophy of a Scientist-Engineer

[00:16:47] Expressiveness, Overfitting, and Bias

[00:28:15] Understanding, Compression, and Kolmogorov Complexity

[01:05:06] The Surprising Power of Generalization

[01:13:21] The Elegance of Bayesian Inference

[01:33:02] The Geometry of Learning

[01:46:28] Practical Advice and The Future of AI


Prof. Andrew Gordon Wilson:

https://x.com/andrewgwils

https://cims.nyu.edu/~andrewgw/

https://scholar.google.com/citations?user=twWX2LIAAAAJ&hl=en

https://www.youtube.com/watch?v=Aja0kZeWRy4

https://www.youtube.com/watch?v=HEp4TOrkwV4


TRANSCRIPT:

https://app.rescript.info/public/share/H4Io1Y7Rr54MM05FuZgAv4yphoukCfkqokyzSYJwCK8


Hosts:

Dr. Tim Scarfe / Dr. Keith Duggar (MIT Ph.D)


REFS:


Deep Learning is Not So Mysterious or Different [Andrew Gordon Wilson]

https://arxiv.org/abs/2503.02113


Bayesian Deep Learning and a Probabilistic Perspective of Generalization [Andrew Gordon Wilson, Pavel Izmailov]

https://arxiv.org/abs/2002.08791


Compute-Optimal LLMs Provably Generalize Better With Scale [Marc Finzi, Sanyam Kapoor, Diego Granziol, Anming Gu, Christopher De Sa, J. Zico Kolter, Andrew Gordon Wilson]

https://arxiv.org/abs/2504.15208

3 months ago
2 hours 3 minutes 48 seconds

Machine Learning Street Talk (MLST)
Karl Friston - Why Intelligence Can't Get Too Large (Goldilocks principle)

In this episode, hosts Tim and Keith finally realize their long-held dream of sitting down with their hero, the brilliant neuroscientist Professor Karl Friston. The conversation is a fascinating and mind-bending journey into Professor Friston's life's work, the Free Energy Principle, and what it reveals about life, intelligence, and consciousness itself.


**SPONSORS**

Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cli

---

Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!

---

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

***


They kick things off by looking back on the 20-year journey of the Free Energy Principle. Professor Friston explains it as a fundamental rule for survival: all living things, from a single cell to a human being, are constantly trying to make sense of the world and reduce unpredictability. It’s this drive to minimize surprise that allows things to exist and maintain their structure.
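For the mathematically inclined, the standard decomposition behind "minimizing surprise" (our gloss, not a formula quoted in the episode) is:

```latex
\begin{align}
  F[q] &= \mathbb{E}_{q(s)}\!\big[\ln q(s) - \ln p(o, s)\big] \\
       &= \underbrace{-\ln p(o)}_{\text{surprisal}}
          + \underbrace{D_{\mathrm{KL}}\!\big[q(s)\,\big\|\,p(s\mid o)\big]}_{\geq\, 0},
\end{align}
```

so free energy F upper-bounds the surprisal of observations, with equality when the internal beliefs q(s) match the true posterior. An agent that keeps F low thereby keeps its world unsurprising, using only quantities it can actually compute.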

This leads to a bigger question: What does it truly mean to be "intelligent"? The group debates whether intelligence is everywhere, even in a virus or a plant, or if it requires a certain level of complexity.


Professor Friston introduces the idea of different "kinds" of things, suggesting that creatures like us, who can model themselves and think about the future, possess a unique and "strange" kind of agency that sets us apart.


From intelligence, the discussion naturally flows to the even trickier concept of consciousness. Is it the same as intelligence? Professor Friston argues they are different. He explains that consciousness might emerge from deep, layered self-awareness—not just acting, but understanding that you are the one causing your actions and thinking about your place in the world.


They also explore intelligence at different sizes. Is a corporation intelligent? What about the entire planet? Professor Friston suggests there might be a "Goldilocks zone" for intelligence. It doesn't seem to exist at the super-tiny atomic level or at the massive scale of planets and solar systems, but thrives in the complex middle ground where we live.


Finally, they tackle one of the most pressing topics of our time: Can we build a truly conscious AI? Professor Friston shares his doubts about whether our current computers are capable of a feat like that. He suggests that genuine consciousness might require a different kind of "mortal" computation, where the machine's physical body and its "mind" are inseparable, much like in biological creatures.


TRANSCRIPT:

https://app.rescript.info/public/share/FZkF8BO7HMt9aFfu2_q69WGT_ZbYZ1VVkC6RtU3eeOI


TOC:

00:00:00: Introduction & Retrospective on the Free Energy Principle

00:09:34: Strange Particles, Agency, and Consciousness

00:37:45: The Scale of Intelligence: From Viruses to the Biosphere

01:01:35: Modelling, Boundaries, and Practical Application

01:21:12: Conclusion

4 months ago
1 hour 21 minutes 39 seconds

Machine Learning Street Talk (MLST)
The Day AI Solves My Puzzles Is The Day I Worry (Prof. Cristopher Moore)

We are joined by Cristopher Moore, a professor at the Santa Fe Institute with a diverse background in physics, computer science, and machine learning.

The conversation begins with Cristopher, who calls himself a "frog", explaining that he prefers to dive deep into specific, concrete problems rather than taking a high-level "bird's-eye view". They explore why current AI models, like transformers, are so surprisingly effective. Cristopher argues it's because the real world isn't random; it's full of rich structures, patterns, and hierarchies that these models can learn to exploit, even if we don't fully understand how.

**SPONSORS**

Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!

---

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy.

Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

***

Cristopher Moore:

https://sites.santafe.edu/~moore/

TOC:

00:00:00 - Introduction

00:02:05 - Meet Cristopher Moore: A Frog in the World of Science

00:05:14 - The Limits of Transformers and Real-World Data

00:11:19 - Intelligence as Creative Problem-Solving

00:23:30 - Grounding, Meaning, and Shared Reality

00:31:09 - The Nature of Creativity and Aesthetics

00:44:31 - Computational Irreducibility and Universality

00:53:06 - Turing Completeness, Recursion, and Intelligence

01:11:26 - The Universe Through a Computational Lens

01:26:45 - Algorithmic Justice and the Need for Transparency

TRANSCRIPT: https://app.rescript.info/public/share/VRe2uQSvKZOm0oIBoDsrNwt46OMCqRnShVnUF3qyoFk

Filmed at DISI (Diverse Intelligences Summer Institute)

https://disi.org/

REFS:

The Nature of Computation [Cristopher Moore]
https://nature-of-computation.org/

Birds and Frogs [Freeman Dyson]
https://www.ams.org/notices/200902/rtx090200212p.pdf

Replica Theory [Parisi et al]
https://arxiv.org/pdf/1409.2722

Janossy pooling [Fabian Fuchs]
https://fabianfuchsml.github.io/equilibriumaggregation/

Cracking the Cryptic [YT channel]
https://www.youtube.com/c/CrackingTheCryptic

Sudoku Bench [Sakana]
https://sakana.ai/sudoku-bench/

Fractured entangled representations, "phylogenetic locking in" comment [Kumar/Stanley]
https://arxiv.org/pdf/2505.11581 (see our shows on this)

The War Against Cliché [Martin Amis]
https://www.amazon.com/War-Against-Cliche-Reviews-1971-2000/dp/0375727167

Rule 110 (CA)
https://mathworld.wolfram.com/Rule150.html

Universality in Elementary Cellular Automata [Matthew Cook]
https://wpmedia.wolfram.com/sites/13/2018/02/15-1-1.pdf

Small Semi-Weakly Universal Turing Machines [Damien Woods]
https://tilde.ini.uzh.ch/users/tneary/public_html/WoodsNeary-FI09.pdf

COMPUTING MACHINERY AND INTELLIGENCE [Turing, 1950]
https://courses.cs.umbc.edu/471/papers/turing.pdf

Comment on Space Time as a causal set [Moore, 88]
https://sites.santafe.edu/~moore/comment.pdf

Recursion Theory on the Reals and Continuous-time Computation [Moore, 96]

4 months ago
1 hour 34 minutes 52 seconds

Machine Learning Street Talk (MLST)
Michael Timothy Bennett: Defining Intelligence and AGI Approaches

Dr. Michael Timothy Bennett is a computer scientist who's deeply interested in understanding artificial intelligence, consciousness, and what it means to be alive. He's known for his provocative paper "What the F*** is Artificial Intelligence" which challenges conventional thinking about AI and intelligence.

**SPONSOR MESSAGES**

***

Prolific: Quality data. From real people. For faster breakthroughs.

https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=mb

***

Michael takes us on a journey through some of the biggest questions in AI and consciousness. He starts by exploring what intelligence actually is - settling on the idea that it's about "adaptation with limited resources" (a definition from researcher Pei Wang that he particularly likes).

The discussion ranges from technical AI concepts to philosophical questions about consciousness, with Michael offering fresh perspectives that challenge Silicon Valley's "just scale it up" approach to AI. He argues that true intelligence isn't just about having more parameters or data - it's about being able to adapt efficiently, like biological systems do.

TOC:

1. Introduction & Paper Overview [00:01:34]

2. Definitions of Intelligence [00:02:54]

3. Formal Models (AIXI, Active Inference) [00:07:06]

4. Causality, Abstraction & Embodiment [00:10:45]

5. Computational Dualism & Mortal Computation [00:25:51]

6. Modern AI, AGI Progress & Benchmarks [00:31:30]

7. Hybrid AI Approaches [00:35:00]

8. Consciousness & The Hard Problem [00:39:35]

9. The Diverse Intelligences Summer Institute (DISI) [00:53:20]

10. Living Systems & Self-Organization [00:54:17]

11. Closing Thoughts [01:04:24]

Michael's socials:

https://michaeltimothybennett.com/

https://x.com/MiTiBennett

TRANSCRIPT:

https://app.rescript.info/public/share/4jSKbcM77Sf6Zn-Ms4hda7C4krRrMcQt0qwYqiqPTPI

REFS:

Bennett, M.T. "What the F*** is Artificial Intelligence"
https://arxiv.org/abs/2503.23923

Bennett, M.T. "Are Biological Systems More Intelligent Than Artificial Intelligence?"
https://arxiv.org/abs/2405.02325

Bennett, M.T. PhD Thesis "How To Build Conscious Machines"
https://osf.io/preprints/thesiscommons/wehmg_v1

Legg, S. & Hutter, M. (2007). "Universal Intelligence: A Definition of Machine Intelligence"

Wang, P. "Defining Artificial Intelligence" - on non-axiomatic reasoning systems (NARS)

Chollet, F. (2019). "On the Measure of Intelligence" - introduces the ARC benchmark and developer-aware generalization

Hutter, M. (2005). "Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability"

Chalmers, D. "The Hard Problem of Consciousness"

Descartes, R. - Cartesian dualism and the pineal gland theory (historical context)

Friston, K. - Free Energy Principle and Active Inference framework

Levin, M. - Work on collective intelligence, cancer as information isolation, and "mind blindness"

Hinton, G. (2022). "The Forward-Forward Algorithm" - introduces mortal computation concept

Alexander Ororbia & Friston - Formal treatment of mortal computation

Sutton, R. "The Bitter Lesson" - on search and learning in AI

Pearl, J. "The Book of Why" - causal inference and reasoning

Alternative AGI Approaches:
Wang, P. - NARS (Non-Axiomatic Reasoning System)
Goertzel, B. - Hyperon system and modular AGI architectures

Benchmarks & Evaluation:
Hendrycks, D. - Humanity's Last Exam benchmark (mentioned re: saturation)

Filmed at:
Diverse Intelligences Summer Institute (DISI)
https://disi.org/

4 months ago
1 hour 5 minutes 44 seconds

Machine Learning Street Talk (MLST)
Superintelligence Strategy (Dan Hendrycks)

Deep dive with Dan Hendrycks, a leading AI safety researcher and co-author of the "Superintelligence Strategy" paper with former Google CEO Eric Schmidt and Scale AI CEO Alexandr Wang.


*** SPONSOR MESSAGES

Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cli


Prolific: Quality data. From real people. For faster breakthroughs.

https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=script-gen

***


Hendrycks argues that society is making a fundamental mistake in how it views artificial intelligence. We often compare AI to transformative but ultimately manageable technologies like electricity or the internet. He contends a far better and more realistic analogy is nuclear technology. Like nuclear power, AI has the potential for immense good, but it is also a dual-use technology that carries the risk of unprecedented catastrophe.


The Problem with an AI "Manhattan Project":


A popular idea is for the U.S. to launch a "Manhattan Project" for AI—a secret, all-out government race to build a superintelligence before rivals like China. Hendrycks argues this strategy is deeply flawed and dangerous for several reasons:


- It wouldn’t be secret. You cannot hide a massive, heat-generating data center from satellite surveillance.


- It would be destabilizing. A public race would alarm rivals, causing them to start their own desperate, corner-cutting projects, dramatically increasing global risk.


- It’s vulnerable to sabotage. An AI project can be crippled in many ways, from cyberattacks that poison its training data to physical attacks on its power plants. This is what the paper refers to as a "maiming attack."


This vulnerability leads to the paper's central concept: Mutual Assured AI Malfunction (MAIM). This is the AI-era version of the nuclear-era's Mutual Assured Destruction (MAD). In this dynamic, any nation that makes an aggressive, destabilizing bid for a world-dominating AI must expect its rivals to sabotage the project to ensure their own survival.


This deterrence, Hendrycks argues, is already the default reality we live in.
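As a toy way to see why this deterrence logic self-stabilizes (our illustration with made-up payoffs, not numbers from the paper), consider a two-player game where each side can restrain or race, and racing invites sabotage:

```python
# Toy 2x2 deterrence game inspired by the MAIM discussion above.
# The payoff numbers are purely illustrative.
import itertools

STRATS = ["restrain", "race"]
# payoff[(a, b)] = (payoff to A, payoff to B) when A plays a and B plays b
payoff = {
    ("restrain", "restrain"): (1, 1),    # stable status quo
    ("restrain", "race"):     (0, -2),   # B's project gets maimed by A
    ("race",     "restrain"): (-2, 0),   # A's project gets maimed by B
    ("race",     "race"):     (-3, -3),  # mutual sabotage and escalation
}

for a, b in itertools.product(STRATS, repeat=2):
    ua, ub = payoff[(a, b)]
    # A strategy pair is a Nash equilibrium if neither side can gain
    # by unilaterally switching its own move.
    best_a = ua >= max(payoff[(x, b)][0] for x in STRATS)
    best_b = ub >= max(payoff[(a, y)][1] for y in STRATS)
    tag = "  <- Nash equilibrium" if best_a and best_b else ""
    print(f"A={a:8s} B={b:8s} payoffs={(ua, ub)}{tag}")
```

With any payoffs of this shape, mutual restraint is the unique Nash equilibrium: racing is self-defeating as long as sabotage is credible, which is the paper's core claim.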


A Better Strategy: The Three Pillars

Instead of a reckless race, the paper proposes a more stable, three-part strategy modeled on Cold War principles:


- Deterrence: Acknowledge the reality of MAIM. The goal should not be to "win" the race to superintelligence, but to deter anyone from starting such a race in the first place through the credible threat of sabotage.


- Nonproliferation: Just as we work to keep fissile materials for nuclear bombs out of the hands of terrorists and rogue states, we must control the key inputs for catastrophic AI. The most critical input is advanced AI chips (GPUs). Hendrycks makes the powerful claim that building cutting-edge GPUs is now more difficult than enriching uranium, making this strategy viable.


- Competitiveness: The race between nations like the U.S. and China should not be about who builds superintelligence first. Instead, it should be about who can best use existing AI to build a stronger economy, a more effective military, and more resilient supply chains (for example, by manufacturing more chips domestically).


Dan says the stakes are high if we fail to manage this transition:


- Erosion of Control

- Intelligence Recursion

- Worthless Labor


Hendrycks maintains that while the risks are existential, the future is not set.


TOC:

1 Measuring the Beast [00:00:00]

2 Defining the Beast [00:11:34]

3 The Core Strategy [00:38:20]

4 Ideological Battlegrounds [00:53:12]

5 Mechanisms of Control [01:34:45]


TRANSCRIPT:

https://app.rescript.info/public/share/cOKcz4pWRPjh7BTIgybd7PUr_vChUaY6VQW64No8XMs


<truncated, see refs and larger description on YT version>


4 months ago
1 hour 45 minutes 38 seconds
