Best AI papers explained
Enoch H. Kang
602 episodes
11 hours ago
Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
Technology
Learning to reason in LLMs by expectation maximization
13 minutes 53 seconds
1 week ago

This research formalizes the process of reasoning in large language models as a latent variable model, utilizing the expectation-maximization (EM) algorithm to improve performance. The authors demonstrate that training a model to generate intermediate rationales before answering is mathematically equivalent to reward-weighted fine-tuning using binary correctness as a signal. A central focus of the study is the sampling distribution used to create these rationales, comparing methods like rejection sampling and the self-taught reasoner (STaR). The paper introduces prompt posterior sampling (PPS), a technique that conditions the model on the correct answer during training to generate more effective reasoning traces. Experiments across multiple benchmarks show that PPS consistently outperforms existing methods by producing more concise and accurate rationales. Ultimately, the work highlights that high-quality rationale generation is just as critical to model improvement as the underlying optimization algorithms.
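The loop described above can be summarized in a few lines. The sketch below is a toy illustration under assumptions, not the paper's code: generate_rationale, its hint parameter, and the arithmetic task are hypothetical stand-ins for an LLM sampler and a real benchmark. It shows the E-step of the EM view: sample rationales, score them with binary correctness, and keep the weighted traces as a reward-weighted fine-tuning set. Passing the gold answer as a hint mimics prompt posterior sampling, which conditions generation on the correct answer.

import random

def generate_rationale(question, hint=None):
    # Placeholder sampler standing in for an LLM call. Conditioning on the
    # gold answer (hint) makes a correct trace much more likely, which is
    # the intuition behind PPS compared with plain rejection/STaR sampling.
    a, b = question
    if hint is not None or random.random() < 0.3:
        return f"{a} + {b} = {a + b}", a + b
    wrong = a + b + random.choice([-1, 1])
    return f"{a} + {b} = {wrong}", wrong

def build_finetune_set(dataset, use_pps=False, samples_per_q=4):
    # E-step: sample rationales and weight each trace by binary correctness.
    weighted = []
    for question, gold in dataset:
        for _ in range(samples_per_q):
            rationale, answer = generate_rationale(
                question, hint=gold if use_pps else None)
            w = 1.0 if answer == gold else 0.0   # binary correctness reward
            if w > 0:                            # reward-weighted: keep only w = 1
                weighted.append((question, rationale, w))
    return weighted

# M-step (not shown): fine-tune the model on the kept traces, i.e. maximize
# the weighted log-likelihood of rationale + answer, then resample and repeat.
dataset = [((a, b), a + b) for a in range(1, 5) for b in range(1, 5)]
print(len(build_finetune_set(dataset, use_pps=True)), "traces kept with PPS")
print(len(build_finetune_set(dataset, use_pps=False)), "traces kept without")

Running the toy loop shows why the sampling distribution matters: with the answer-conditioned sampler nearly every trace survives the correctness filter, while unconditioned sampling discards most of its budget, which is the gap the episode attributes to PPS over rejection sampling and STaR.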
