Neural intel Pod
Neuralintel.org
307 episodes
1 day ago
🧠 Neural Intel: Breaking AI News with Technical Depth Neural Intel Pod cuts through the hype to deliver fast, technical breakdowns of the biggest developments in AI. From major model releases like GPT‑5 and Claude Sonnet to leaked research and early signals, we combine breaking coverage with deep technical context, all narrated by AI for clarity and speed. Join researchers, engineers, and builders who stay ahead without the noise. 🔗 Join the community: Neuralintel.org | 📩 Advertise with us: director@neuralintel.org
Tech News
News
All content for Neural intel Pod is the property of Neuralintel.org and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Olmo 3: Unpacking the Fully Open LLM Flow (Dolma 3, OlmoRL, & State-of-the-Art Reasoning)
Neural intel Pod
13 minutes 14 seconds
3 weeks ago

Join us for a deep technical discussion on Olmo 3, the latest family of state-of-the-art, fully open language models developed by the Olmo Team at the Allen Institute for AI (Ai2). Targeting an audience of ML insiders, this episode dissects the entire model flow: a commitment to releasing the full lifecycle, including every stage, checkpoint, datapoint, and dependency used to build the models. This transparency enables deep customization and advancement in open-source AI research.

Olmo 3 offers models at both the 7B and 32B parameter scales. We focus on how these models were engineered to excel across a diverse set of capabilities, including long-context reasoning, function calling, coding, instruction following, general chat, and knowledge recall.

Key technical highlights covered include:

• The Model Lineup: We explore the Olmo 3 family, including Olmo 3 Base (Olmo-3-1025-7B, Olmo-3-1125-32B), the specialized Olmo 3 Think (trained for step-by-step reasoning and generating thinking traces), and Olmo 3 Instruct (optimized for general chat and inference efficiency). Notably, the flagship Olmo 3 Think-32B is the strongest fully open thinking model released to date.

• The Data Pipeline (Dolma & Dolci): We detail the sophisticated data mixing methodologies, including Dolma 3 Mix (5.9T tokens for pretraining), refined by Dolma 3 Dolmino Mix during the 100B token mid-training stage to boost capabilities in code and math. Post-training utilizes the new Dolci suite, providing tailored data for Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning (RL).
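A fixed-ratio data mix like the one above comes down to assigning each source a share of a total token budget. The sketch below illustrates the bookkeeping only; the source names and weights are invented for illustration and are not Olmo 3's actual mixture.

```python
def tokens_per_source(total_tokens: int, weights: dict[str, float]) -> dict[str, int]:
    """Turn mixing proportions into per-source token budgets.

    In the spirit of a pretraining mix with a fixed token budget
    (e.g. Dolma 3 Mix's 5.9T tokens), each source receives a share
    proportional to its weight. Weights need not sum to 1; they are
    normalized here.
    """
    norm = sum(weights.values())
    return {src: int(total_tokens * w / norm) for src, w in weights.items()}


# Hypothetical mix: sources and proportions are made up for illustration.
mix = tokens_per_source(
    total_tokens=5_900_000_000_000,
    weights={"web": 0.70, "code": 0.15, "math": 0.05, "papers": 0.10},
)
```

Mid-training stages like Dolma 3 Dolmino Mix can then be expressed as a second, smaller budget (100B tokens) with weights shifted toward code and math.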

• Long-Context Engineering: Learn how Olmo 3 achieves 64K context through a newly added extension stage. This process incorporates high-quality data like olmOCR Science PDFs and utilizes techniques like YaRN positional embedding extension and specialized document packing.
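YaRN extends RoPE-based models to longer contexts by scaling rotary frequencies non-uniformly: high-frequency dimensions are left alone, low-frequency dimensions are interpolated, and the band in between is blended. A minimal sketch of that "NTK-by-parts" scaling, with illustrative hyperparameters that are not Olmo 3's actual settings:

```python
import math

def yarn_scaled_inv_freq(dim, base=10000.0, factor=16.0,
                         orig_ctx=4096, beta_fast=32, beta_slow=1):
    """Sketch of YaRN-style scaling of RoPE inverse frequencies.

    Dimensions that complete many rotations within the original context
    (high frequency) are kept as-is; dimensions completing fewer than
    one rotation are fully interpolated by `factor`; the band in
    between is blended linearly.
    """
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    out = []
    for f in inv_freq:
        wavelength = 2 * math.pi / f
        rotations = orig_ctx / wavelength  # full turns within the original context
        if rotations > beta_fast:          # high frequency: untouched
            ramp = 0.0
        elif rotations < beta_slow:        # low frequency: fully interpolated
            ramp = 1.0
        else:                              # linear blend in between
            ramp = (beta_fast - rotations) / (beta_fast - beta_slow)
        out.append(f * (1 - ramp) + (f / factor) * ramp)
    return out
```

Because only the slowly rotating dimensions are stretched, local positional resolution is preserved while the model's usable window grows by roughly `factor`.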

• Advanced Post-Training: We break down the three-stage process (SFT, DPO, RLVR) used for the Think and Instruct models. Discover the Delta Learning approach used in DPO to achieve capability gains by maximizing the contrast between chosen and rejected responses.
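The contrast between chosen and rejected responses that Delta Learning maximizes is already visible in the standard DPO objective, which penalizes the model when the chosen response's log-ratio over the reference does not exceed the rejected one's. A minimal per-pair sketch (plain DPO, not Olmo 3's exact recipe):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log sigmoid(beta * [(log pi(y_w) - log ref(y_w))
                                - (log pi(y_l) - log ref(y_l))])
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), computed stably for either sign of margin
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the margin between chosen and rejected grows, which is exactly the contrast a delta-style preference-pair construction tries to widen.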

• OlmoRL and RL-Zero: We examine OlmoRL, the improved RL training approach that generalizes verifiable reasoning across multiple domains (math, code, instruction following, general chat) and features crucial infrastructure advances (like asynchronous training and inflight updates). Plus, we cover the fully open Olmo 3 RL-Zero setup designed for rigorous RL algorithm benchmarking from a base model.

Olmo 3 Base models outperform other fully open alternatives like Stanford Marin and Apertus, while the post-trained models are highly competitive with leading open-weight systems, often achieving strong results while training on roughly six times fewer tokens than competitors like Qwen 3 32B.
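The "verifiable" in RL with verifiable rewards (RLVR) means the reward comes from a programmatic check, such as an exact answer match or a passing unit test, rather than a learned reward model. A toy checker illustrating the idea; the `\boxed{...}` answer format is an assumed convention for this sketch, not necessarily Olmo 3's:

```python
import re

def verifiable_math_reward(completion: str, gold_answer: str) -> float:
    """Sketch of an RLVR-style binary reward for a math completion.

    Extracts the final answer from a \\boxed{...} span and grants
    credit only on an exact string match with the gold answer. Real
    verifiers also normalize equivalent forms (e.g. 1/2 vs 0.5).
    """
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == gold_answer.strip() else 0.0
```

Code domains swap the string match for running unit tests, and instruction following swaps it for constraint checks, which is how one reward interface generalizes across the domains listed above.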

Keywords: LLM, Open Source AI, Olmo 3, Ai2, Model Flow, Technical Report, Machine Learning, Deep Learning, Transformer, Long Context, Reasoning, RLHF, DPO, RLVR, OlmoRL, Dolma, Dolci, 7B, 32B, Fine-Tuning, Deduplication, Compute-Efficiency, YaRN, Base Model, Thinking Model.
