AI: post transformers
mcgrof
316 episodes
1 day ago
The transformer architecture revolutionized the world of Neural Networks. It was a springboard for what we know today as modern artificial intelligence. This podcast focuses on modern state of the art research paper reviews starting from the transformer and on.
Technology
AMD: Instella: Fully Open Language Models with Stellar Performance
AI: post transformers
10 minutes 15 seconds
4 days ago

The November 13, 2025 paper introduces **Instella**, a family of **fully open-source** three-billion-parameter large language models (LLMs) developed by AMD and trained on its Instinct MI300X GPUs. The central focus is advancing transparency and reproducibility in LLMs by releasing not only the model weights but also the **complete training pipeline, datasets, and optimization details**. Instella achieves **state-of-the-art performance** among fully open models of its size and remains competitive with leading open-weight counterparts despite using fewer pre-training tokens. The family includes specialized variants: **Instella-Long**, which supports a 128K-token context length, and **Instella-Math**, a reasoning-centric model enhanced through specialized supervised fine-tuning and reinforcement learning. The paper details the two-stage pre-training, the post-training, and the specific methods used to create the Long and Math variants, demonstrating that **openness does not compromise performance**.
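
One practical consequence of the paper's openness claim is that the released checkpoints should be loadable with standard tooling. Below is a minimal sketch using the Hugging Face `transformers` library; the model id `amd/Instella-3B` and the `trust_remote_code` flag are illustrative assumptions, not details taken from this summary.

```python
# Minimal sketch: loading and sampling from an Instella checkpoint,
# assuming the weights are hosted on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/Instella-3B"  # hypothetical model id, for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Why does releasing the full training pipeline aid reproducibility?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern would presumably apply to the Instella-Long and Instella-Math variants, assuming each is published under its own model id.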


Source: https://arxiv.org/pdf/2511.10628
