Module 2: The Encoder (BERT) vs. The Decoder (GPT)
The AI Concepts Podcast
8 minutes
Shay breaks down the encoder vs decoder split in transformers: encoders (BERT) read the full text with bidirectional attention to understand meaning, while decoders (GPT) generate text one token at a time using causal attention.
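The bidirectional-vs-causal distinction comes down to the attention mask: which earlier or later tokens each position is allowed to look at. A minimal illustrative sketch (the function name and list-of-lists representation are my own, not from the episode):

```python
def attention_mask(seq_len, causal):
    """Toy attention mask: mask[i][j] = 1 if token i may attend to token j.

    Encoder-style (causal=False): every token sees the whole sequence.
    Decoder-style (causal=True): token i sees only positions j <= i,
    so generation can proceed left to right without peeking ahead.
    """
    return [[1 if (not causal or j <= i) else 0 for j in range(seq_len)]
            for i in range(seq_len)]

# Encoder (BERT-like): full matrix of ones, bidirectional context.
print(attention_mask(3, causal=False))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# Decoder (GPT-like): lower-triangular, each token sees only its past.
print(attention_mask(3, causal=True))   # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
```

The same mask shape explains the training objectives: a bidirectional mask suits filling in a masked word from both sides, while the triangular mask is exactly what next-token prediction requires.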
She ties each architecture to its training objective (masked-word prediction for BERT, next-token prediction for GPT), explains why decoder-only models dominate today (they can both interpret prompts and generate efficiently thanks to KV caching), and previews the next episode on the MLP layer, where most of a model's learned knowledge lives.
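The KV-caching efficiency point can be made concrete with a toy counter: without a cache, each decode step reprojects keys and values for the entire prefix; with a cache, only the new token is projected. This sketch is illustrative only (the class and counters are hypothetical, not from the episode):

```python
class KVCache:
    """Toy KV cache: keys/values for past tokens are computed once and reused."""

    def __init__(self):
        self.keys, self.values = [], []
        self.kv_computations = 0  # per-token K/V projections actually run

    def step(self, token):
        # With caching, each decode step projects only the NEW token;
        # the cached entries for earlier tokens are reused as-is.
        self.kv_computations += 1
        self.keys.append(("K", token))
        self.values.append(("V", token))
        # Attention still sees the full context via the cache.
        return list(zip(self.keys, self.values))

cache = KVCache()
for tok in ["The", "cat", "sat"]:
    context = cache.step(tok)

# 3 projections with the cache; recomputing from scratch each step
# would cost 1 + 2 + 3 = 6, and the gap grows quadratically with length.
print(cache.kv_computations)  # 3
print(len(context))           # 3
```

This is why decoder-only models generate cheaply at inference time: per-step cost stays proportional to one token, not the whole sequence.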