Module 2: The MLP Layer - Where Transformers Store Knowledge
The AI Concepts Podcast
7 minutes
1 week ago
Shay explains where a transformer actually stores knowledge: not in attention, but in the MLP (feed-forward) layer. The episode frames the transformer block as a two-step loop: attention moves information between tokens, then the MLP transforms each token’s representation independently to inject learned knowledge.
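To make the two-step loop concrete, here is a minimal PyTorch sketch of one transformer block (not code from the episode; the pre-norm layout, layer sizes, and names are illustrative assumptions):

```python
# Minimal sketch of one transformer block, assuming a standard
# pre-norm design; dimensions are illustrative, not from the episode.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # The MLP (feed-forward) layer: two linear maps with a nonlinearity,
        # applied to each token's vector on its own.
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Step 1: attention moves information between tokens.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        # Step 2: the MLP transforms each token independently,
        # with no cross-token mixing, injecting learned knowledge.
        x = x + self.mlp(self.norm2(x))
        return x
```

Note that the MLP sees only one token's vector at a time, which is why per-token knowledge lookups are attributed to it rather than to attention.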