In this episode, we talk with Abdel Sghiouar and Mofi Rahman, Developer Advocates at Google and (guest) hosts of the Kubernetes Podcast from Google. Together, we dive into one central question: can you truly run LLMs reliably and at scale on Kubernetes? It quickly becomes clear that LLM workloads behave nothing like traditional web applications: GPUs are scarce, expensive, and difficult to schedule. Models are massive, some reaching 700GB, making load times, storage throughput, and caching ...
#103 Van Datacenter tot Dashboard: Open Source in Beweging
De Nederlandse Kubernetes Podcast
35 minutes
4 months ago
The third episode in the special series from our 100th-episode livestream. In this episode, Ronald Kers and Jan Stomphorst talk with Marcel Timmer, Country Manager Netherlands at Red Hat, live from our 8-hour anniversary broadcast. We dive into the growing importance of open source, not as an ideology but as a strategic choice. Marcel explains why governments and large organizations increasingly choose open source in order to become more independent, sustainable, and innovative. ...