Home
Categories
EXPLORE
True Crime
Comedy
Society & Culture
Business
Sports
TV & Film
Technology
About Us
Contact Us
Copyright
© 2024 PodJoint
00:00 / 00:00
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts211/v4/b4/61/ed/b461ed35-bc8f-43ba-d3b7-a1ac63240c5b/mza_16911523502878814154.jpg/600x600bb.jpg
De Nederlandse Kubernetes Podcast
Ronald Kers en Jan Stomphorst
118 episodes
3 days ago
In this episode, we talk with Abdel Sghiouar and Mofi Rahman, Developer Advocates at Google and (guest) hosts of the Kubernetes Podcast from Google. Together, we dive into one central question: can you truly run LLMs reliably and at scale on Kubernetes? It quickly becomes clear that LLM workloads behave nothing like traditional web applications: GPUs are scarce, expensive, and difficult to schedule.Models are massive — some reaching 700GB — making load times, storage throughput, and caching ...
Show more...
Technology
Education,
Business,
How To
RSS
All content for De Nederlandse Kubernetes Podcast is the property of Ronald Kers en Jan Stomphorst and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
In this episode, we talk with Abdel Sghiouar and Mofi Rahman, Developer Advocates at Google and (guest) hosts of the Kubernetes Podcast from Google. Together, we dive into one central question: can you truly run LLMs reliably and at scale on Kubernetes? It quickly becomes clear that LLM workloads behave nothing like traditional web applications: GPUs are scarce, expensive, and difficult to schedule.Models are massive — some reaching 700GB — making load times, storage throughput, and caching ...
Show more...
Technology
Education,
Business,
How To
https://storage.buzzsprout.com/dritfh0zcbmrsxrqgttgdx53kpy0?.jpg
#116 Running AI on Kubernetes: From GPUs to CRO
De Nederlandse Kubernetes Podcast
42 minutes
3 weeks ago
#116 Running AI on Kubernetes: From GPUs to CRO
In this episode of De Nederlandse Kubernetes Podcast, we talk with Carlos Santana, Principal Partner Solution Architect at AWS and long-time contributor to the Kubernetes and AI communities. Carlos joins us to explore what it really takes to run AI workloads on Kubernetes, from GPU scheduling to scaling inference and training efficiently across clusters. We discuss how AI and machine learning are transforming the cloud-native ecosystem — and why orchestration is becoming just as important as ...
De Nederlandse Kubernetes Podcast
In this episode, we talk with Abdel Sghiouar and Mofi Rahman, Developer Advocates at Google and (guest) hosts of the Kubernetes Podcast from Google. Together, we dive into one central question: can you truly run LLMs reliably and at scale on Kubernetes? It quickly becomes clear that LLM workloads behave nothing like traditional web applications: GPUs are scarce, expensive, and difficult to schedule.Models are massive — some reaching 700GB — making load times, storage throughput, and caching ...