All content for The Confusion Matrix is the property of Digressive Podcasts and is served directly from their servers
with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Welcome to The Confusion Matrix, where we have lively and candid discussions about data, data science, and AI in day-to-day life, business, and beyond.
Evals and Aliens – How model testing is not a binary affair
The Confusion Matrix
1 hour 5 minutes 14 seconds
2 days ago
Pete and Alex examine AI model evaluation methodologies, comparing traditional machine learning metrics with the qualitative assessment challenges of large language models. They discuss the collaborative requirements between technical and business teams to establish evaluation criteria for generative AI systems, highlighting the subjective nature of testing conversational outputs versus binary classification tasks. With the help […]