Medical Attention
18 episodes
1 month ago
Medicine, Technology, Health & Fitness
All content for Medical Attention is the property of Medical Attention and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Ep.10 Are benchmarks broken?
56 minutes 53 seconds
6 months ago
In this episode, we’re lucky to be joined by Alexandre Sallinen and Tony O’Halloran from the Laboratory for Intelligent Global Health & Humanitarian Response Technologies to discuss how large language models are assessed, including their Massive Open Online Validation & Evaluation (MOOVE) initiative.

0:25 - Technical wrap: what are agents?
13:20 - What are benchmarks?
18:20 - Automated evaluation
20:10 - Benchmarks
37:45 - Human feedback
44:50 - LLM as judge

Read more about the projects we discuss here:
Meditron
Learn about MOOVE, or contact our team if you'd like to be involved
Listen to the LiGHTCAST, including their recent excellent outline of the HealthBench paper

More details in the show notes on our website.
Episodes | Bluesky | info@medicalattention.ai