Home
Categories
EXPLORE
True Crime
Comedy
Society & Culture
Business
Sports
TV & Film
Technology
About Us
Contact Us
Copyright
© 2024 PodJoint
00:00 / 00:00
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts211/v4/92/f0/ad/92f0adf4-2b10-a63c-bc79-1889b710b139/mza_6601485165628379978.jpg/600x600bb.jpg
AI: post transformers
mcgrof
316 episodes
1 day ago
The transformer architecture revolutionized the world of Neural Networks. It was a springboard for what we know today as modern artificial intelligence. This podcast focuses on modern state of the art research paper reviews starting from the transformer and on.
Show more...
Technology
RSS
All content for AI: post transformers is the property of mcgrof and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
The transformer architecture revolutionized the world of Neural Networks. It was a springboard for what we know today as modern artificial intelligence. This podcast focuses on modern state of the art research paper reviews starting from the transformer and on.
Show more...
Technology
https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/44199026/44199026-1754490757264-4f84f1d34e94a.jpg
A Framework for LLM Application Safety Evaluation
AI: post transformers
15 minutes 52 seconds
1 week ago
A Framework for LLM Application Safety Evaluation

The July 13, 2025 paper " Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications" introduces a practical **framework for evaluating safety risks** in real-world Large Language Model (LLM) applications, arguing that current methods focusing only on foundation models are inadequate. This framework consists of two main parts: **principles for developing customized safety risk taxonomies** and **practices for evaluating these risks** within the application itself, which often includes components like system prompts and guardrails. It emphasizes the need for organizations to **contextualize general risks** and create taxonomies that are practical and specific to their operational context, as demonstrated by a case study from a government agency. The document then outlines a **safety testing pipeline** that involves curating meaningful and diverse adversarial prompts, running automated black-box tests, and evaluating model responses, particularly focusing on the use of refusals as a measure of safety.


Source:

July 13, 2025

Measuring What Matters: A Framework for Evaluating Safety Risks

in Real-World LLM Applications

https://arxiv.org/pdf/2507.09820

AI: post transformers
The transformer architecture revolutionized the world of Neural Networks. It was a springboard for what we know today as modern artificial intelligence. This podcast focuses on modern state of the art research paper reviews starting from the transformer and on.