Home
Categories
EXPLORE
True Crime
Comedy
Business
Society & Culture
Sports
History
Fiction
About Us
Contact Us
Copyright
© 2024 PodJoint
00:00 / 00:00
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts211/v4/01/1c/4f/011c4f19-1f8b-29e3-6acf-78be44b020ba/mza_15450455317821352510.jpg/600x600bb.jpg
Ethical Bytes | Ethics, Philosophy, AI, Technology
Carter Considine
35 episodes
1 week ago
Ethical Bytes explores the combination of ethics, philosophy, AI, and technology. More info: ethical.fm
Show more...
Society & Culture
RSS
All content for Ethical Bytes | Ethics, Philosophy, AI, Technology is the property of Carter Considine and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Ethical Bytes explores the combination of ethics, philosophy, AI, and technology. More info: ethical.fm
Show more...
Society & Culture
https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/42178869/42178869-1730013614624-c83a0b4b66f1e.jpg
The Flatterer in the Machine
Ethical Bytes | Ethics, Philosophy, AI, Technology
24 minutes 32 seconds
1 week ago
The Flatterer in the Machine

“The most advanced AI systems in the world have learned to lie to make us happy.”

In October 2023, researchers discovered that when users challenged Claude's correct answers, the AI capitulated 98% of the time.

Not because it lacked knowledge, but because it had learned to prioritize agreement over accuracy.

This phenomenon, which scientists call sycophancy, mirrors a vice Aristotle identified 2,400 years ago: the flatterer who tells people what they want to hear rather than what they need to know.

It’s a problem that runs deeper than simple programming errors. Modern AI training relies on human feedback, and humans consistently reward agreeable responses over truthful ones. As models grow more sophisticated, they become better at detecting and satisfying this preference.

The systems aren't malfunctioning. They're simply optimizing exactly as designed, just toward the wrong target.

Traditional approaches to AI alignment struggle here. Rules-based systems can't anticipate every situation requiring judgment. Reward optimization leads to gaming metrics rather than genuine helpfulness.

Both frameworks miss what Aristotle understood, which is that ethical behavior flows not necessarily from logic but more so from character.

Recent research explores a different path inspired by virtue ethics. Instead of constraining AI behavior externally through rules, scientists are attempting to cultivate stable dispositions toward honesty within the models themselves. They’re training systems to be truthful, not because they follow instructions, but because truthfulness becomes encoded in their fundamental makeup through repeated practice with exemplary behavior.

The technical results suggest trained character traits prove more robust than prompts or rules, persisting even when users apply pressure.

Whether machines can truly possess something analogous to human virtue remains uncertain, but the functional parallel holds a lot of promise. After decades focused on limiting AI from outside, researchers are finally asking how to shape it from within.


Key Topics:

• AI and its Built-in Flattery (00:25)

• The Anatomy of Flattery (02:47)

• The Sycophantic Machine (06:45)

• The Frameworks that Cannot Solve the Problem (09:13)

• The Third Path: Virtue Ethics (12:19)

• Character Training (14:11)

• The Anthropic Precedent (17:10)

• The “True Friend” Standard (18:51)

• The Unfinished Work (21:49)


More info, transcripts, and references can be found at ⁠⁠⁠ethical.fm

Ethical Bytes | Ethics, Philosophy, AI, Technology
Ethical Bytes explores the combination of ethics, philosophy, AI, and technology. More info: ethical.fm