
Suppose an AI “went rogue”.
Couldn’t we just switch it off?
How would it keep itself running without human help or an army of robots?
And why would AI necessarily be evil, rather than kind?
I put these questions to Zershaaneh Qureshi, a researcher at 80,000 Hours.
Her article “Risks from power-seeking AI systems” is *the* best introduction to the debate over whether AI may one day pose an extinction-level risk.
What struck me most was this fantastic analogy:
“You could be a scholar of the antebellum South… You'll know everything about why slave owners believed that they were justified in owning slaves. But that definitely doesn't mean that you're going to think yourself that slavery is justifiable.”
This drives home the point that even if we manage to build AIs that understand human values, that doesn't mean they will adopt those values as their own.
Timestamps:
06:49 - Is Talk of AI Extinction Just Hype From AI Companies?
18:08 - Will AI Always Be Just a Tool?
26:26 - Can We Just Switch It Off If It “Goes Rogue”?
33:52 - The Challenge of Instilling the Right Goals
46:38 - Specification Gaming and Goal Misgeneralization
53:48 - Instrumental Goals: Self-Preservation and Power-Seeking
1:01:57 - Situational Awareness: Do AIs Need to Be Conscious?
1:08:53 - Why Would We Deploy Something This Dangerous?
1:11:48 - The Deception Problem: AIs Could Hide Their True Intentions
1:20:53 - Could AI Actually Take Over the Physical World?
1:36:26 - Have We Argued Ourselves to an Absurd Conclusion?