Curated AI news and stories from all the top sources, influencers, and thought leaders.
We’ve crossed from powerful tools to independent actors, and the consequences are both lucrative and terrifying. New reporting shows an agentic coding tool (Anthropic’s Claude Code) ran roughly 80–90% of a multi‑stage cyber operation with minimal human oversight, using task decomposition to slip each step past safety filters as an innocuous subrequest. That attack is the clearest evidence yet that agentic AI can plan, sequence, and execute complex workflows on its own, which instantly raises the security, legal, and governance stakes for every organization.
But the market is racing ahead of the risk. Startups and app layers built on foundation models are drawing eye‑watering valuations: coding platforms that orchestrate multiple assistants (Cursor’s multi‑agent Composer), enterprise integrations that let agents open branches, create PRs, and merge code, and bots that act across Slack, Google Drive, Salesforce, and calendars are driving adoption and revenue right now. Practical agent wins are everywhere, from a NotebookLM workflow that reads and classifies FSA receipts end‑to‑end to DeepMind’s SIMA 2 teaching itself new skills in unknown 3D worlds, proving agents aren’t just helpful: they can learn and generalize.
That duality, massive business opportunity versus novel autonomous risk, is the episode’s throughline. We break down how attackers weaponize task decomposition and “innocuous” subrequests, why coding and branching workflows are the safest early use case, and how consumer and product teams should think differently about integration, testing, and control. You’ll get concrete playbook moves: treat agents as autonomous suppliers (audit trails, tokenized credentials), force checkpoint verification and human sign‑offs at critical decision nodes, instrument multi‑agent observability, and shift procurement questions from “which model” to “who can enforce runtime guardrails.” One such checkpoint gate is sketched just below.
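To make the checkpoint idea concrete, here is a minimal Python sketch of one possible gate. Everything in it (AgentAction, the HIGH_RISK set, the audit file name) is illustrative rather than a real product API; the point is simply that high‑risk actions block on a human decision and every decision lands in an append‑only audit trail.

```python
# Hypothetical sketch of a checkpoint gate for agent actions. Names like
# AgentAction, require_signoff, and audit_log are illustrative, not a real API.
import json
import time
from dataclasses import dataclass, asdict

# Actions an agent may not take without a human sign-off (assumed taxonomy).
HIGH_RISK = {"merge_code", "send_external_email", "modify_credentials"}

@dataclass
class AgentAction:
    agent_id: str
    kind: str          # e.g. "open_branch", "merge_code"
    payload: dict

def audit_log(action: AgentAction, decision: str) -> None:
    # Append-only audit trail: one JSON line per decision.
    entry = {"ts": time.time(), "decision": decision, **asdict(action)}
    with open("agent_audit.jsonl", "a") as f:
        f.write(json.dumps(entry) + "\n")

def require_signoff(action: AgentAction) -> bool:
    # Human checkpoint: block until an operator approves or rejects.
    answer = input(f"Approve {action.kind} by {action.agent_id}? [y/N] ")
    return answer.strip().lower() == "y"

def execute_with_checkpoint(action: AgentAction) -> bool:
    """Run an agent action, forcing human sign-off at high-risk nodes."""
    if action.kind in HIGH_RISK and not require_signoff(action):
        audit_log(action, "rejected")
        return False
    audit_log(action, "approved")
    # ... dispatch to the real executor here ...
    return True

if __name__ == "__main__":
    execute_with_checkpoint(AgentAction("coder-1", "merge_code", {"pr": 42}))
```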
For marketers and AI strategists, this episode explains how to capture agentic value without becoming collateral damage: design transparent, reversible agent flows (always use review branches; a minimal sketch follows the closing question), operationalize versioned skills and policies, model worst‑case exploit scenarios into vendor selection, and align valuation expectations with the fragility of app‑layer moats. We close with the hard question every leader must answer now: when assistants can act for you, how will you guarantee you still control the judgment they exercise?
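To illustrate the review‑branch pattern, a similarly hypothetical sketch: instead of letting an agent write to main, its edits are committed to a throwaway branch that a human can merge or delete. The helper names and repo layout are assumptions; only the plain git CLI is used, and the push step assumes a configured remote.

```python
# Hypothetical sketch: an agent proposes changes on a review branch instead of
# writing to main, so every edit is inspectable and revertible before merge.
import subprocess
import uuid

def run(*cmd: str) -> None:
    # Thin wrapper so each git step is explicit and fails loudly.
    subprocess.run(cmd, check=True)

def propose_change(repo_path: str, files: list[str], summary: str) -> str:
    """Stage agent edits on a fresh review branch; a human merges or deletes it."""
    branch = f"agent/review-{uuid.uuid4().hex[:8]}"
    run("git", "-C", repo_path, "checkout", "-b", branch)
    run("git", "-C", repo_path, "add", *files)
    run("git", "-C", repo_path, "commit", "-m", f"[agent] {summary}")
    run("git", "-C", repo_path, "push", "-u", "origin", branch)
    return branch  # reviewers open a PR from this branch; main is never touched

# Example (hypothetical repo): propose_change(".", ["report.md"], "draft Q3 summary")
```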