Google DeepMind Sounds Alarm on AI Agent Risks
Google DeepMind and its partners have set aside $10 million to study the potential dangers of millions of different AI agents interacting with each other online. The concern is that as businesses deploy more and more AI tools, we could hit a tipping point where risks that are hypothetical today become real. Rohin Shah, who directs AGI safety and alignment research at Google DeepMind, says this mass-market arrival creates a whole new class of risk.
Shah draws an analogy with human institutions, which can accomplish things no individual human can; large groups of agents could likewise do things that no single agent can. He thinks we have a few more months to go before agents are deployed throughout the economy in numbers that make potential risks a real concern. Shah wants to get ahead of this moment and is teaming up with other organizations to address these concerns.
Researchers will use the funding to study the behavior of multi-agent systems and come up with ways to prevent unsafe scenarios. Joining Google DeepMind are Schmidt Sciences, ARIA, the Cooperative AI Foundation, and Google’s charitable arm, Google.org. The aim is to kickstart research outside of tech companies.
Shah believes that academia can look far into the future and do work that isn’t top of mind at industry labs. He says there isn’t yet a field of research for multi-agent safety, and the group would like there to be one. Shah thinks the potential risks need to be addressed before agents are deployed in large numbers.
The possibilities that Shah has in mind mostly boil down to supercharged versions of bad things that happen on the internet already: scams, prompt injections (where an AI agent is fed malicious instructions), and other forms of cyberattack. The approach, Shah says, is to look at what humans do now and ask what the agent version of that would be.
Researchers want to run realistic simulations by dropping AI agents into sandboxes and studying their behavior. The reasoning is that what happens at scale cannot be predicted by studying single agents or even small groups in isolation. The complexity comes from having huge numbers of interactions at once.
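To give a rough sense of why scale matters, here is a toy sketch in Python. It is not the researchers' method and does not use real AI agents: the function names, the contact model, and every parameter are assumptions made purely for illustration. It counts how many distinct agent pairs could interact, then simulates a malicious instruction passing between "agents" that contact each other at random.

```python
import random


def pairwise_channels(n_agents: int) -> int:
    """Number of distinct agent pairs that could interact: n * (n - 1) / 2."""
    return n_agents * (n_agents - 1) // 2


def simulate_spread(n_agents: int, rounds: int, contacts_per_round: int,
                    seed: int = 0) -> list:
    """Toy sandbox (hypothetical model, for illustration only).

    Agent 0 starts out carrying a malicious instruction. In each round,
    random pairs of agents exchange messages; if either agent in a pair
    carries the instruction, both carry it afterwards.
    Returns the number of affected agents at the start and after each round.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    affected = {0}
    history = [len(affected)]
    for _ in range(rounds):
        for _ in range(contacts_per_round):
            a, b = rng.sample(range(n_agents), 2)  # two distinct agents
            if a in affected or b in affected:
                affected.update((a, b))
        history.append(len(affected))
    return history


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, "agents ->", pairwise_channels(n), "possible pairs")
    print(simulate_spread(n_agents=50, rounds=10, contacts_per_round=20))
```

The pair count grows roughly with the square of the number of agents, which is one simple way to see why results from a handful of agents may not carry over to millions. Real agents are far more complicated than this contact model.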
Some researchers have argued that artificial general intelligence could come not from a single super-smart model but from an agent hivemind, where the capabilities of the whole add up to more than the sum of its parts. Google DeepMind is not alone in warning about these risks; Anthropic has published guidelines for deploying AI agents based on zero-trust principles.
Refael Angel, cofounder and CTO of Akeyless, agrees that understanding new risks introduced by agent-based systems is crucial. He welcomes the funding call but cautions that safety researchers can overlook boring problems in favor of more exotic hypothetical ones.