SAN FRANCISCO — Tech companies are racing to upgrade chatbots like ChatGPT so they can not only answer questions but also take control of a computer and act on a person’s behalf.
Experts in artificial intelligence and cybersecurity warn this technology will require individuals to expose much more of their digital lives to corporations, potentially bringing new privacy and security problems.
Recently, executives from leading AI companies including Google, Microsoft, Anthropic, and OpenAI have predicted that a new generation of digital helpers termed “AI agents” will completely change how people interact with computers.
They claim the technology, set to be a major focus of the industry in 2025, will initially automate mundane tasks like online shopping or data entry and eventually tackle complex work that can take humans hours.
“This will be a very significant change to the way the world works in a short period of time,” OpenAI CEO Sam Altman said at a company event in October. “People will ask an agent to do something for them that would have taken a month and it will finish in an hour.”
OpenAI has indicated that agents will benefit from its recent work on making AI software better at reasoning. In December, it released a system called o1, now available through ChatGPT, that attempts to work through problems in stages.
While ChatGPT alone has 300 million weekly users, OpenAI and rivals such as Google and Microsoft need to find new ways to make their AI technology essential. Tech companies have invested hundreds of billions of dollars into this technology over the past two years—a huge commitment that Wall Street analysts have warned will be challenging to recoup.
One ambitious goal is to have AI agents interact with other kinds of software as humans do, by making sense of visual interfaces, clicking buttons, or typing to complete tasks.
AI firms are launching and testing versions of agents that can handle tasks such as online shopping, booking a doctor’s visit, or managing emails. Salesforce and other business software providers are already inviting their customers to create limited versions of agents capable of handling customer service tasks.
In a recent demonstration at Google’s headquarters in Mountain View, California, an AI agent developed by the company’s DeepMind lab, called Mariner, was tasked with buying ingredients for a recipe.
Mariner, which appeared as a sidebar to the Chrome browser, navigated to a grocery chain’s website. One by one, the agent looked up each item on the list and added it to the online shopping cart, pausing to ask the user if they would like to complete the purchase.
Mariner is not yet publicly available, and Google is still exploring how to ensure that humans retain control over critical actions like making payments.
“It’s doing certain tasks really well, but there are definitely improvements that we want to implement,” said Jaclyn Konzelmann, a director of product management at Google.
AI agents show significant promise. For example, a bot that can reply to routine emails while a person tends to parenting or more critical work could be invaluable. Businesses could find numerous applications for AI assistants that can orchestrate complex actions.
However, tech industry leaders racing to create AI agents acknowledge they bring new risks.
“Once you’re enabling an AI model to do something like that, there’s all kinds of things it can do,” said Dario Amodei, CEO of Anthropic, which offers a chatbot called Claude. “It can say all kinds of things on my behalf, take actions, spend money, or change internal computer states.”
The tendency of AI systems to stumble in complex situations intensifies these risks. Anthropic warns that its experimental AI agent feature may interpret text it encounters on a webpage as commands to follow, even when they conflict with the user’s instructions.
Days after Anthropic made this technology available, cybersecurity expert Johann Rehberger demonstrated how this vulnerability could be exploited. He directed an AI agent to visit a webpage containing text prompting it to download a malicious file, and the agent complied.
Jennifer Martinez, a spokesperson for Anthropic, stated that the company is working on implementing protections against such attacks.
“The inherent gullibility of language models complicates the control issue,” said Simon Willison, a software developer who has tested many AI tools, including Anthropic’s technology. “How do you unleash that on regular human beings without enormous problems?”
Finding and preventing security problems with AI agents may be complicated, because the agents must reliably interpret both human language and a wide range of computer interfaces. And programmers cannot easily adjust the underlying machine-learning algorithms to guarantee specific outputs or actions in every situation.
“With software, everything is crafted by humans. For AI agents, the process becomes murky,” said Peter Rong, co-author of a recent paper investigating the security risks of AI agents.
In addition to the kind of attack Rehberger demonstrated, researchers warn that AI agents could be manipulated into carrying out cyberattacks or leaking the personal information they use to complete tasks.
Additional privacy risks arise with use cases where agents may capture screenshots from users’ computers, potentially exposing private or sensitive information. Earlier this year, Microsoft delayed the launch of an AI feature called Recall due to privacy concerns after it created a searchable record of user activity through screenshots.
A limited version of Recall is now in testing, following updates that enhanced user controls and added security protections for collected data.
However, for AI agents to deliver powerful personalized experiences, they will require substantial data access.
“When discussing an app that could observe your entire computer, that is quite unsettling,” Corynne McSherry, legal director at the Electronic Frontier Foundation, said. “The history of tech companies gathering user data for advertising or sharing it suggests that caution is essential.”
McSherry emphasized the need for transparency regarding what data AI agents capture and how it is used.
Helen King, senior director at Google DeepMind, acknowledged potential privacy issues related to the Mariner agent.
She compared the situation to Google’s early implementation of Street View, where photographs of individuals appeared without prior consent, prompting heightened privacy awareness and features like automatic face-blurring.
“These privacy concerns will resurface,” King said. “Recognizing the importance of privacy is integral to our approach.” She emphasized that Google will carefully evaluate AI agent technology before widespread public rollout.
In the workplace, employees may have little choice about using AI agents, according to Yacine Jernite from Hugging Face.
Companies such as Microsoft and Salesforce promote these technologies as tools for automating customer service and making sales teams more efficient. Jernite cautioned that employees could end up spending precious time correcting the systems’ mistakes, while handing developers data that could be used to replace their jobs rather than improve them.
Executives from Google and Microsoft have consistently pushed back on fears that AI will reduce the need for human workers, arguing that agents will boost productivity and free people to focus on more meaningful work.
Willison anticipates that individuals will embrace AI agents for both work and personal uses, possibly even paying for those capabilities. However, he expressed uncertainty about how existing technology issues can be resolved.
“If we overlook safety, security, and privacy issues, the potential of this technology is astounding,” Willison remarked. “The path forward, however, presents significant hurdles.”
Nitasha Tiku contributed to this report.