AI may pursue goals

By Vishakha Agrawal, Algon @ 2025-05-28T12:04 (+2)

Context: This is a linkpost for https://aisafety.info/questions/NM3J/5:-AI-may-pursue-goals

This is an article in the new intro to AI safety series from AISafety.info. We'd appreciate any feedback. The most up-to-date version of this article is on our website.

Suppose that, as argued previously, in the next few decades we’ll have superintelligent systems. What role will they play?

One way to imagine these systems is purely as powerful and versatile tools, similar to most current systems. They could take broad directions from humans about what actions to take or what questions to answer, and cleverly fill in the details.

But another way to imagine them is as agents, operating autonomously in the world. They could have their own goals (some kinds of futures they seek out over other futures) and take whatever actions will most likely lead to those futures, adapting as circumstances change.

As long as AIs are tools, they can be used for good or ill, like all technologies. They can radically increase the scope of the problems humans can solve and create.

But it’s unlikely that they’ll remain only tools, because:

A good planning tool can easily be turned into an agent. Just tell it: “repeatedly come up with actions that would make goal X more likely, and execute those actions”. People keep building software frameworks for doing this, and to the extent they succeed, better tools will result in better agents.
When a planning tool is sufficiently more competent than humans, keeping humans in the loop to direct its activities will just get in the way. It will make the system less efficient (as we’re already seeing for some medical tasks), less profitable, and less able to compete.
There may be some tasks that highly intelligent agents can do and tools just can’t, like implementing hard new research programs.
Increasingly agent-like systems are already being created — see, for example, OpenAI’s Operator, which types, clicks, and scrolls in a web browser, and Anthropic’s “computer use” feature for its chatbot Claude.
Even if most AIs remain non-agentic, some people will create agents for various reasons of their own — to be the creator of a new species, or for the heck of it.
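
As a rough illustration of the first point above, here is a minimal Python sketch of how a planning tool might be wrapped into an agent loop. Everything in it is a hypothetical toy: the integer "world" and the names plan, execute, and run_agent are assumptions made for illustration, not a description of how any real system or framework is built.

```python
from typing import Callable, List

# Hypothetical toy world: the state is an integer, and the goal is a target integer.
State = int
Action = int  # the amount to add to the state; 0 means "nothing left to do"


def plan(state: State, goal: State) -> Action:
    """The 'tool': suggests one action that moves the state closer to the goal.

    On its own it changes nothing; a human would have to carry out the suggestion.
    """
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0


def execute(state: State, action: Action) -> State:
    """Apply an action to the toy world and return the resulting state."""
    return state + action


def run_agent(
    state: State,
    goal: State,
    planner: Callable[[State, State], Action],
    max_steps: int = 100,
) -> List[State]:
    """The 'agent': repeatedly ask the planner for an action and execute it.

    Returns the list of states visited, starting with the initial state.
    """
    history = [state]
    for _ in range(max_steps):
        action = planner(state, goal)
        if action == 0:  # the planner has nothing further to suggest
            break
        state = execute(state, action)
        history.append(state)
    return history
```

For example, run_agent(0, 3, plan) walks the state from 0 up to 3 with no human choosing the individual steps. The planner itself is unchanged throughout; only the loop around it makes the system act on its own, which is why a better planner would yield a better agent.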

If we’re going to build AI systems that pursue goals, it would be good if those goals matched ours. It’s not clear if we’ll succeed at making that the case.
