Some mistakes in thinking about AGI evolution and control
By Remmelt @ 2025-08-01T08:08 (+7)
tl;dr A short list of mistakes made by researchers considering evolution and AGI control. They appear in a long post but are hard to pick out, so I put them into bullets.
1. Conflating run-time environments
- It is useful to think about evolution as if it were an algorithm running on a computer. In practice, though, it is not.
- Evolution is an algorithm expressed through all the interactions of the AGI components, as maintained/reproduced over time, with their environment.
- A control algorithm is expressed by just the digital processing/hardware mechanics.
- Put simply: one algorithm is implemented by concrete physics, and the other by just the higher-level computations.
- Evolutionary feedback is comprehensive – running through a much greater span of signals than the control algorithm. So even if evolution seems 'dumber' than the control algorithm, that doesn't mean that control can 'beat' evolution everywhere.
2. Simplifying the selection process
- Some researchers imagined evolution to be dumber than it is. They simplified it to a brute-force process selecting for mutations that spread vertically to the next generations, or treated it as some kind of selection for selfishness among competing agents.
3. Equivocating instances of evolution
- Some expected machine evolution ("robot parts reproduced by robots") to be slow because biological evolution has been slow.
- This overlooks how code spreads much faster from machine to machine.
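The speed difference comes down to replication rate. A minimal sketch (all the numbers here are invented for illustration, not measurements) of how much faster a cheap-to-copy replicator spreads, even over the same number of steps:

```python
# Illustrative sketch: replication rate dominates how fast variants spread.
# The rates below are made-up assumptions, chosen only to show the contrast.

def population_after(steps, start=1, copies_per_step=1):
    """Each individual makes `copies_per_step` copies per step,
    and every copy can itself copy in later steps."""
    pop = start
    for _ in range(steps):
        pop += pop * copies_per_step
    return pop

# Biological-style replicator: one copy per generation.
slow = population_after(steps=10, copies_per_step=1)   # 2**10 = 1024

# Code-style replicator: copying is cheap, say 10 copies per step.
fast = population_after(steps=10, copies_per_step=10)  # 11**10, ~2.6e10
```

Same number of steps, yet the cheap-to-copy variant ends up tens of millions of times more numerous, which is the intuition behind not equivocating the two instances of evolution.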
4. Simplifying control to constraining goals
- Researchers described control as the ability to make some AI pursue (or not pursue) certain goals. E.g. can you set the value parameters right to make an AGI shut itself down? Or can you constrain it pre-AGI so that it can't do certain things it might want to do?
- The term "control" has been repurposed to mean something very narrow. All the sensors, actuators, and concrete physical dynamics are abstracted away. This new use of the term "control" is different from how it has been used in established fields.
- To a control engineer, someone who actually controls physical systems for a living, this seems confusing. They have a grounded notion of control: a machine computes inputs from sensors into outputs to actuators, so as to constrain effects (state changes) in the world.
- By simplifying the meaning of "control" and "evolution", alignment researchers can end up ignoring actual feedback dynamics involving physical effects.
- The control feedback loop would have to constrain the evolutionary feedback loop. That is, detection and correction through sequential machine computation would have to contain all the configurations stored inside the machine parts, even as those parts simultaneously propagate effects across the environment that feed back into code changes.
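The grounded notion of control above, computing actuator outputs from sensor inputs so as to constrain state, can be sketched as a minimal feedback loop. This is a hypothetical proportional controller; the plant model, gain, and disturbance are invented for illustration:

```python
# Minimal sketch of a grounded control loop: sense -> compute -> actuate.
# The plant model, gain, and disturbance are illustrative assumptions.

def run_control_loop(setpoint, state, gain=0.5, disturbance=1.0, steps=50):
    """Proportional controller constraining a scalar state toward a setpoint.

    The controller only 'sees' the world through its sensor reading.
    The constant disturbance acts outside its model, so a steady-state
    offset of disturbance/gain remains uncorrected.
    """
    for _ in range(steps):
        sensed = state                    # sensor: measure the state
        error = setpoint - sensed         # compute: compare to target
        actuation = gain * error          # compute: proportional response
        state += actuation - disturbance  # actuate: effect + un-modeled push
    return state

final = run_control_loop(setpoint=20.0, state=0.0)
# settles near setpoint - disturbance/gain = 18.0, not 20.0
```

The point of the sketch: feedback the controller does not sense (here, a constant un-modeled push) leaves a persistent gap between intended and actual state. Evolutionary feedback, running through far more channels than the controller's sensors cover, is that kind of un-sensed dynamic writ large.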