Don't leave your fingerprints on the future

By So8res @ 2022-10-08T00:35 (+93)


quinn @ 2022-10-08T17:10 (+15)

Strong endorse. I have, on the occasion of two lightning talks (Bahamas and EAGxSingapore) and a shortform post, claimed that lock-in risk obliges us to reject positive longtermism (fighting-for) and restrict ourselves to negative longtermism (fighting-against). However, I point out near the end of each lightning talk that suffering-focused views present me with an extremely difficult challenge: preserving the future's freedom preserves the possibility of torture, and suffering abolition is itself a form of positive longtermism. I really struggle with the tradeoff between libertarianism/cosmopolitanism (or any other framework behind emphasizing lock-in risk) and suffering-focused views; by far, the button-offering demon that fills me with the most dread is the one whose button would end suffering but lock in my particular opinions and aesthetics about what flourishing is.

No top-level post about positive and negative longtermism yet, though. 

HaydnBelfield @ 2022-10-14T14:10 (+12)

Just a quick note to say that I think planning on a pivotal act is risky and dangerous, and we just don't yet know how feasible or infeasible "some healthy and competent worldwide collaboration steering the transition" is - more research is needed.

As I say in The Rival AI Deployment Problem: a Pre-deployment Agreement as the least-bad response,

"while it may seem unlikely at this stage, a predeployment agreement might be the least-bad option – and at least worthy of more study and reflection. In particular, more research should be done into possible clauses in a pre-deployment agreement, and into possibilities for AI development monitoring and verification."

RobBensinger @ 2022-10-15T21:42 (+9)

My reply to Critch is here, and Eliezer's is here and here.

Eventually, as compute becomes more available and AGI techniques become more efficient, we should expect that individual consumers will be able to train an AGI that destroys the world using the amount of compute available on a mass-market personal computer. (If the world isn't already destroyed before then.)

What's the likeliest way you expect this outcome to be prevented, or (if you don't think it ought to be prevented, or don't think it's preventable) the likeliest way you expect things to go well if this outcome isn't prevented?

No government is (AFAIK) a major player in cutting-edge AI research today, so I think the default outcome is that this continues into the future.

Capabilities-wise, I think the default outcome is that human STEM work ends up looking similar to AlphaGo and AlphaGo Master. Less than a year passed between "AI systems aren't smart enough to beat any human professional in a single standard Go game" and "humans aren't smart enough to beat any SotA Go AI in a single standard Go game". I'd expect there to be a similarly short window of time between "the first time an AI system can match top human scientists in doing open-ended reasoning about the messy physical world" and "the last time humans can match SotA AI systems in STEM work".

Maybe we'll end up having five years, rather than one year; but this seems like a perilous hope to pin the future on, and it seems to me like most governance hopes require something more like "we have fifty years (before anyone is able to destroy the world with AI) to play around with human-level AGIs, see lots of human-level warning-shot catastrophes, give governments time to react and notice the flaws in their reaction and then adjust their reaction, build a research consensus about the hazards of AGI and about the best plan, build a policy consensus that's in line with the research consensus, etc.", not "we have five years".

I think it's very possible for humanity to avoid destruction in the more realistic five-year and one-year scenarios, but not via a slow multi-decade consensus-building procedure to develop new international institutions, alliances, and norms surrounding AGI (all while AGI tech continues to be available, continues to become more efficient, and continues to proliferate).

(That said, I appreciate your sharing your disagreement! One of the best things that can come out of Nate's posting, IMO, is people flagging assumptions that they disagree with, so we can talk about them.)

HaydnBelfield @ 2022-10-17T13:49 (+7)

Hi Rob, thanks for responding.

I agree that eventually, individuals may be able to train (or, more importantly, run exfiltrated) advanced AI models that are very dangerous. I expect that before that, it will be within the reach of richer, bigger groups. Today, it requires more compute/better techniques than we have available. At some point in the coming years/decades it will be within the reach of major states' budgets, then smaller states and large companies, and then smaller and smaller groups until it's within the reach of individuals. That's the same process that many, many other technologies have followed. If that's right, what does that suggest we need? Agreement between the major states, then non-proliferation agreements, then regulation and surveillance banning development by companies and individuals.

On governments not being major players in cutting-edge AI research today: this is certainly true. I think cyber might be a relevant analogy here. Much of the development and deployment of cyberattacks has been by the private sector (companies and contractors in the US, often criminals for some autocracies). Nevertheless, the biggest cyberattacks (Stuxnet, NotPetya, etc.) have been directed by the governments of major states - i.e. the P5 of the US, Russia, the UK, France and China. It's possible that something similar happens for AI.

In terms of how long international agreements take, I think 50 years is a bit pessimistic. I would take arms control agreements as possible comparisons. Take the 1972 nuclear and biological weapons agreements. The ideas behind deterrence were largely developed around 1960 (Schelling 1985; Adler 1992), and then made into an international agreement in 1972. It might even have happened sooner, under LBJ, had the USSR not invaded Czechoslovakia on 20th August 1968, a day before SALT was supposed to start. On biological weapons, the UK proposed the BWC in August 1968, and it was signed in 1972 as well. New START took about 2 years. So in general, bilateral arms control style agreements with monitoring and verification can be agreed in less than 5 years.

To take the 1960s nuclear analogy, we could loosely think of ourselves as being in early 1962: we've come up with the concerns if not the specific agreements, and some decision-makers and politicians are on board. We haven't yet had a major AI warning shot like the Cuban Missile Crisis (which began 60 years ago yesterday!), we haven't yet had confidence-building measures like the 1963 Hotline Agreement, and we haven't yet proposed or begun the equivalent of SALT. All of that might be to come in the next few years/decades.

This won't be an easy project by any means, but I don't think we can yet say it's completely infeasible - more research, and the attempt itself, are needed.