What should I ask Ajeya Cotra — senior researcher at Open Philanthropy, and expert on AI timelines and safety challenges?
By Robert_Wiblin @ 2022-10-28T15:28 (+23)
Next week for The 80,000 Hours Podcast I'm interviewing Ajeya Cotra, senior researcher at Open Philanthropy, AI timelines expert, and author of "Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover".
What should I ask her?
My first interview with her is here:
Some of Ajeya's work includes:
- Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
- Why AI alignment could be hard with modern deep learning
- Forecasting TAI with biological anchors
- AMA: Ajeya Cotra, researcher at Open Phil
Max Nadeau @ 2022-10-28T17:18 (+13)
Artir Kel (aka José Luis Ricón Fernández de la Puente) at Nintil wrote an essay broadly sympathetic to AI risk scenarios but doubtful of a particular step in the power-seeking stories Cotra, Gwern, and others have told. In particular, he has a hard time believing that a scaled-up version of present systems (e.g. Gato) would learn facts about itself (e.g. that it is an AI in a training process, what its trainers' motivations would be, etc.) and incorporate those facts into its planning (Cotra calls this "situational awareness"). Some AI safety researchers I've spoken to personally agree with Kel's skepticism on this point.
Since incorporating this sort of self-knowledge into one's plans is necessary for breaking out of training, initiating deception, etc., this seems like a pretty important disagreement. In fact, Kel claims that if he came around on this point, he would agree almost entirely with Cotra's analysis.
Can she describe in more detail what situational awareness means? Could it be demonstrated with current or near-term models? Why does she think Kel (and others) find it so unlikely?
Jose Luis Ricon @ 2022-11-14T20:09 (+2)
I wonder too!
Emrik @ 2022-10-28T22:31 (+7)
Is marginal work on AI forecasting useful? With so much brainpower being spent on moving a single number up or down, I'd expect it to hit diminishing returns pretty fast. To what extent is forecasting a massive brain drain, such that people should just get to work on the object-level problems if they're sufficiently convinced? How sensitive are your priorities over object-level projects to AI forecasting estimates (as in, how many more years out would your estimate of X have to shift before your priorities changed)?
Update: I added some arguments against forecasting here, but they are very general, and I suspect they will be overwhelmed by evidence related to specific cases.
electroswing @ 2022-10-29T00:05 (+5)
What does she think policymakers should be trying to do to prevent risks from misaligned AI?
Quadratic Reciprocity @ 2022-10-28T19:42 (+5)
How would she summarise the views of various folks with <2030 median timelines (e.g. Daniel Kokotajlo, the Conjecture folks), and what are her cruxes for disagreeing with them?
Erich_Grunewald @ 2022-10-28T15:49 (+4)
Nice, looking forward to hearing this!
- What does her research methodology look like?
- What's her take on Katja Grace's Counterarguments to the basic AI risk case?
Greg_Colbourn @ 2022-10-29T06:28 (+3)
If she had to direct $1B of funding (in end-point grants) toward TAI x-safety within 12 months, what would she spend it on? How about $10B, or $100B?
Greg_Colbourn @ 2022-10-29T13:36 (+2)
Maybe even that isn't thinking big enough. At some point, with enough funding, it could be possible to buy up and retire most existing AGI capabilities projects, at least in the West. Maybe the rest of the world would then largely follow suit (as has happened with, e.g., global conformity on bioethics). There is also the precedent, albeit on a smaller scale, of the development of electric vehicles being curtailed, which perhaps set that field back a decade. And EAs have discussed related ideas like buying up coal mines to limit climate change.
What level of spending would be needed? $1T? Would it be possible for the EA community to accumulate this much wealth in the next 5-10 years, without relying on profits from AI capabilities?
How much of an effect would it have? Could it buy us a few more years? Or would new orgs immediately fill the gaps? Would it be possible to pay off existing AI capabilities researchers such that they agree to legally binding contracts not to work on similar projects (at least for a set length of time)?
Greg_Colbourn @ 2022-10-29T06:24 (+3)
What would she do if she were elected President of the USA with a mandate to prevent existential catastrophe from AI?
Prometheus @ 2022-10-29T04:41 (+2)
Why is forecasting TAI with bio anchors useful? Many argue that the compute power of the human brain won't be needed for TAI, since AI researchers are not designing systems that mimic the brain.