AI governance and strategy: a list of research agendas and work that could be done.

By Nathan_Barnard, Erin Robertson @ 2024-03-12T11:21 (+30)

This document was written by Nathan Barnard and Erin Robertson.

We have compiled a list of research agendas in AI governance, and we’ve written some possible questions that people could work on. Each section contains an explanation of why the theme might be relevant for governance focussed on existential risk and longtermism, followed by a short description of past work. We propose some questions for each theme, but we prioritise clarity over completeness.

The content focusses on the questions that seem most important to Nathan personally, particularly those that seem most useful on the margin. We have often drawn on other people’s lists in an attempt to represent a more consensus view. Neither Erin nor Nathan has ever held an AI governance research or implementation position, and Nathan has been an independent researcher for less than a year.

A theme throughout these questions is that Nathan thinks it would be useful to have more high-quality empirical work. A good example is this paper on policy persistence and policy windows from Freitas-Groff, which gives credible causal estimates of how persistent policy is - a question he thinks is really important for prioritising between different interventions in AI governance.


This document is divided into topics, each includes:

  1. A brief discussion on theory of change: why might this work be useful?
  2. Examples of past work in this domain: who’s working on this?
  3. 2-4 questions that people could work on, with some description of each.

Topics:

AI regulation and other standard tools

Compute governance

Corporate governance

International governance

Misuse

Evals

China

Information security

Strategy and forecasting

Post TAI/ASI/AGI governance

 

AI regulation and other standard tools

Theory of change

Historically, government regulation has been successful at reducing accident rates in other potentially dangerous industries - for instance air travel, nuclear power, finance, and pharmaceuticals. It’s plausible that similar regulatory action could reduce risks from powerful AI systems.

Past work

A dominant paradigm right now is applying the standard tools of technology regulation - and other non-regulatory means of reducing harm from novel and established technologies - to AI. This paradigm seems particularly important because of the extensive interest, and action, from governments on AI. Specifically, the recent Biden Executive Order (EO) on AI instructs the executive branch[1] to take various regulatory and quasi-regulatory actions. The EU AI Act has been passed, but there are many open questions about how it will be implemented, and the UK is, in various ways, creating its AI regulatory policy. Thus far there has been a lot of work looking at case studies of particular regulatory regimes, and work looking deeply into the mechanics of the US government and how this could matter for AI regulation.

Questions

There are of course reasons why statistical work on the effects of these regulatory tools hasn’t been done - it’s hard to get good data on these questions, and it’s difficult to credibly estimate the causal effects of these interventions because natural experiments are hard to come by. Nevertheless, we think it’s worth some people trying hard to make progress on these questions - without this kind of statistical work we should, in particular, be extremely uncertain about how useful standards that lack the force of law will be.

UK specific questions

US specific questions

There are many, many questions on the specifics of US interventions, and lots of work is being done on them. In listing these questions I’ll try to avoid duplicating work that has already been done or is in progress.

EU specific questions

We are not particularly plugged into what's happening in the EU, so take these questions with a big pinch of salt. This website has very useful summaries of the various parts of the EU AI Act.

Compute governance 

Theory of change

Much compute governance work hinges on the assumption that access to compute (often GPUs) is crucial for development of frontier AI systems. Policymakers may be able to influence the pace and direction of development by controlling or tracking these resources. A good example of this is US export controls on chips to China.

We expect that the role of chips may shift in the coming years, and that they may become less clearly a bottleneck to development; the bottleneck may shift to algorithmic progress or to the financial constraints of firms. We expect that work imagining possible failure modes of compute governance, and possible alternative constraints on progress, will be helpful, since work of this kind is neglected. It's worth noting that compute costs have only recently reached the level at which training frontier models is out of reach for semi-large firms, yet prior to this there were still only a small number of leading labs. This suggests that something other than compute has been the main constraint on labs' ability to produce frontier models, which this paper lends some support to. It’s therefore plausible that theories of change relying on controlling who has access to leading-node AI chips might not be very effective.
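To make the point about training costs concrete, here is a rough back-of-envelope sketch in Python. Every number in it (total training FLOP, accelerator throughput, utilisation, hourly price) is an illustrative assumption rather than a figure taken from the paper linked above.

```python
# Back-of-envelope: rough compute cost of a frontier-scale training run.
# All numbers below are illustrative assumptions, not measured figures.

TRAINING_FLOP = 2e25        # assumed total compute for a frontier-scale model (FLOP)
GPU_PEAK_FLOPS = 1e15       # assumed peak throughput of a modern AI accelerator (FLOP/s)
UTILISATION = 0.4           # assumed fraction of peak throughput actually achieved
GPU_PRICE_PER_HOUR = 2.0    # assumed rental cost per accelerator-hour (USD)

effective_flops = GPU_PEAK_FLOPS * UTILISATION   # FLOP/s actually delivered per accelerator
gpu_seconds = TRAINING_FLOP / effective_flops    # total accelerator-seconds needed
gpu_hours = gpu_seconds / 3600
cost_usd = gpu_hours * GPU_PRICE_PER_HOUR

print(f"Accelerator-hours needed: {gpu_hours:,.0f}")
print(f"Rough compute cost: ${cost_usd:,.0f}")
# With these assumptions the run needs on the order of ten million accelerator-hours
# and tens of millions of dollars - affordable for a handful of firms, but no longer
# for a typical mid-sized company.
```

Small changes to these assumptions shift the answer by an order of magnitude either way, which is part of why it’s hard to say crisply which firms are actually priced out of frontier training.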

Past work

Some examples of past work include:

This recent paper on compute governance has an appendix of broad research directions, and we encourage interested readers to draw on it for research ideas focused on advancing the compute governance agenda.

Questions

These questions are quite idiosyncratic and focused on examining the failure modes of compute governance.

 

Corporate governance

Corporate governance refers to the policies, procedures, and structures by which an organisation is controlled, aiming to align the organisation's objectives with the interests of its various stakeholders, including shareholders, employees, and society.

Corporate governance allows individuals and leadership within a company to be held accountable for their decisions and the effects they have.

Theory of change

There are a few reasons why corporate governance can be valuable alongside good regulation:

Past work

This research agenda from Matt Wearden is a great place to look for questions. The questions we mention here are focussed on trying to understand the track record of corporate governance.

Responsible scaling policies (RSPs) are probably the most important corporate governance and risk management tool being used at AI labs. They are the practices that firms commit to in order to ensure that they can appropriately manage the risks associated with larger models.

I (Nathan) personally think that RSPs are useful for three reasons:

Some questions that might be important around RSPs are:

Industry-led standards bodies are common across sectors; finance, for instance, has quite a lot of these self-regulatory bodies (SRBs). We aren’t sure if these are effective at actually leading to lower accident rates in the relevant industries, and if so what the effect size is, particularly in comparison to regulation. It could be really useful for someone to do a case study of a plausible case in which an SRB reduced accidents in an industry, and what the mechanism for this was.

 

International governance

Theory of change

It seems like there are two worlds where international governance is important.

  1. Firms are able to go to jurisdictions with laxer AI regulation, leading to a race to the bottom between jurisdictions.
  2. States become interested in developing powerful AI systems (along the lines of the space race), and this leads to dangerous racing.

In both of these cases international agreements could improve coordination between jurisdictions and states to prevent competition that’s harmful to AI safety.

Past work

Most of the academic work on international governance has been aimed at the first theory of change.

Trager et al. propose an international governance regime for civilian AI built around verifying that jurisdictions enforce agreed rules, with a structure similar to international agreements on aeroplane safety. Baker looks at nuclear arms control verification agreements as a model for an AI treaty. There has also been some excitement about a CERN for AI, as this EA Forum post explains, but little further formal work investigating the idea.

There has also been some work on racing between states, for instance this paper from Stafford et al.

Questions

Nathan is sceptical that international agreements on AI will matter very much, so most of these questions investigate whether international agreements could solve important problems, and whether they could feasibly be strong enough to do so.

Misuse 

Nathan isn’t convinced that communities worried about existential risk from AI should focus much on misuse risks, for two reasons:

We would be particularly interested in empirical work that tries to clarify how likely x-risk from misuse is. Some extremely useful work in this vein is this report from the Forecasting Research Institute on how likely superforecasters think various forms of x-risk are, this EA Forum post that looks at the base rates of terrorist attacks, and this report from RAND on how useful LLMs are for bioweapon production.

Some theoretical work that has really influenced Nathan’s thinking here is this paper from Aschenbrenner modelling how x-risk changes with economic growth. The core insight of the paper is that, even if economic growth initially increases x-risk due to new technologies, as societies get richer they become more willing to spend money on safety-enhancing technologies, which can be used to force down x-risk.
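As a stylised illustration of that dynamic, here is a toy hazard-rate model in the spirit of the paper; the functional form and symbols below are our own illustrative assumptions, not the paper’s exact specification.

```latex
% Toy hazard-rate model (illustrative only; not the paper's exact specification).
% Y_t: total output, s_t: share of output spent on safety,
% C_t = (1 - s_t) Y_t: output of (potentially risky) consumption technologies,
% H_t = s_t Y_t: safety spending, delta_t: per-period probability of catastrophe.
\[
  \delta_t \;=\; \bar{\delta}\,\frac{C_t^{\,\beta}}{H_t^{\,\varepsilon}}
           \;=\; \bar{\delta}\,\frac{\big((1-s_t)\,Y_t\big)^{\beta}}{\big(s_t\,Y_t\big)^{\varepsilon}} .
\]
% With the safety share s_t held fixed and beta > epsilon, growth in Y_t raises
% delta_t: new technology makes the world riskier. The core insight is that
% richer societies choose a higher s_t, so if willingness to pay for safety
% rises fast enough with income, delta_t is eventually forced down even while
% Y_t keeps growing.
```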

Questions

Some empirical work that we think would be helpful here:

Evals

Theory of change

Evals are a tool for assessing whether AI systems pose threats, by trying to elicit potentially dangerous capabilities and misalignment from them. This is a new field and there are many technical questions to tackle in it. The interested reader is encouraged to read this post on developing a science of evals from Apollo, a new organisation focused on evals.

The governance questions around evals concern how they fit into a broader governance strategy. See this paper from Apollo and this introduction to METR’s work. Evals also play a central part in the UK government's AI regulation strategy: see box 5 of the UK government's recent white paper for questions the UK government has, many of which relate to evals.

 

Questions

Some particular questions we are interested in are:

China

Theory of change

Questions about China are some of the most crucial strategic questions on AI. There seem to be two big ways in which they matter:

  1. How likely is it that a Chinese lab develops TAI that causes an existential catastrophe? Should this make us more reluctant to adopt measures that slow down AI development in the US and allied countries?
  2. How likely is there to be an AI arms race between the US and China?

There are three sub-questions to the first question that I’m really interested in:

  1. How likely is it that a Chinese lab is able to develop TAI before a US lab?
  2. What alignment measures are Chinese labs likely to adopt?
  3. How long should we expect it to take Chinese labs to catch up to US labs once a US lab has developed TAI?

Crucial context here is the export controls adopted by the Biden administration in 2022, and updated in 2023, which aim to maximise the gap between leading-node chip production in the US and its allies and leading-node production in China, combined with the narrower aim of specifically restricting the technology that the Chinese military has access to.

Past work

There’s lots of great work on the export controls, the Chinese AI sector, and Chinese semiconductor manufacturing capabilities. Interested readers are encouraged to take part in the forecasting tournament on Chinese semiconductor manufacturing capabilities.

Questions

 

Information security 

Theory of change

It seems like there are three theories of change for why infosec could matter a lot:

  1. If firms are confident that they won’t have important technical information stolen, they’ll race less against other firms.
  2. Preventing non-state actors from gaining access to model weights might be really important for preventing misuse.
  3. Preventing China from gaining access to model weights and/or other technical information might be important for maintaining an AI lead for the US and its allies.

All of these theories of change seem plausible, but we haven’t seen any work that has really tried to test them using case studies or historical data, and it would be interesting to see this sort of work.

There’s also interesting work to be done on non-infosec ways of deterring cyberattacks. It may turn out that AI makes cyberattacks technically very easy to conduct, in which case the way to deter them may be very aggressive reprisals against groups found to be responsible, combined with an international extradition treaty for cybercriminals.

Past work

Questions

All of these questions will be social science questions rather than technical questions - this is not at all meant to imply that technical infosec questions aren’t important, just that we are completely unqualified to write interesting technical infosec questions.

Strategy and forecasting

Theory of change

Anticipating the speed at which developments will occur, and understanding the levers, is likely very helpful for informing high-level decision making.

Past work

There’s a risk with strategy and forecasting that it’s easy to be vague or use unscientific methodology, which is why recent commentary has suggested it's not a good theme for junior researchers to work on. There’s some merit to this view, and we’d encourage junior researchers to try especially hard to seek out empirical or otherwise solid methodology if they’d like to make progress on this theme.

Epoch is an AI forecasting organisation that focuses on compute. Their work is excellent because it focuses on empirical results or on extending standard economic theory. Other strategy work with solid theoretical grounding includes Tom Davidson’s takeoff speeds report, Halperin et al.’s work on using interest rates to forecast AI, and Cotra’s bio anchors report.

Lots of the strategy work thus far - on AI timelines and AI takeoff speeds - is compute-centric. A core assumption of much of this work is that AI progress can be converted into a common currency of compute: that if you throw enough compute at today's data and algorithms, you can get TAI.
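To make the assumption concrete, here is a minimal sketch of the bookkeeping behind it; the symbols and the threshold formulation are our own illustrative simplification, not the formulation used in any particular report.

```latex
% Illustrative bookkeeping behind the "common currency of compute" assumption.
% C^phys_t: physical training compute available to a leading project in year t,
% A_t: multiplier for algorithmic and data-efficiency progress relative to today,
% C^TAI: the (highly uncertain) effective-compute threshold at which TAI is reached.
\[
  C^{\mathrm{eff}}_t \;=\; A_t \, C^{\mathrm{phys}}_t ,
  \qquad
  t_{\mathrm{TAI}} \;=\; \min\{\, t : C^{\mathrm{eff}}_t \ge C^{\mathrm{TAI}} \,\} .
\]
% On this view, disagreements about timelines and takeoff can largely be
% expressed as disagreements about the growth rates of A_t and C^phys_t
% and about the size of C^TAI.
```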

Recently there’s been quite a lot of focus on the economic and scientific impact of LLMs; for instance, see this post and this post from Open Philanthropy calling for this kind of work.

Questions

Post TAI/ASI/AGI governance

Theory of change

Lots of people think that transformative AI is coming in the next few decades. Some people frame this in terms of “AGI”: an AI that can do everything a human can do, but better. Others frame it in terms of “TAI”: AI which significantly changes the economic growth rate, such that global GDP grows X% each year or scientific developments occur X% quicker. These changes may be abrupt, and may completely change the world in ways we can’t predict. Some work has been done to anticipate these changes and to avert the worst outcomes. It's becoming increasingly possible to do useful work under this theme, as some specific avenues for productive work have emerged. The hope is that anticipating the changes and the worst outcomes will help us have the appropriate mechanisms in place when things start getting weird.

Past work

This paper by Shulman and Bostrom on issues with digital minds is excellent, and this paper from O’Keefe et al. looks specifically at the question of mechanisms for sharing the windfall from TAI.

A big challenge when trying to work on these kinds of questions is finding projects that are well-scoped and empirical, or based in a field with well-established theory like law or economics, while still plausibly being useful.

Questions

In light of this, here are some post-TAI governance questions that could fulfil these criteria:

 

 

 

 

  1. ^

     Excluding independent agencies which the President doesn’t have direct control over


CAISID @ 2024-03-12T14:53 (+6)

This is a useful list, thank you for writing it.

In terms of:

UK specific questions

  • Could the UK establish a new regulator for AI (similar to the Financial Conduct Authority or Environment Agency)? What structure should such an institution have? This question may be especially important because the UK civil service tends to hire generalists, in a way which could plausibly make UK AI policy substantially worse.  

I wrote some coverage here of this bill which seeks to do this, which may be useful for people exploring the above. Also well worth watching and not particularly well covered right now is how AUKUS will affect AI Governance internationally. I'm currently preparing a deeper dive on this as a post, but for people researching UK-specific governance it's a good head start to look at these areas as ones where not a lot of people are directing effort.

SummaryBot @ 2024-03-12T15:43 (+1)

Executive summary: This document compiles a list of research agendas and questions in AI governance that could be important for reducing existential risk, covering topics such as regulation, compute governance, corporate governance, international agreements, misuse, evaluation, China, information security, strategy and forecasting, and post-TAI governance.

Key points:

  1. AI regulation questions focus on estimating the effects of regulatory interventions, and exploring specific issues in the UK, US, and EU.
  2. Compute governance questions examine failure modes and alternative bottlenecks to AI development besides compute.
  3. Corporate governance questions look at the track record of governance tools like ethics boards, responsible scaling policies, and industry standards.
  4. International governance questions explore distinguishing features of effective treaties and drivers of regulatory arbitrage.
  5. Misuse questions focus on empirical evidence for the likelihood of existential risk from AI misuse by malicious actors.
  6. Evaluation questions consider how to ensure the integrity of AI system assessments and how to integrate governance.
  7. China questions examine the relative AI capabilities of China vs. the US/allies and the implications for racing dynamics and governance.
  8. Information security questions look at how infosec concerns affect firm behavior, deterrence of cyberattacks, and offensive/defensive dynamics.
  9. Strategy and forecasting questions aim to anticipate AI progress while grounding predictions in solid empirical and theoretical methodology.
  10. Post-TAI governance questions explore issues like the legal status of digital minds and the threat of AI-enabled misinformation to democracy.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

ceb @ 2024-03-12T12:41 (+1)

Skimmed and this looks like a good list, thanks for writing this up and sharing! I also appreciate you sharing your reasoning.

Minor question: I'm curious about your emphasis on statistical work, particularly wrt AI regulation and standards. I think mostly I'm unsure that the additional time and difficulty of doing robust statistical work would be worth it, relative to the current case study/best guess/experts' estimates approach. What am I missing?

Nathan_Barnard @ 2024-03-12T14:25 (+2)

Thanks for your comment Ceb. 

I think my case for more focus on good statistical work when looking at governance is that when doing good statistical work on interventions, we often find very high degrees of variation in effect sizes that are enough to justify the extra work of the intervention. I'm personally very unsure of the effect size of changes in liability law, soft law, and various corporate governance interventions on accident rates. 

There's been lots of great case study/best guess/expert consensus work on these questions which I think is often great and I'm very happy exists, but leaves me with large uncertainties about effect sizes, and so on the current margin, I want more good statistical work. 

I think the case for good statistical work on areas like degree of misuse risk and forecasting is stronger because I think that people's takes on these questions are pretty grounded in quite theoretical arguments (unlike governance interventions which are much more empirically grounded) that I think would benefit a lot from more grounded statistical work.