Mikolaj Kniejski's Quick takes

By Mikolaj Kniejski @ 2024-11-25T20:18 (+3)

Mikolaj Kniejski @ 2024-11-25T20:18 (+10)

I’m working on a project to estimate the cost-effectiveness of AIS orgs, something like Animal Charity Evaluators does. This involves gathering data on metrics such as:

Some organizations (e.g., MATS, AISC) share impact analyses, but there's no broad comparison across the field. AI safety orgs operate on diverse theories of change, making standardized evaluation tricky, but I think rough estimates could help with prioritization.

I’m looking for:

  1. Previous work
  2. Collaborators
  3. Feedback on the idea

If you have ideas for useful metrics or feedback on the approach, let me know!

Will Aldred @ 2024-11-25T21:23 (+5)

For previous work, I point you to @NunoSempere’s ‘Shallow evaluations of longtermist organizations,’ if you haven’t seen it already. (While Nuño didn’t focus on AI safety orgs specifically, I thought the post was excellent, and I imagine that the evaluation methods/approaches used can be learned from and applied to AI safety orgs.)

Mikolaj Kniejski @ 2024-11-25T23:30 (+4)

Thanks! I saw that post. It's an excellent approach. I'm planning to do something similar, but less time-consuming and more limited in scope. The range of theories of change pursued in AIS is limited and can be broken down into:

  • Evals
  • Field-building
  • Governance
  • Research

Evals can be measured by the quality and number of evals and their relevance to x-risks. It seems pretty straightforward to differentiate a bad eval org from a good one: a good one engages with major labs, produces a substantial number of evals, and targets failure modes relevant to existential risk.

Field-building can be measured by the number of participants and how many of them go on to do valuable work after the program.

Research: I argue that the number of citations is a good proxy for the impact of a paper. It's easy to measure and reflects how much engagement a paper received; in the absence of any deliberate work to bring the paper to the attention of key decision-makers, it tracks that engagement closely.
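
As a toy illustration of what a citation-based proxy could look like (all org names and citation counts below are invented, not real data):

```python
# Made-up example: aggregate citation counts per org as a rough proxy
# for research impact. Real data would come from a citation database;
# the numbers here are purely illustrative.
from statistics import median

papers_by_org = {
    "Org A": [12, 40, 3, 150],   # citation counts of individual papers
    "Org B": [5, 7, 0, 2, 1],
}

for org, citations in papers_by_org.items():
    print(f"{org}: {len(citations)} papers, "
          f"{sum(citations)} total citations, median {median(citations)}")
```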

I'm not sure how to think about governance.

Take this with a grain of salt. 


EDIT: Also, I think that engaging the broader ML community with AI safety is extremely valuable, and citations tell us whether an organization is good at that. Another thing that would be worth reviewing is organizations' transparency: how they estimate their own impact, and so on. This space is really unexplored, which seems crazy to me. The amount of money that goes into AI safety is gigantic, and it would be worth exploring what happens with it.

Mikolaj Kniejski @ 2024-12-13T23:12 (+4)

Meta: I'm requesting feedback and gauging interest. I'm not a grantmaker.

You can use prediction markets to improve grantmaking. The assumption is that having accurate predictions about project outcomes benefits the grantmaking process.

Here’s how I imagine the protocol could work:

  1. Someone proposes an idea for a project.
  2. They apply for a grant and make specific, measurable predictions about the outcomes they aim to achieve.

Examples of grant proposals and predictions (taken from here):

A prediction market is created based on these proposed outcomes, conditional on the project receiving funding. Some of the potential grant money is staked to give people an incentive to trade.
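
A minimal sketch of how this protocol could be represented; all class names, fields, and numbers are illustrative assumptions rather than a concrete design:

```python
# Hypothetical sketch: a grant proposal carries measurable predictions,
# and each prediction gets a market that only resolves if the project
# is funded. Everything here is illustrative.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Prediction:
    claim: str                  # measurable outcome, e.g. "50 participants finish"
    deadline: str               # resolution date
    market_price: float = 0.5   # probability currently implied by the market

@dataclass
class GrantProposal:
    title: str
    requested_amount: float
    predictions: list[Prediction] = field(default_factory=list)
    funded: Optional[bool] = None   # markets resolve N/A if never funded

proposal = GrantProposal(
    title="Hypothetical AI safety field-building program",
    requested_amount=100_000,
    predictions=[Prediction("50 participants complete the program", "2026-06-30")],
)

# A small fraction of the potential grant is staked as a trading subsidy
# so each conditional market attracts traders.
stake_per_market = 0.02 * proposal.requested_amount / max(len(proposal.predictions), 1)
```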

An obvious criticism is that:

Ozzie Gooen @ 2024-12-14T00:01 (+4)

I'm also a broad fan of this sort of direction, but have come to prefer some alternatives. Some points:
1. I believe some of this is being done at OP. Some grantmakers make specific predictions, and some of those might be later evaluated. I think that these are mostly private. My impression is that people at OP believe that they have critical information that can't be made public, and I also assume it might be awkward to make any of this public.
2. Personally, I'd flag that making and resolving custom questions for each specific grant can be a lot of work. In comparison, it can be great when you can have general-purpose questions, like, "how much will this organization grow over time" or "based on a public ranking of the value of each org, where will this org be?"
3. While OP doesn't seem to make public prediction market questions on specific grants, they do sponsor Metaculus questions and similar on key strategic questions. For example, there are tournaments on AI risk, bio, etc. I'm overall a fan of this.

4. In the future, AI forecasters could do interesting things. OP could take the best ones and have them make private forecasts of many elements of any program.

Mikolaj Kniejski @ 2024-12-14T00:44 (+1)

Re 2: I agree that this is a lot of work, but it's small compared to how much money goes into grants. Some of the predictions are also quite straightforward to resolve.

Well, glad to hear that they are using it. 

I believe that an alternative could be funding a general direction, e.g., funding everything in AIS, but I don't think these approaches are mutually exclusive.