Open Phil releases RFPs on LLM Benchmarks and Forecasting

By Lawrence Chan @ 2023-11-11T03:01 (+12)

This is a linkpost to https://www.openphilanthropy.org/research/rfps-on-llm-impacts/

As linked at the top of Ajeya's "do our RFPs accelerate LLM capabilities" post, Open Philanthropy (OP) recently released two requests for proposals (RFPs):

  1. An RFP on LLM agent benchmarks: how do we accurately measure the real-world, impactful capabilities of LLM agents?
  2. An RFP on forecasting the real-world impacts of LLMs: how can we understand and predict the broader real-world impacts of LLMs?

Note that the first RFP is significantly more detailed and narrower in scope than the second, and OP recommends applying to the LLM benchmark RFP if your project may be a fit for both.

Brief details for each RFP are below, but please read the RFPs yourself if you plan to apply.

Benchmarking LLM agents on consequential real-world tasks

Link to RFP: https://www.openphilanthropy.org/rfp-llm-benchmarks

We want to fund benchmarks that allow researchers starting from very different places to come to much greater agreement about whether extreme capabilities and risks are plausible in the near-term. If LLM agents score highly on these benchmarks, a skeptical expert should hopefully become much more open to the possibility that they could soon automate large swathes of important professions and/or pose catastrophic risks. And conversely, if they score poorly, an expert who is highly concerned about imminent catastrophic risk should hopefully reduce their level of concern for the time being.

In particular, they're looking for benchmarks with the following three desiderata:

OP will also host a virtual Q&A session for this RFP:

We will also be hosting a 90-minute webinar to answer questions about this RFP on Wednesday, November 29 at 10 AM Pacific / 1 PM Eastern (link to come).

Studying and forecasting the real-world impacts of systems built from LLMs

Link to RFP: https://www.openphilanthropy.org/rfp-llm-impacts/

This RFP is significantly less detailed, and primarily consists of a list of projects that OP may be willing to fund:

To this end, in addition to our request for proposals to create benchmarks for LLM agents, we are also seeking proposals for a wide variety of research projects which might shed light on what real-world impacts LLM systems could have over the next few years.

Here's the full list of projects they think could make a strong proposal:

There's no Q&A session for this RFP.