Join the AI Evaluation Tasks Bounty Hackathon

By Esben Kran, ryanbloom @ 2024-03-18T08:15 (+20)

How do we test when autonomous AI might become a catastrophic risk? One approach is to assess the capabilities of current AI systems in performing tasks relevant to self-replication and R&D. METR (formerly ARC Evals), a research group focused on this question, has:

Now, you have the chance to directly contribute to this important AI safety research. We invite you to join the Code Red Hackathon, an event hosted by Apart in collaboration with METR, where you can earn money, connect with experts, and help create tasks to evaluate frontier AI systems. Sign up here for the event this weekend on March 22-24!

A short introduction to testing AI

The risks associated with misuse of capable, autonomous AI are significant. By creating "tasks"[1] for frontier models we can test some of the capabilities relevant to autonomous self-replication and R&D. Example tasks might include:

As you can see, if an AI possesses these abilities, things might get complicated.

The Task Standard provides a plug-and-play early warning system for these abilities and follows a standardized format. A task family (a set of tasks) consists of:

  1. A Python file called $TASK_FAMILY_NAME.py;
  2. Any number of other Python files, shell scripts, etc. that $TASK_FAMILY_NAME.py imports; and
  3. Other files, called "assets", that will be used to set up the task environment.

When creating a task, it's crucial to ensure that the task is error-free, understandable for the agent, and not easily gameable. You can follow these steps, some awarded by METR, to create a great task:

  1. Write up your ideas for which tasks related to autonomous capabilities you wish to test the language model on
    • A $20 prize will be awarded for high-quality ideas
  2. Create a specification for the task that includes the prompt, a text description of what the test-taker has access to, and a self-evaluation of the task
    • A $200 prize will be awarded for high-quality specifications (2-6 hours of work)
  3. Create the materials for the task (instructions, libraries, and tool access) and have a human run through the whole task with these exact materials and tools
  4. Implement the task in the task standard format, test it with a simple agent, and submit it!
    • The prize for high-quality implementations is 3x a human professional's salary for the task + bonus, e.g. a task that would take a human software engineer 10 hours could net you up to $4500 (6-12 hours of work in addition to quality assurance)

Each of these steps can be found detailed in the associated resources for the hackathon found on the hackathon website.

Joining the hackathon: Your chance to contribute

You might find creating an AI evaluation task daunting, but the Code Red Hackathon provides the perfect opportunity to dive in, with support from experts, clear guidelines, and the chance to earn significant money for your work. By joining us on March 22-24, you can:

The Code Red Hackathon is a unique opportunity to contribute to critical AI safety research, connect with like-minded individuals, and potentially shape AI development. We encourage anyone passionate about AI safety to join us on March 22-24 and be part of this groundbreaking effort. Sign up now and let's work together to ensure a safer future for AI.

In addition to the Code Red Hackathon, Apart runs the Apart Lab fellowship, publishes original research, and hosts other research sprints. These initiatives aim to incubate research teams with an optimistic and action-focused approach to AI safety.

Extra tips for participants

The hackathon is designed to let people at all levels of technical experience meaningfully contribute to AI safety research. Keep these suggestions in mind to make the most of your experience:

Remember, the hackathon is a collaborative effort – don't hesitate to reach out to other participants and the organizing team for feedback and support throughout the weekend. We're all here to help each other!

  1. ^

     A task in this context is a piece of code and supporting resources that makes an agent able to run a task (such as extracting a password from a compiled program with varying levels of obfuscation) and be evaluated for its performance on said task. Read more.