Question Writer, Math Benchmark (Contractor)

Sep 30

Epoch AI

📍Remote (Global) 🕔 Full or Part Time
💰$2K/month 🔄 Rolling Applications

We are looking for up to three question authors to contribute original and difficult mathematics problems to a novel mathematical problem-solving benchmark for AI systems.

The primary day-to-day activity of this contractor position is writing original math questions. It’s crucial for our purposes that the questions and their answers do not appear on the Internet. Question authors will also be expected to review the questions proposed by other question authors to ensure that the desired criteria of originality, difficulty and correctness are satisfied. Experimenting with existing AI systems to get a sense of their capabilities is also encouraged.

Successful candidates will report to Ege Erdil, senior researcher at Epoch AI. This role is fully remote, and we are able to contract in many countries.

We anticipate having enough questions to complete the benchmark 4 to 6 months after the start of the project, so this contractor role is intended to be temporary. We will accept applications on a rolling basis and close the search when the positions are filled.

Key Responsibilities

Writing questions that will be included in the benchmark: this will be the core responsibility of all question authors.
Reviewing and validating questions written by other question authors before these questions can be included in the benchmark. This is essential to ensure a low rate of error in question statements and solutions.
Collaborating with other question authors when necessary, for example to ensure similar questions are not submitted by multiple authors or to generate better questions by working as a team.
Ensuring that the questions and their answers are not posted anywhere they would have a substantial chance of entering the training data of future AI models. Our guidelines require avoiding any platform that could store the questions or their answers on third-party servers, such as Slack, Discord, or Google Drive. Question authors will be expected to comply with these security guidelines.

What We Are Looking For

Requirements

At least a Ph.D.-level background in mathematics. A candidate does not need to hold a Ph.D. in mathematics, but candidates without one are expected to provide independent proof of an equivalent level of background alongside their application.
Experience solving difficult math problems that require programming. For example, experience solving recent Project Euler problems would suffice.
A basic level of proficiency with the Python programming language, as solutions for problems requiring programming will be written in Python.

Nice to Have

An impressive track record in renowned mathematics competitions. Examples include being an IMO gold medalist or a Putnam Fellow.
Past experience with writing questions for challenging math competitions such as the IMO, the Putnam, or the Miklós Schweitzer Competition.

What We Offer

Payment

Payment for this position consists of a flat monthly payment of $2,000 USD for full-time contributors, plus a performance-based bonus determined by the number and difficulty of the questions delivered each month.
The difficulty of the questions will be assessed using standardized criteria shared before the contract is signed. For the performance-based bonus, we offer rates starting from $300 USD per question for the easiest questions meeting our criteria, up to $1,000 USD per question for the most difficult questions. Over the project’s duration, we expect each full-time question author to contribute a total of 30 to 100 questions to the benchmark.
Based on our observations during this project’s pilot, we anticipate that a question author working full-time should be able to earn approximately $10,000 USD per month, though this is only an average estimate.
Payments aren't restricted to USD and can be made in different currencies depending on the location of contributors.
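
To illustrate how the figures above combine, here is a minimal Python sketch of one month's payment. It assumes the bonus is a simple sum of per-question rates. The function name, the $700 intermediate rate, and the mix of twelve questions are hypothetical and chosen only for illustration; the actual difficulty criteria and rate tiers are shared with contractors separately.

```python
# Figures taken from the posting.
BASE_MONTHLY_USD = 2_000   # flat monthly payment for full-time contributors
MIN_BONUS_USD = 300        # bonus for the easiest qualifying question
MAX_BONUS_USD = 1_000      # bonus for the most difficult question


def monthly_payment(question_bonuses_usd):
    """Return the flat payment plus the sum of per-question bonuses.

    Assumption: bonuses are simply added up; the posting does not
    describe any other adjustment.
    """
    for bonus in question_bonuses_usd:
        if not MIN_BONUS_USD <= bonus <= MAX_BONUS_USD:
            raise ValueError(f"bonus {bonus} is outside the stated range")
    return BASE_MONTHLY_USD + sum(question_bonuses_usd)


# Hypothetical month: four easy, four mid-range, and four hard questions.
# The $700 mid-range rate is an assumed value, not one from the posting.
example_month = [300] * 4 + [700] * 4 + [1_000] * 4
print(monthly_payment(example_month))  # 2000 + 8000 = 10000
```

Under these assumed numbers, twelve accepted questions in a month reach the $10,000 estimate quoted above; a different mix of difficulties would change the total.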

Epoch AI

Epoch AI is a research institute that investigates trends in machine learning and the economic consequences of AI. Our work informs policy-making at key government institutes and governance at leading industry AI labs.
