Personal Review of Participation in the Open Philanthropy Cause Exploration Prize

By Gavriel Kleinwaks @ 2022-11-02T06:05 (+30)

This is a linkpost to https://strainhardening.substack.com/p/personal-review-of-participation

Summary: The Open Phil Cause Exploration Prize was extremely top-heavy, which means that it might be a better use of entrants’ time to work less hard and produce relatively lower-quality entries. Similar competitions should consider using less top-heavy structures. (Crossposted)

My colleagues and I recently submitted an entry to Open Philanthropy’s Cause Exploration Prize and received an honorable mention, which came with a prize of $500. Open Philanthropy also awarded $200 to each good-faith entry, $25,000 to the first-place entry, and $15,000 to each of the three second-place entries.

Our entry was the culmination of a few months of work, started for a different project, so the OP contest was not particularly out of our way. However, in order to do extra research and write and polish the draft, another colleague and I dedicated essentially all of our work hours to the CEP draft alone, and the third member of our team dedicated another significant percentage of his time. Counting only the two of us who worked on it full-time, I would estimate that we each spent about two full work weeks on it. Assuming a 50-60 hour workweek (call it 55), this means that we dedicated about 220 hours (2 people × 2 weeks × 55 hours) to this project alone^[1], not even counting our third author’s efforts or the research we had done for the project prior to deciding to work on the CEP. This is, of course, terrible value for money if the prize is $500, especially since the honorable mentions are relatively close in quality to the other entries. (The first-place award is 50 times the honorable mention award, but the first-place report isn’t 50 times as good as any random honorable mention I read. I’d be hard-pressed to compare the quality other than nebulously, but intuitively I’d call the first-place entry perhaps 2-3 times as good as a random honorable mention entry, depending on what the judges’ criteria were.)

That’s not to say I wasn’t excited to receive the award! I felt great about the recognition, but realistically, the main reward was that people liked the report and found it valuable. It definitely wouldn’t have been worth it to enter the contest if I thought writing the report would distract from my work.

The balance of considerations becomes more extreme when you consider the award for a good-faith entry. I was proud of the report we turned in: because the deadline was extended by a week, we were able to do extra research, and the report was far more thorough than it would have been otherwise. On the other hand, considered from a purely financial perspective, this was a terrible choice. Assuming we didn’t have the extra week, and could only spend 110 hours on the draft, we still would have received the good-faith entry prize. But we also would have received the good-faith entry prize if we had done zero extra work and exclusively committed to writing up our existing research at the time, without adding any extra research. Let’s say that would have taken ~20 hours (and, to be clear, would have been a wildly low-quality report). We would have spent ~10% of the time to receive 40% of the prize. If it weren’t for the fact that this was already part of my job, this would actually be better time use!
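To make that comparison concrete, here is a minimal Python sketch of the arithmetic above. The scenario labels and the `Scenario` helper are illustrative names, the hours and dollar amounts are the rough estimates from this post, and the prize is treated as a single team award rather than split per author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    label: str    # illustrative name, not taken from the contest rules
    hours: float  # estimated hours across the two full-time authors
    prize: float  # prize in USD for that outcome


# Rough figures from the post: 2 authors x 2 weeks x 55 hours = 220 hours.
ACTUAL = Scenario("actual entry (honorable mention)", 2 * 2 * 55, 500)
SCENARIOS = [
    ACTUAL,
    Scenario("no deadline extension (good-faith prize)", 2 * 1 * 55, 200),
    Scenario("write-up of existing research only", 20, 200),
]


def summarize(s: Scenario, baseline: Scenario = ACTUAL) -> dict:
    """Return dollars per hour, plus time and prize as shares of the baseline."""
    return {
        "label": s.label,
        "usd_per_hour": s.prize / s.hours,
        "time_share": s.hours / baseline.hours,
        "prize_share": s.prize / baseline.prize,
    }


if __name__ == "__main__":
    for s in SCENARIOS:
        row = summarize(s)
        print(
            f"{row['label']}: ${row['usd_per_hour']:.2f}/h, "
            f"{row['time_share']:.0%} of the time, "
            f"{row['prize_share']:.0%} of the prize"
        )
```

Under these assumptions, the minimal write-up works out to $10 per hour, against roughly $2.27 per hour for the entry we actually submitted, for about 9% of the time (the ~10% above) and 40% of the prize.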

So, okay, given that this was part of my job, was there really any issue? Was there a better way to spend our work hours, or did this stage of the project justify at least two full work weeks across two authors? I think yes, it was justified. We’re in the process of revising the report significantly to more closely fit our original project goal, but it was incredibly helpful to us to have a forcing deadline; we had yet to give ourselves a deadline to produce a polished draft of anything. We’re reusing a lot of the writing; I think that forgoing the OP contest would probably have saved about 20% of the time spent, or ~44 hours, not quite a full work week for one person, and not a bad time outlay given the benefits. We’ve since sent the report around to some collaborators to provide background on our project. Also, on a personal level, I figure it’s generally good for me to produce public writing; maybe the other authors would say the same. The work tradeoff mostly came in the form of anxious false starts, time spent just on polishing writing, and letting other work fall by the wayside; aside from ditching another project to work on this one, I was fairly burned out in the week following the sprint to submit (although that’s a risk for any project).^[2]

A prize competition structured like this one is weighted very heavily toward the ends—there’s a strong reason to put in tons of effort if you think you have a shot at winning first or second place; there’s also a decent reason to submit any good-faith entry regardless of quality.^[3] But given that most people don’t get first or second place, you should probably assume you won’t, and this might mean that you shouldn’t put much effort into your submission unless writing it is actually just your day job. Given that structure, probably lots of people enter reports related to their existing projects, which is good, but also maybe the competition organizers will have to spend a lot of time reading much lower-quality reports than might otherwise be produced. I can also say that outside of work, I would be generally more likely to enter a competition with a smoother prize gradient.

It would be a simple fix to restructure the prizes to be less top-heavy. For example: twenty honorable mentions were awarded, for a total of $10,000. If each were $1,000 instead of $500, they’d be five times the size of the good-faith entry awards. To avoid adding the $10,000 to the grantor’s cost, the four top-tier prizes could each be reduced by $2,500, leaving even the second-place prizes at over 12 times the honorable mentions. I expect that in this and similar competitions, many of the honorable mentions were close to the winners in quality and usefulness, so that type of restructuring seems reasonable.
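As a quick check that this restructuring is budget-neutral, here is a small Python sketch. The tier names and the `restructure` helper are hypothetical; the counts and amounts are the ones given in this post. The good-faith tier is left out of the totals because the number of good-faith awards isn't stated here and the proposal doesn't change it.

```python
# Prize tiers as (number of awards, amount in USD), using figures from the post.
CURRENT_TIERS = {
    "first": (1, 25_000),
    "second": (3, 15_000),
    "honorable": (20, 500),
}
GOOD_FAITH = 200  # per-entry award, unchanged by the proposal


def total(tiers: dict) -> int:
    """Total payout across the listed tiers."""
    return sum(count * amount for count, amount in tiers.values())


def restructure(tiers: dict, new_honorable: int = 1_000) -> dict:
    """Raise honorable mentions and cut every top-tier prize equally to pay for it."""
    n_hm, old_hm = tiers["honorable"]
    extra_cost = n_hm * (new_honorable - old_hm)
    n_top = tiers["first"][0] + tiers["second"][0]
    cut, remainder = divmod(extra_cost, n_top)
    if remainder:
        raise ValueError("extra cost does not split evenly across top prizes")
    return {
        "first": (tiers["first"][0], tiers["first"][1] - cut),
        "second": (tiers["second"][0], tiers["second"][1] - cut),
        "honorable": (n_hm, new_honorable),
    }


if __name__ == "__main__":
    proposed = restructure(CURRENT_TIERS)
    print("current total: ", total(CURRENT_TIERS))
    print("proposed total:", total(proposed))
    for name, (count, amount) in proposed.items():
        ratio = amount / proposed["honorable"][1]
        print(f"{name}: {count} x ${amount:,} ({ratio:g}x an honorable mention)")
    print("honorable / good-faith:", proposed["honorable"][1] / GOOD_FAITH)
```

With these numbers the total across the three tiers stays at $80,000, the first-place prize becomes $22,500 and each second-place prize $12,500, and the ratio of first place to an honorable mention falls from 50 to 22.5.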

I wanted the main thrust of this post to be the specifics of the OP Cause Exploration Prize, but it’d be incomplete if I didn’t mention that my thinking about this specific prize has made me more concretely skeptical of essay/report prizes in general as great uses of time for either the judges or participants. Consider that if I’d spent 20 hours writing up our existing research into a report that would net the good-faith entry award, I’d receive $200, or $10/hr. It’s not totally fair to compare prize money to normal wages; prize organizers definitely get more participation than they could afford at normal research contractor rates. However, prize structures potentially discourage people with high opportunity costs from participating, and it seems easy for judges to get swamped with low-quality entries. I said above that I’d be more likely to enter a competition with a smoother prize gradient—this is true, but actually I’m pretty unlikely to use free time to enter a prize competition at all. This situation makes me skeptical of prizes as incentives for problem-solving, rather than as rewards for something already achieved for other reasons (like Nobels).

That said—whether a specific essay competition is a good use of time is, of course, up to the organizers to decide. I can easily imagine the organizers having priorities that favor the top-heavy structure because they only care that they receive a few great entries, and care less about the quality of everything below the top tier. And anyway, I’m all for organizations experimenting with new ways to get input and information.

Many thanks to Austin Chen for the idea about how to restructure prizes and also for listening to my description of the problem in the first place and saying, “Hey it sounds like you should write an EA Forum post about that.” Also, seriously, thanks to the Open Phil judges for awarding us an honorable mention—I was legitimately delighted.

^[1]
Those 220 hours don’t literally mean that all of those hours were spent on the report—my estimate of a 55-hour workweek includes various workday distractions. However, I feel very comfortable labeling it as 220 hours dedicated to the report, since all productive hours were spent on the report and unproductive/distracted time wouldn’t have been used for any other work anyway.
^[2]
One possible outside-of-work tradeoff is that prize money would make it a slightly bigger headache than normal to do my taxes, but my colleague and I agreed to donate the prize through OP’s prize claim portal. Since I was pretty sure we could agree on a charity we’d both be happy to donate to anyway, I specifically suggested the donation to avoid the 1099. Incentives!
^[3]
Open Phil did provide requirements for considering something to be a “good-faith” entry, the most onerous of which were that the entries needed to engage with relevant academic literature and should be at least 3,000 words. That’s certainly not nothing, but it can be done pretty quickly if you have research experience, and yet will not necessarily result in seriously informative work.

ChrisSmith @ 2022-11-02T09:16 (+14)

[Context - I managed the Cause Exploration Prizes]

Thank you Gavriel for taking the time to write this out and thank you again for your original submission on ways that philanthropic funders can help address indoor air quality, which I encourage others to check out. I'm really sorry to hear you felt burnt out after completing the entry.

Although essay prizes and contests are quite prevalent in the EA community, this was very much an experiment for the global health and wellbeing cause prioritization team at Open Phil. A major objective of the lower-value prizes and participation awards was to enable people to feel able to submit ideas that were good / on their mind, but not fully polished. Personally I think we were reasonably successful with that - but if we run anything similar again, we'll definitely consider a smoother gradient and tweaking other parameters.

Ben Stewart @ 2022-11-02T09:28 (+13)

Thanks for this - it seems honest and useful. I also enjoyed your entry! Some additional data from my own case, which may be helpful for analysing this (for context I won first prize):

I also probably spent around 2-3 full-time weeks total (but hard to judge as I did lots of little bits and then a big block of work - could be more, but not substantially less)
I entered the prize because I thought after a few hours reading/thinking that I had a decent (10-20%) chance of one of the top prizes. I probably would have entered the contest with approximately the same effort if both top and runner-up prizes were $5K less (so $20K and $10K). Less than that and I probably would have still entered but not tried as hard. A big motivator was that there were 3 generous runner-up prizes - so I thought my odds of winning one of the major prizes were at least worth it. I have fairly good writing and generalist research skills and some experience in the EA context, but was new to my cause area (aside from a useful academic background in medicine/neuroscience + being familiar with the case for lead as a developmental neurotoxicant, which is the key precedent). My higher than base-rate hope for a major prize was probably due to a combination of believing in my idea, thinking that not that many others (<80) would enter, thinking that I was probably at least a little above average compared to the imagined average of entries, and some over-confidence.
After finishing I worried I had spent too much effort/time on it for it to be worth it, given the high likelihood of not winning a major prize.
I didn't have significant opportunity cost to prepare my entry. I was meant to be studying for my final exams, but they were still >6 weeks away and I had already decided to just aim to confidently pass, rather than attempt to do very well on them. So I had considerable freedom/time to work on the project. I also had high upside to participating in the contest - I didn't have significant achievements or reputation in the community, nor a full-time role.
It's hard to judge my own work objectively and I can't confidently assign quality to the other good entries I read (I don't know OP's desiderata well enough nor the other entries' fields well enough) - plus there's plenty I didn't read beyond a skim. Having said that, I agree my entry is not close to 50X higher quality than a majority of the other entries (there's a few low-effort entries where 50X better is maybe not crazy, but I'm uncertain on this).
I haven't thought enough about prize structure to have a strong view. I'd imagine a top-heavy structure probably leads to a more heavy-tailed distribution - i.e. more lowish quality efforts, a few higher quality ones. Whether this distribution is good would depend on the purpose - for finding new cause areas I can see the motivation for a heavy-tailed distribution. For finding good arguments/criticisms I would probably want a less extreme distribution (if indeed that arises from less steep prize structures). A motivation there would be wanting a cluster approach to a problem.

Lizka @ 2022-11-02T13:34 (+8)

Thanks for writing this post! It seems really valuable. I also think it relates in interesting ways to a different recent post: How effective are prizes at spurring innovation?

One thing that stood out to me (emphasis mine):

"Hofstetter et al. (2017) ran an experiment by inviting a cohort of innovators to participate in two successive contests and randomly varied the incentive structure. Half of the participants were allocated into winner-take-all contests, and the other half into contests with a multiple prize structure in which the top 20 innovators would receive a prize (the total prize being identical across groups). The winner-take-all contests yielded significantly better ideas compared to multiple prizes in the first round. However, this result flipped when the innovators were asked to participate again in the second contest. While 50% of those in the multiple-prize contest group chose to participate again, only 37% did so from the winner-take-all group. Moreover, innovators who had received no reward in the first contest showed significantly lower effort in the second contest and generated fewer ideas. In the second contest, the multiple prizes contest generated better ideas than the winner-take-all contest. Confirming these findings, the authors found similar effects in an empirical investigation of over 260 contests and 6,000 innovators from the open innovation platform Atizo.com. Most importantly, these data show that innovator churn could be reduced by the addition of more (albeit smaller average) rewards."