Should 80,000 Hours be more transparent about how they rank problems and careers?

By Vasco Grilo @ 2023-12-20T08:14 (+89)

Question

I wonder whether 80,000 Hours should be more transparent about how they rank problems and careers. I think so:

I understand the rankings are informed by 80,000 Hours' research process and principles, but I would also like to have a mechanistic understanding of how the rankings are produced. For example, do the rankings result from aggregating the personal ratings of some people working at and advising 80,000 Hours? If so, who, and how much weight does each person have? Might this type of information be an infohazard? If yes, why?

In any case, I am glad 80,000 Hours does have rankings. The current ones are presented as follows:

Side note

80,000 Hours is great! It was my entry point to effective altruism in early 2019 via the slide below, where following its advice was presented as the opposite of doing frivolous research.

The slide was presented in one of the last classes of the course Theory and Methodology of Science (Natural and Technological Science), which I did during my Erasmus studies at KTH. I did not check 80,000 Hours' website after class. However, a few months later I came across the slide again studying for the exam. Maybe because I was a little bored, I decided to search for 80,000 Hours that time. I remember I found the ideas so interesting that I thought to myself I had better look into them with more peace of mind later, in order not to get distracted from the exams.


Ardenlk @ 2023-12-20T14:31 (+41)

Hey Vasco —

Thanks for your interest and also for raising this with us before you posted so I could post this response quickly!

I think you are asking about the first of these, but I'm going to include a few notes on the 2nd and 3rd as well just in case, as there's a way of hearing your question as being about them.

  1. What is the internal process by which these rankings are produced and where do you describe it? 
  2. What are problems and paths being ranked by? What does the ranking mean?
  3. Where is our reasoning for why we rank each problem or path the way we do? 

We've written some about these things on our site. We’re on the lookout for ways to improve our processes and how we communicate about them (e.g. I updated our research principles and process page this year and would be happy to add more info if it seemed important. If some of the additional notes below seem like they should be included that'd be helpful to hear.) 

Here's a summary of what we say now with some additional notes:

On (1):

Our "Research principles and process" page is the best place to look for an overview, but it doesn't describe everything. 

I'll quote a few relevant bits here:

> Though most of our articles have a primary author, they are always reviewed by other members of the team before publication.

> For major research, we send drafts to several external researchers and people with experience in the area for feedback.

> We seek to proactively gather feedback on our most central positions — in particular, our views on the most pressing global problems and the career paths that have the highest potential for impact, via regularly surveying domain experts and generalist advisors who share our values.

> For some important questions, we assign a point person to gather input from inside and outside 80,000 Hours and determine our institutional position. For example, we do this with our list of the world’s most pressing problems, our page on the top most promising career paths, and some controversial topics, like whether to work at an AI lab. Ultimately, there is no formula for how to combine this input, so we make judgement calls [...] Final editorial calls on what goes on the website lie with our website director. [me, Arden]

> Finally, many of our articles are authored by outside experts. We still always review the articles ourselves to try to spot errors and ensure we buy the arguments being made by the author, but we defer to the author on the research (though we may update the article substantively later to keep it current).

Here are some additional details that aren't on the page:

To reply to your specific question about aggregating people's personal rankings: no, we don't do any formal sort of 'voting' system like that. The problems and paths rankings are informed by the views of the staff at 80,000 Hours and external advisors via surveys where I elicit people's personal rankings, and lots of ongoing internal discussion, but I am the "point person" for ultimately deciding how to combine this information into a ranking. In practice, this means my views can be expected to have an outsized influence, but I put a lot of emphasis on takes from others and aim for the lists to be something 80,000 Hours as an organisation can stand behind. Another big factor is what the lists were before, which I tend to view as a prior to update from, and which were informed by the research we did in the past and the views of people like Ben Todd, Howie Lempel, and Rob Wiblin.

Our process has evolved over the years, and, for example, the formal "point person" system described above is recent as of this year (though it was informally something a bit like that before). I expect it'll continue to change, and hopefully improve, especially as we grow the team (right now we have only 2 research staff).

Sometimes it's been a while since we've looked at a problem or path, and we decide to re-do the article on it. That might trigger a change in ranking if we discover something that changes our minds.

More often we adjust the rankings over time without necessarily first re-doing the articles, often in response to surveys of advisors and team members, feedback we get, or events in the world. This might then trigger looking more into something and adding or re-doing a relevant article. 

The rankings are not nearly as formal or quantitative as, e.g. the cost-effectiveness analyses that GiveWell performs of its top charities. Though previous versions of the site have included numerical weightings to something like the problem profiles list, we’ve moved away from that practice. We didn’t think the BOTECs and estimations that generated these kinds of numbers were actually driving our views, and the numbers they produced seemed to suggest a misleading sense of precision. Ranking problems and career paths is messy and we aren't able to be precise. We discuss our level of certainty in e.g. the problem profiles FAQ and at the end of the research principles page and try to reflect it in the language on the problems and career path pages.

As you noted, when we make a big change, like adding a new career path to the priority paths, we try to announce it in some prominent form, though we don't always end up thinking it's worth it. E.g. we sent a newsletter in April explaining why we now consider infosec to be a priority path. We made a similar announcement when we added AI hardware expertise to the priority paths. Our process for this isn't very systematic.

On (2): 

For problems: In EA shorthand, the ranking is via the ITN framework. We try to describe that in a more accessible / short way at the top of the page in the passage you quoted.

We also have an FAQ which talks a bit more about it.

For career paths it is slightly more complicated. A factor we weren't able to fit into the passage you quoted is: we also down-rank paths if they are super narrow/most people can't follow them (or don't write about them at all) – e.g. becoming a public intellectual (or to take an extreme example, becoming president of the US.)

On (3):

For the most part, we want the articles themselves to explain our reasoning – in each problem profile or career review, we say why we think it's as pressing / promising as we think it is. 

We also draw on surveys of 80k staff + external advisors to additionally help determine and adjust the ranking over time, as described above. We don't publish these surveys, but we describe the general type of person we tend to ask for input here.


Best,
Arden

Guy Raveh @ 2023-12-22T23:19 (+6)

Hi Arden, thanks for engaging like this on the forum!

Re: "the general type of person we tend to ask for input" - how do you treat the tradeoff between your advisors holding the values of longtermist effective altruism, and them being domain experts in the areas you recommend? (Of course, some people are both - but there are many insightful experts outside EA).

Ardenlk @ 2023-12-24T12:37 (+10)

This is a good question -- we don't have a formal approach here, and I personally think that in general, it's quite a hard problem who to ask for advice.

A few things to say:

  • the ideal is often to have both.

  • the bottleneck on getting more people with domain expertise is more often us not having people in our network with sufficient expertise, that we know about and believe are highly credible, and who are willing to give us their time, rather than their values. People who share our values tend to be more excited to work with us.

  • it depends a lot on the subject matter we are asking about. e.g. if it's an article about how to become a great software engineer, we don't care so much about the person's values; we care about their software engineering credentials. If it's e.g. an article about how to balance doing good and doing what you love, we care a lot more about their values.

Vasco Grilo @ 2023-12-23T06:47 (+9)

I like that question, Guy. Note 80,000 Hours lists their external advisors on their website. The list only has 6 people (Dr Greg Lewis, Dr Rohin Shah, Dr Toby Ord, Prof. Hilary Greaves, Peter Hartree and Alex Lawsen), and all are quite connected to effective altruism and longtermism. Arden, are these all the external advisors you were referring to in your comment?

Ardenlk @ 2023-12-24T12:38 (+5)

No, we have lots of external advisors that aren't listed on our site. There are a few reasons we might not list people, including:

  • We might not want to be committed to asking for someone's advice for a long time or need to remove them at some point.

  • The person might be happy to help us and give input but not want to be featured on our site.

  • It's work to add people, and we often will reach out to someone in our network fairly quickly and informally, and it would feel like overkill / too much friction to get a bio for them, and their permission to put it on our site, just because we asked them a few questions.

  • Also, there are too many people we get takes from over the course of e.g. a few years to list in a way that would give context and not require substantial person-hours of upkeep. So instead we just list some representative advisors who give us input on key subject matters we work on and where they have notable expertise.

Ulrik Horn @ 2023-12-21T10:48 (+3)

or to take an extreme example, becoming president of the US
 

What is your thinking for not including this? I am asking as there might be people (you know better than me!) that might think it worthwhile to pursue this career even if it has only a 0.01% chance of success for them. I am asking as there is existing EA advice about being ambitious, but is there advice that I have not seen about not being too ambitious? I feel like many people might "qualify" for becoming a president even if the chance of "making it" is low, so in one way it is perhaps not that narrow (even if there is only one 1st place). And on the way to this goal, people are likely to be managing large pots of money and/or making impactful policy more likely to happen.

Ardenlk @ 2023-12-24T12:38 (+7)

I agree that it might be worthwhile to try to become the president of the US - but that wouldn't mean it's best for us to have an article on it, especially highly ranked. That takes real estate on our site, attention from readers, and time. This specific path is a sub-category of political careers, which we have several articles on. In the end, it is not possible for us to have profiles on every path that is potentially worthwhile for someone. My take is that it's better for us to prioritise options where the described endpoint is achievable for at least a healthy handful of readers.

Vasco Grilo @ 2023-12-20T18:44 (+3)

Thanks for the comprehensive reply, Arden!

Thanks for your interest and also for raising this with us before you posted so I could post this response quickly!

Thanks for sharing the 1st version of your answer too, which prompted me to add a little more detail about what I was asking in the post.

If some of the additional notes below seem like they should be included that'd be helpful to hear.

I think it would be valuable to include all the additional notes which are not on your website. As a minimum viable product, you may want to link to your comment.

To reply to your specific question about aggregating people's personal rankings: no, we don't do any formal sort of 'voting' system like that. The problems and paths rankings are informed by the views of the staff at 80,000 Hours and external advisors via surveys where I elicit people's personal rankings, and lots of ongoing internal discussion, but I am the "point person" for ultimately deciding how to combine this information into a ranking. In practice, this means my views can be expected to have an outsized influence, but I put a lot of emphasis on takes from others and aim for the lists to be something 80,000 Hours as an organisation can stand behind.

Thanks for sharing! The approach you are following seems to be analogous to what happens in the broader society, where there is often one single person responsible for informally aggregating various views. Using a formal aggregation method is the norm in forecasting circles. However, there are often many forecasts to be aggregated, so informal aggregation would hardly be feasible for most cases. On the other hand, Samotsvety, "a group of forecasters with a great track record", also uses formal aggregation methods. I am not aware of research comparing informal to formal aggregation of a few forecasts, so there might not be a strong case either way. In any case, I encourage you to try formal aggregation to see if you arrive at meaningfully different results.
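To make "formal aggregation" concrete, here is a minimal sketch of one simple option, a weighted mean-rank (Borda-style) aggregation of personal rankings. This is only an illustration: it is not 80,000 Hours' process, nor necessarily what Samotsvety uses, and the function name `aggregate_rankings`, the rater names, the placeholder problems "A", "B", "C" and the weights are all hypothetical.

```python
def aggregate_rankings(rankings, weights=None):
    """Combine personal rankings into one list via weighted mean rank.

    rankings: dict mapping rater name -> list of items, best first.
    weights:  optional dict mapping rater name -> non-negative weight
              (defaults to equal weights for every rater).
    """
    raters = list(rankings)
    if weights is None:
        weights = {rater: 1.0 for rater in raters}
    items = set(rankings[raters[0]])
    if any(set(rankings[rater]) != items for rater in raters):
        raise ValueError("every rater must rank the same set of items")
    total_weight = sum(weights[rater] for rater in raters)
    # Rank positions are 1-based: the top item of each rater gets rank 1.
    mean_rank = {
        item: sum(
            weights[rater] * (rankings[rater].index(item) + 1)
            for rater in raters
        ) / total_weight
        for item in items
    }
    # Lower mean rank is better; ties are broken alphabetically.
    return sorted(items, key=lambda item: (mean_rank[item], item))


# Hypothetical survey: three raters ranking three placeholder problems.
survey = {
    "rater_1": ["A", "B", "C"],
    "rater_2": ["B", "A", "C"],
    "rater_3": ["A", "C", "B"],
}
equal_weights = aggregate_rankings(survey)
# Made-up weights giving rater_2 four times the say of the others.
point_person = aggregate_rankings(
    survey, {"rater_1": 1, "rater_2": 4, "rater_3": 1}
)
print(equal_weights)
print(point_person)
```

Comparing the output for different weights against the informally produced ranking would be one cheap way to check whether the two approaches meaningfully differ.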

Another big factor is what the lists were before, which I tend to view as a prior to update from, and which were informed by the research we did in the past and the views of people like Ben Todd, Howie Lempel, and Rob Wiblin.

Makes sense.

The rankings are not nearly as formal or quantitative as, e.g. the cost-effectiveness analyses that GiveWell performs of its top charities. Though previous versions of the site have included numerical weightings to something like the problem profiles list, we’ve moved away from that practice. We didn’t think the BOTECs and estimations that generated these kinds of numbers were actually driving our views, and the numbers they produced seemed like they suggested a misleading sense of precision.

Your previous quantitative framework was equivalent to a weighted-factor model (WFM) with the logarithms of importance, tractability and neglectedness as factors with the same weight, such that the sum respects the logarithm of the cost-effectiveness. Have you considered trying a WFM with the factors that actually drive your views?
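As a rough sketch of what I mean by the equivalence: with equal weights, the sum of the logarithms of the factors is the logarithm of their product. The function name `wfm_score` and all the numbers below are made up for illustration, and this is not 80,000 Hours' actual scoring code.

```python
import math


def wfm_score(factors, weights=None):
    """Weighted-factor model on log10 of each factor.

    factors: dict mapping factor name -> positive value.
    weights: optional dict mapping factor name -> weight
             (defaults to a weight of 1 for every factor).
    """
    if weights is None:
        weights = {name: 1.0 for name in factors}
    return sum(weights[name] * math.log10(value) for name, value in factors.items())


# Made-up values in arbitrary units, purely for illustration.
problem = {"importance": 1000.0, "tractability": 0.01, "neglectedness": 10.0}

equal_score = wfm_score(problem)
# With equal weights, 10**score recovers the product of the three factors,
# i.e. the score is the logarithm of a cost-effectiveness-like quantity.
product = problem["importance"] * problem["tractability"] * problem["neglectedness"]
print(equal_score, math.log10(product))

# Hypothetical unequal weights, e.g. if importance drove views twice as much.
tilted_score = wfm_score(
    problem, {"importance": 2.0, "tractability": 1.0, "neglectedness": 1.0}
)
print(tilted_score)
```

A WFM with other factors would just swap in different keys and weights; the hard part is choosing them, not computing the sum.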

Ardenlk @ 2023-12-24T12:38 (+4)

I think it would be valuable to include all the additional notes which are not on your website. As a minimum viable product, you may want to link to your comment.

Thanks for your feedback here!

Your previous quantitative framework was equivalent to a weighted-factor model (WFM) with the logarithms of importance, tractability and neglectedness as factors with the same weight, such that the sum respects the logarithm of the cost-effectiveness. Have you considered trying a WFM with the factors that actually drive your views?

I feel unsure about whether we should be trying to do another WFM at some point. There are a lot of ways we can improve our advice, and I’m not sure this should be at the top of our list, but perhaps we could try it if/when we have more research capacity. I'd also guess it would still have the problem of giving a misleading sense of precision, so it’s not clear how much of an improvement it would be. But it is certainly true that the ITN framework substantially drives our views.

Mo Putera @ 2023-12-20T14:16 (+8)

Aside from transparency, I'd also be interested in rough BOTECs for career paths, with all the usual caveats regarding impact estimates. Like this (which you contributed to), but incorporating whatever additional work Nuño, Sam and Alex think is needed to make the estimates decision-relevant, as well as risk aversion (I'm thinking of RP's CCM).

Vasco Grilo @ 2023-12-20T15:28 (+1)

Thanks for the comment, Mo! I would be interested in that too.

mhendric @ 2023-12-20T12:05 (+4)

I, too, would be happy to see more transparency about the 80.000 hour rankings. I think it would be especially valuable to see to which degree they reflect the individual judgment of decision-makers. I would also be interested in whether they take into account recent discussions/criticisms of model choices in longtermist math that strike me as especially important for the kind of advising 80.000 hours does (tldr: I take one crux of that article to be that longtermist benefits by individual action are often overstated, because the great benefits longtermism advertises require both reducing risk and keeping overall risk down long-term, which plausibly exceeds the scope of a career/life). 

I think this would help me with a more general worry I have, and maybe others share. As a teacher at a university, I often try to encourage students to rethink their career choices from an EA angle. 80.000 hours is a natural place to recommend for interested students, but I am wary of recommending it to non-longtermist students. Probably Good seems to offer a more shorttermist alternative, but is significantly newer and has less brand recognition. I think there would be considerable value in having the biggest career-advising organization (80k) be a non-partisan EA advising organization, whereas I currently take them to be strongly favoring longtermism in their advice. While I feel this explicit stance is a mistake, I feel like getting a better grasp on its motivation would help me understand why it was taken.

I may be mistaken in taking 80.000 hours to lean heavily longtermist and Probably Good leaning heavily shorttermist, and would be happy to be corrected!

Ardenlk @ 2023-12-24T12:38 (+12)

I think it would be especially valuable to see to which degree they reflect the individual judgment of decision-makers.

The comment above hopefully helps address this.

I would also be interested in whether they take into account recent discussions/criticisms of model choices in longtermist math that strike me as especially important for the kind of advising 80.000 hours does (tldr: I take one crux of that article to be that longtermist benefits by individual action are often overstated, because the great benefits longtermism advertises require both reducing risk and keeping overall risk down long-term, which plausibly exceeds the scope of a career/life).

We did discuss this internally in Slack (prompted by David's podcast https://critiquesofea.podbean.com/e/astronomical-value-existential-risk-and-billionaires-with-david-thorstad/). My take was that the arguments don't mean that reducing existential risk isn't very valuable, even though they do imply it's likely not of 'astronomical' value. So e.g. it's not as if you can ignore all other considerations and treat "whether this will reduce existential risk" as a full substitute for whether something is a top priority. I agree with that.

We do generally agree that many questions in global priorities research remain open — that’s why we recommend some of our readers pursue careers in this area. We’re open to the possibility that new developments in this field could substantially change our views.

I think there would be considerable value in having the biggest career-advising organization (80k) be a non-partisan EA advising organization, whereas I currently take them to be strongly favoring longtermism in their advice. While I feel this explicit stance is a mistake, I feel like getting a better grasp on its motivation would help me understand why it was taken.

We're not trying to be 'partisan', for what it's worth. There might be a temptation to sometimes see longtermism and neartermism as different camps, but what we're trying to do is just figure out all things considered what we think is most pressing / promising and communicate that to readers. We tend to think that propensity to affect the long-run future is a key way in which an issue can be extremely pressing (which we explain in our longtermism article.)

mhendric @ 2023-12-24T15:42 (+9)

Hey there, thank you both for the helpful comments.

I agree the shorttermist/longtermist framing shouldn't be understood as too deep a divide or too reductive a category, but I think it serves a decent purpose for making clear a distinction between different foci in EA (e.g. Global Health/Factory Farming vs AI-Risk/Biosecurity etc). 

The comment above really helped me in seeing how prioritization decisions are made. Thank you for that, Ardenlk!
 

I'm a bit less bullish than Vasco on it being good that 80k does their own prioritization work. I don't think it is bad per se, but I am not sure what is gained by 80k research on the topic vis-à-vis other EA people trying to figure out prioritization. I do worry that what is lost are advocates/recommendations for causes that are not currently well-represented in the opinion of the research team, but that are well-represented among other EAs more broadly. This makes people like me have a harder time funneling folks to EA-principles based career-advising, as I'd be worried the advice they receive would not be representative of the considerations of EA folks, broadly construed. Again, I realize I may be overly worried here, and I'd be happy to be corrected!


I read the Thorstad critique as somewhat stronger than the summary you give. Certainly, just invoking X-risk should not by default justify assuming astronomical value. But my sense from the two examples (one from Bostrom, one on cost-effectiveness on Biorisk) was that more plausible modeling assumptions seriously undercut at least some current cost-effectiveness models in that space, particularly for individual interventions (as opposed to e.g. systemic interventions that plausibly reduce risk long-term). I did not take it to imply that risk-reduction is not a worthwhile cause, but that current models seem to arrive at the dominance of it as a cause based on implausible assumptions (e.g. about background risk).

I think my perception of 80k as "partisan" stems from posts such as these, as well as the deprioritization of global health/animal welfare reflected on the website. If I read the post right, the four positive examples are all on longtermist causes, including one person who shifted from global health to longtermist causes after interacting with 80k. I don't mean to suggest that in any of these cases, that should not have been done - I merely notice that the only appearance of global health or animal welfare is in that one example of someone who seems to have been moved away from those causes to a longtermist cause. 

I may be reading too much into this. If you have any data (or even guesses) on what percentage of the people you advise end up being funneled to global health and animal welfare causes, and what percentage you advise to go into risk-reduction broadly construed, that would be really helpful.

 

Vasco Grilo @ 2023-12-20T13:59 (+3)

Thanks for engaging, mhendric!

I would also be interested in whether they take into account recent discussions/criticisms of model choices in longtermist math that strike me as especially important for the kind of advising 80.000 hours does

My guess is that 80,000 Hours[1] is aware of these, but I would be curious to know the extent to which their longtermism article discusses such concerns. I added it to my list, but feel free to have a look yourself!

As a teacher at a university, I often try to encourage students to rethink their career choices from an EA angle.

Great that you do this!

80.000 hours is a natural place to recommend for interested students, but I am wary of recommending it to non-longtermist students. Probably Good seems to offer a more shorttermist alternative, but is significantly newer and has less brand recognition. I think there would be considerable value in having the biggest career-advising organization (80k) be a non-partisan EA advising organization, whereas I currently take them to be strongly favoring longtermism in their advice.

I would say it makes sense for 80,000 Hours to tailor their advice to what they consider to be the most pressing problems. On the other hand, I think it may well be the case that 80,000 Hours is overestimating the difference between the pressingness of areas traditionally classified as longtermist and neartermist[2]. Rather than picking one of these 2 views, I wonder whether 80,000 Hours had better rank problems along various metrics covering a wider range of areas. For example:

  • Increasing welfare in the next few decades. I expect improving the welfare of farmed animals would come out on top here.
  • Boosting economic growth in the next few decades. I guess global health and development as well as high-leverage ways to speed up economic growth would be strong candidates to come out on top here.
  • Decreasing global catastrophic risk in the next few decades. I guess decreasing biorisk would come out on top here.
  • Decreasing extinction risk this century. I guess decreasing AI risk would come out on top here.
  • Improving positive values. This is important, but vague, and applicable to many areas due to indirect effects, so producing a ranking would be difficult.

I believe having neartermism and longtermism as the only 2 categories would be overly reductive, as traditionally neartermist interventions have longterm effects (e.g. economic growth), and traditionally longtermist interventions have nearterm effects (e.g. fewer deaths of people currently alive). Furthermore, the above decomposition may mitigate a problem I see in 80,000 Hours' current ranking of problems:

  • Climate change is currently ranked as a top problem in 5th, whereas animal welfare and global health and development are not top problems.
  • However, I think animal welfare and global health and development have a greater chance than climate change of topping one of the above 5 rankings. So I would say they should be prioritised over climate change. 

I strongly endorse expectational total hedonistic utilitarianism, and therefore agree that all metrics can in theory be mapped to a single dimension, such that all problems could be ranked together. However, doing this in practice is difficult because there is lots of uncertainty.

  1. ^

    Nitpick, there is a comma after "80", not a dot.

  2. ^

    I have not seen the term shorttermist being used much.

Pat Myron @ 2023-12-20T20:16 (+3)

Small similar thread:
https://forum.effectivealtruism.org/posts/oZff425xLnikfxeGD/pat-myron-s-shortform?commentId=QmoWZXrGxDZxsiQrk

Vasco Grilo @ 2023-12-20T21:49 (+2)

Thanks, Pat! You and other readers may be interested in Probably Good's list of impact-focused job boards.