How can we make Our World in Data more useful to the EA community?

By EdMathieu @ 2021-11-04T14:02 (+148)

I work at Our World in Data, where we try to make research and data on the world's largest problems more accessible and understandable.

I attended EA Global this past weekend, where I received very interesting input from many lovely people on potential improvements. But I thought it'd also be worth asking here to get wider feedback. I'm interested in all the following:

Thank you!


Nathan Young @ 2021-11-04T15:46 (+72)

Embed forecasts in your pages.

Work with Metaculus to show forecasts of future values alongside past values.

MichaelStJules @ 2021-11-05T03:03 (+49)

It would be useful for animal advocates to have your figures for population sizes in terms of numbers of individual animals, not just weight. I'm thinking pages like:

https://ourworldindata.org/mammals

https://ourworldindata.org/birds

https://ourworldindata.org/fish-and-overfishing

Pages for invertebrates (wild and farmed) would be nice, too!

Jacob_Peacock @ 2021-11-09T18:47 (+24)

+1 as well. I would emphasize that the number of animals alive at any given time is significantly more important than the number slaughtered, as many animals die prior to slaughter.

Neil_Dullaghan @ 2021-11-05T11:37 (+13)

+1 to reporting numbers of animals instead of tonnage or biomass. The OWID meat and dairy production page does have a "numbers of animals slaughtered" section, so it would be great for that to be expanded both to other large numbers of animals (like various fish species, crustaceans, invertebrates) and beyond slaughter (such as alive at any one time).

Here are some articles with sources of such data. I haven't really looked into how hard they would be to maintain and update. It is biased towards data collected by Rethink Priorities staff because it was the easiest I had to hand, but hopefully others can add anything major I missed:

NunoSempere @ 2021-11-04T16:16 (+42)

Collaborate with Jaime Sevilla on datasets for various values related to size, performance, training expense, etc. of large machine learning models. 

Having high quality data on this which one knows is going to be maintained makes it much easier to elicit forecasts about these topics, and eventually resolve those forecasts and keep track of track-records, and I know that Jaime has been working on this.

EdMathieu @ 2022-02-15T20:55 (+4)

We now have a first chart based on their pre-print here: Estimated computation used in large training runs of AI systems

NunoSempere @ 2022-02-16T16:46 (+1)

Wohoo, nice!

Darius_M @ 2021-11-04T17:19 (+41)

Create a page on biological weapons. This could include, for instance,

  1. An overview of offensive BW programs over time (when they were started, stopped, funding, staffing, etc.; perhaps with a separate section on the Soviet BW program)
  2. An overview of different international treaties relating to BW, including timelines and membership over time (i.e., the Geneva Protocol, the Biological Weapons Convention (BWC), Australia Group, UN Security Council Resolution 1540)
  3. Submissions of Confidence-Building Measures in the BWC over time (including as a percentage of the # of BWC States Parties and split in publicly-accessible and restricted-access) 
  4. A graph that visually compares the funding and # of staff in international organizations for the bioweapons regime compared to chemical and nuclear weapons (e.g., the BWC Implementation Support Unit compared to OPCW for chemical and the IAEA and CTBTO PrepCom for nuclear)
  5. (Perhaps include an overview on the global proliferation of high-biosafety labs, e.g. see Global Biolabs)
  6. (Perhaps include a section on how technological advancements may affect the BW threat, e.g., include a graph on the Carlson curve (Moore's law but for DNA sequencing))

MichaelA @ 2021-11-05T09:07 (+17)

(This does sound useful, though I'd note this is also a relatively sensitive area and OWID are - thankfully! - a quite prominent site, so OWID may wish to check in with global catastrophic biorisk researchers regarding whether anything they'd intend to include on such a page might be best left out.)

EdMathieu @ 2021-11-04T19:15 (+12)

Thank you! Just FYI, on (6) we have:

Pablo @ 2021-11-04T19:38 (+3)

Data on gene synthesis costs would also be valuable.

EdMathieu @ 2022-02-15T20:57 (+3)

Not much yet, but on (5) we now have this world map: Number of biosafety level 4 facilities

Nathan Young @ 2021-11-04T15:43 (+39)

Datasets on philanthropic funding.

How much are big donors giving where? Could there be an easily searchable database of projects?

MichaelA @ 2021-11-05T09:13 (+9)

I like this idea. 

I tried to figure out how much funding there is for nuclear risk stuff, but it seems like the best source of data is the Peace and Security Funding Map tracking of spending on "Preventing and Mitigating Conflict > Nuclear Issues" (select "List"), and many of the grants it tracks really aren't about nuclear weapons issues but just happen to use the term "nuclear". These include medical research grants that use the term "nuclear" in a totally different sense and biosecurity-related grants to the Nuclear Threat Initiative. They also don't provide graphs of changes over time or breakdowns into categories or anything like that; you'd have to build that manually. 

This makes it harder to know how neglected areas are, how this is changing over time, who the big players to learn from or complement or bear in mind are, etc. 

I'd imagine the situation is similar in other longtermist areas, though I'm unsure.

John G. Halstead @ 2021-11-05T12:27 (+37)

Just want to say as an EA researcher your website is an absolute godsend.

EdMathieu @ 2021-11-05T13:12 (+1)

Glad to hear that :)

MichaelA @ 2021-11-05T09:43 (+25)

Thanks for making this post! I think OWID is excellent, and I'm really excited that you're interested in making it even more useful to the EA community.

One thing I'd note: I expect OWID might be able to get funding from the EA community for high-impact moves, if such funding would be helpful. See List of EA funding opportunities and Things I often tell people about applying to EA Funds.

(Let me know if there's a way I can be helpful in working out what funders might be most relevant, what sort of funding proposals might make sense, etc.)

Khorton @ 2021-11-05T12:38 (+21)

I know you've been campaigning for open access climate data - I'd be really excited for that. My policy team found your work on access to electricity pretty interesting.

kyle_fish @ 2021-11-04T14:50 (+19)

A few things that jump to mind:

Nathan Young @ 2021-11-04T15:44 (+16)

Talk to Hamish Huggard, who made EffectiveAltruismData.com. He seems really switched on. More here: https://forum.effectivealtruism.org/posts/CQaNyJfsRFZhseiLZ/effectivealtruismdata-com-a-website-for-aggregating-and

evelynciara @ 2021-11-05T10:49 (+14)

Data on housing and transportation:

MichaelA @ 2021-11-05T09:23 (+13)

Info on nuclear yields

(This is a bit niche, but might also be quick to produce and could fit into the existing nuclear weapons article.)

As far as I’m aware, there’s no compilation of information related to the yields of various states’ arsenals that rivals the compilations of information on numbers of warheads created by people such as the Federation of American Scientists. (Though I didn’t look very hard, so please tell me if I’m wrong! In any case, OWID-style visualizations would be handy too.)

I think that creating such a compilation would be at least somewhat useful (though I’m not sure how useful), e.g. for forecasting future changes. 

More specifically, I’d like to be able to easily find answers to questions like:

I expect I could find answers or form educated guesses on each of those questions with some work, but it’d be nice to have the info compiled and organised already.

Scraps of info that I happen to already have include:

(I originally proposed people on Metaculus create this, but I expect that won't happen and OWID seems ideally suited for this.)

Scott Alexander @ 2021-11-06T03:06 (+12)

Thanks for asking!

On some of your graphs, eg https://ourworldindata.org/grapher/gdp-per-capita-maddison-2020, you have a box you can tick to get "relative change". On other graphs, eg https://ourworldindata.org/grapher/children-per-woman-un?tab=chart&time=1950..2015&country=OWID_WRL~HUN, you don't have that box. You can force the chart to do this by adding "?stackMode=relative" to the URL, but that is annoying and hard to remember. Please add the box to all graphs.
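Until such a box exists, the URL workaround described above can be scripted. This is a minimal sketch using only Python's standard library; the helper name `with_relative_change` is made up for illustration, and it assumes the `stackMode=relative` parameter behaves as described in this comment.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_relative_change(url: str) -> str:
    """Return `url` with stackMode=relative set, keeping other query parameters."""
    parts = urlparse(url)
    # Keep the existing parameters (tab, time, country, ...) in their original order.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["stackMode"] = "relative"
    # safe="~" keeps country lists such as "OWID_WRL~HUN" unescaped.
    return urlunparse(parts._replace(query=urlencode(query, safe="~")))


print(with_relative_change(
    "https://ourworldindata.org/grapher/children-per-woman-un"
    "?tab=chart&time=1950..2015&country=OWID_WRL~HUN"
))
```

Parsing the query rather than appending a string avoids ending up with two `?` characters when the URL already carries parameters.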

If you generate a graph like https://ourworldindata.org/grapher/children-per-woman-un?tab=chart&time=2008..2015&country=HUN~AUT~CZE~SVK~POL~UKR~HRV~SRB , it's hard to see what's going on, because all of the action is crammed into a tiny part of the graph - in this case between 1.3 and 1.6 children. I would be interested in either having it autozoom to the part where things are happening, or at least have an option to zoom into that part. Maybe this already exists and I am just missing it.

Another thing that would be neat (though a lot of work for maybe not much gain) would be the ability to graph calculated series, eg the fertility rate of Hungary minus the fertility rate of Austria, over time.
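For readers who download the data themselves, the Hungary-minus-Austria kind of series is a short calculation. The sketch below is only an illustration: the function name `series_difference` is hypothetical, and the numbers are placeholders, not real fertility data.

```python
def series_difference(a: dict[int, float], b: dict[int, float]) -> dict[int, float]:
    """Year-by-year `a` minus `b`, only for years present in both series."""
    return {year: a[year] - b[year] for year in sorted(a.keys() & b.keys())}


# Placeholder values for illustration only, NOT real fertility rates.
hungary = {2008: 1.0, 2009: 2.0, 2010: 3.0}
austria = {2009: 0.5, 2010: 1.0, 2011: 4.0}

diff = series_difference(hungary, austria)
print(diff)
```

Restricting the result to years that appear in both series avoids silently treating a missing year as zero.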

MichaelA @ 2021-11-05T09:20 (+12)

Civilizational collapse data

Anders Sandberg at FHI looked into this to some extent (see a talk here and slides here). I could connect you with him if that'd be helpful.

Some other global catastrophic risk researchers may also have looked into this somewhat, e.g. perhaps Luke Kemp, Luisa Rodriguez, Haydn Belfield, or Karim Jebari. Again, I could probably provide intros if helpful.

I imagine various people outside the global catastrophic risk community have also looked into this somewhat, e.g. Jared Diamond and Peter Turchin.

Lizka @ 2022-06-08T11:10 (+6)

This has been recently brought up again, alongside individual species extinctions.

Nathan Young @ 2021-11-04T15:45 (+12)

Funding by different cause: cancer, heart disease, deworming, etc.

finm @ 2021-11-06T15:30 (+11)

Here are a few things I'd like to see:

First, working with Hamish Huggard to port over some of the data from effectivealtruismdata.com (as Nathan Young suggests). In particular, I think it would be useful to have a better impression of how EA and EA-adjacent philanthropic money gets spent.

Second, some charts covering long-run trends, such as: GDP over time starting around 0AD, world temperature or CO2 concentration since the end of the last ice age, agricultural production over time, energy consumption per capita, or population over time over millennia (sorry if you already have some of this). Obviously (with the exception of the climate stuff) the data is very sparse on this, but I am pro "here's a reasonable guess and here's how wide our uncertainties are" over "we're not entirely sure so we're going to say nothing". And I trust OWID can do an excellent job at communicating the uncertainties and interpretational difficulties involved.

Third, maybe a page on space. For instance, number of launches over time, number of satellites in orbit over time, amount and size distribution of space debris, space debris incidents over time, cost of launching a kilogram into orbit over time. In particular, both the UN and the ESA have really detailed datasets for potential launches over time / objects in space graphs, but the UN one doesn't have an API so needs to be scraped, and I haven't seen people present either dataset in an accessible way.

EdMathieu @ 2021-12-15T13:02 (+4)

Update: we now have a couple of charts on outer space objects

  • Cumulative number of objects launched into outer space (line, bar, map)
  • Yearly number of objects launched into outer space (line, bar)

Jackson Wagner @ 2021-11-04T17:37 (+11)

It would be nice to have more datasets that try to go way back before 1800 -- like those used by this OpenPhil report, or books like "Why The West Rules", "Secular Cycles", etc. Here is a link to a PDF with all the figures in "Why The West Rules", albeit they are mostly maps. I like graphs 3.1, 3.7, and 9.3.

As a stretch goal, once you have a bunch of super-long-ago data, it would be sweet to be able to graph the data not just linearly in time, but also along various warped scales so that instead of equal intervals representing equal years, equal intervals represent:

Nathan Young @ 2021-11-04T15:50 (+9)

Size of different communities. 

I think it's underrated how much identification affects decisionmaking. I'd love some graphs of the change in the number of people who identify with different monikers over time.

eg:
- socialist, weeb, christian, protestant, goth, EA etc etc etc

Michael Hinge @ 2021-11-05T12:45 (+8)

Hi Ed!

One thing that falls potentially into all three categories of difficulty is food stocks/reserves, which is an issue with high relevance to exposure to shocks and food insecurity, but really hard to track. 

It's a tricky issue, but could really help many researchers inside and outside of EA to improve!

A few issues we have found which would be very useful to see developed are:

The USDA PSD and FAOSTAT both have estimates for crop year end, but as crop years do not line up, effective stocks are higher than this figure. These results are based on a few methodologies, but do not match reality exactly, and are better for globally traded crops. 

Reconciling and improving on these estimates is possible, but requires detailed trade data or insider data, which is often very commercially sensitive. Big traders (ABCD companies/COFCO) would know this, but they do not disclose it.

Stocks can be reasonably accurate at a global level and when averaged over a period of time; however, fluctuations in demand, smuggling and delays in data releases mean they can be hard to track on a country-by-country basis for the poorest. These are often the countries we care most about for food insecurity.

Stocks in strategic reserves, private reserves and simply in transit can be difficult to divide out. In some cases this suggests chunks of available stocks are missing, or that stocks would not be available to the market if classed as "private" when actually state controlled.

Nathan Young @ 2021-11-05T11:07 (+6)

How much negative carbon (carbon offsets) is being produced under different accreditation standards? I.e., carbon offsets always have to be accredited, so it's not just a question of how many there are, but of how much different orgs count them as.

Maybe just take Carbon Plan data and put it in an easier to read form.

evelynciara @ 2021-11-05T11:33 (+4)

Can you clarify what this means, for the benefit of all forum readers? I figured it has something to do with carbon offsets.

Stenemo @ 2021-11-25T16:34 (+5)

I think it would be very beneficial to take advantage of complementary software like Anki.

Jide @ 2021-11-07T16:59 (+4)

I'm wondering if it would be useful to track data on national legislatures (or maybe just heads of state) worldwide? This could include:

I'm not sure how feasible this is, but I imagine it could help EAs think more concretely about where they're likely to find support for different advocacy efforts.

Isaac King @ 2021-11-11T02:23 (+3)

Better tools for simple comparisons of different datasets and generating custom charts. For example, there have been a number of times when I wanted per-capita data but could only find charts for total, or vice versa. (This should be a low-priority request since it's primarily a convenience issue.)
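As a stopgap, the per-capita conversion is easy to do by hand once a total series and a population series are downloaded. The sketch below assumes both are keyed by the same entity names; `per_capita` is a hypothetical helper and the values are placeholders, not real data.

```python
def per_capita(total: dict[str, float], population: dict[str, float]) -> dict[str, float]:
    """Divide each total by population, skipping entities without a positive population."""
    return {
        entity: total[entity] / population[entity]
        for entity in total
        if population.get(entity, 0) > 0
    }


# Placeholder values for illustration only.
totals = {"A": 100.0, "B": 50.0, "C": 10.0}
populations = {"A": 4.0, "B": 0.0}

print(per_capita(totals, populations))
```

Entities with a missing or zero population are dropped rather than reported as zero or infinity, so gaps in the population data stay visible as gaps.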

MichaelA @ 2021-11-05T09:47 (+3)

I previously tried to think of "active grantmaking" ideas (here), i.e. things I might want EA funders to fund but for which no application has yet been made. Some of these were people/orgs I thought do cool work and so might be worth funding for something, and some of these were potentially cool ideas I might want some person/org to do. 

Here's one of the (rough and vague!) ideas I had:

Fund OWID for something

MichaelA @ 2021-11-05T09:51 (+2)

Another idea I had was funding the creation of "new things kind-of like Our World in Data". (This is discussed briefly in this comment thread.) 

I think the key bottlenecks to this are (1) a clearer sense of precisely what kind of org/project we'd want and (2) people who are willing and suited to making that happen.

I guess OWID could help facilitate that by suggesting types of OWID-like projects that would be great but for whatever reason might be better done elsewhere instead/as well, suggesting people who might be great at doing that, and/or agreeing to provide some advice/mentorship to whoever sets up this other thing.

HaukeHillebrandt @ 2021-12-08T11:39 (+2)

Automated Local Regression Discontinuity Design Discovery

Automated discovery of outliers in multicountry datasets (i.e. where you can see where your country is 3 SDs away from the mean).
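The "3 SDs away from the mean" check can be sketched with a plain z-score, as below. This is one simple reading of the idea, not a description of any existing OWID feature; `flag_outliers` is a hypothetical name and the data are placeholders. Note that with the population standard deviation a z-score of 3 is only reachable when there are at least ten entities.

```python
from statistics import mean, pstdev


def flag_outliers(values: dict[str, float], threshold: float = 3.0) -> dict[str, float]:
    """Return the z-score of every entity at least `threshold` SDs from the mean."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation
    if sigma == 0:
        return {}  # all values identical, so nothing can be an outlier
    z_scores = {name: (value - mu) / sigma for name, value in values.items()}
    return {name: z for name, z in z_scores.items() if abs(z) >= threshold}


# Placeholder data: 19 identical entities and one extreme value.
countries = {f"country_{i}": 0.0 for i in range(19)}
countries["country_x"] = 100.0

print(flag_outliers(countries))
```

A mean-and-SD rule is itself sensitive to the outliers it is looking for, so a median-based score may be a better fit for skewed indicators.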

simeon_c @ 2021-11-06T08:03 (+2)

Maybe a page on AI capabilities and progress could be useful to explain to people why there is a chance that very powerful AI appears this century? For instance, one graph that I'd love to see would be when we expected a breakthrough and when it actually happened, things on the scaling of models and the scaling of performance, the evolution of the use of AI in industry, the evolution of investment in AI, the evolution of the distribution of funds or researchers between academia and the private sector, etc. I'm not sure that all of that is tractable but that would be great!

evelynciara @ 2021-11-05T10:51 (+2)

Data on trends in the influence of geopolitical "great powers," possibly going back to the beginning of civilization.

Pat Myron @ 2024-02-15T19:02 (+1)

Ray Dalio has attempted to quantify and visualize something similar: https://www.linkedin.com/pulse/big-cycles-over-last-500-years-ray-dalio

evelynciara @ 2021-11-05T10:03 (+2)

Collaborate or merge with New Things Under the Sun, a website that compiles social science research on innovation.

I would be especially interested in pieces on:

Yitz @ 2022-05-29T17:12 (+1)

I'm not sure if you're still actively monitoring this post, but the Wikipedia page on the Lead-crime hypothesis (https://en.wikipedia.org/wiki/Lead%E2%80%93crime_hypothesis) could badly use some infographics!! My favorite graph on the subject is this one (from https://news.sky.com/story/violent-crime-linked-to-levels-of-lead-in-air-10458451; I like it because it shows this isn't just localized to one area), but I'm pretty sure it's under copyright unfortunately.

JoelMcGuire @ 2021-11-13T20:34 (+1)

Hi Ed,

Here are some imaginary fruit:

1. At the Happier Lives Institute we would be very interested to see something like a global burden of disease except for suffering. What are the largest sources of unhappiness across the world? 

2. OWID could summarize the results of studies on important topics. That is, it could collect and visualize meta-analytic information for important topics from databases like AidGrade or MetaPsy.