What “pivotal” and useful research ... would you like to see assessed? (Bounty for suggestions)
By david_reinstein @ 2022-04-28T15:49 (+37)
Update 28 Apr 2022 -- Only ~10 responses so far, many of which suggest general areas rather than specific papers/findings/projects. So the expected return on entering the bounty is still high.
Below, I give some work that I thought might be especially relevant, as examples.
Update 22 Apr 2022: Bounty[1]
Update 5 Jul 2022: The bounty prizes have been awarded. Ross Tieman (ALLFED) made a suggestion we are piloting ($250 prize), and 3 other people made eligible suggestions; we drew one randomly, awarding $250.
Please continue to make suggestions, though; they are still very valuable, and we aim to award retroactive prizes too.
What papers, findings, projects, or pieces of research (academic or non-academic) would you most like to see carefully and rigorously evaluated?[2]
Which specific results do you rely on in making key decisions, or which ones do you think that large EA-aligned donors and organizations rely on heavily?
I'm considering this particularly as...
- Demonstration and test cases for the Unjournal in our first months, as well as our general agenda (see the previous request and my earlier post)
We have put a bounty on this, explained below
- Research projects the Unjournal should consider in its first year
I plan to offer a bounty on successful suggestions for this, and entries will also be eligible for future bounties.
Also: great use cases for https://redteammarket.com/ (tied to the eminent Daniel Lakens); this initiative looks very promising to me. (But please note that the bounty relates only to the Unjournal.)
I'm looking especially for:
- Specific papers and findings in these papers and projects that we lean on a lot
- Empirical work that would benefit from an open-science/replicability assessment or an assessment of the methodology
More on what I am looking for
Some combination of work ...
- Focusing on social science, economics, and impact evaluation (without digging too deeply into technical microbiology or technical AI, etc.)
- Aiming at academic standards of rigor; perhaps it is in the process of peer review or aiming at it
- With direct relevance for choices by EA funders... or to crucial considerations
- "Empirical and/or quantitative": makes empirical claims, analyses data, runs experiments/trials, runs simulations, Fermi estimations, and calibrations based on real data points
- "Applied/applicable (economics/decision science) theory": makes logical/mathematical arguments (typically in economics) for things like mechanisms to boost public-goods provision in particular contexts.
The Unjournal would be particularly good for:
- Work in replicable dynamic documents (R Markdown, Quarto, Jupyter, etc.)
- ... or otherwise hard to fit into a 'frozen pdf'
- Ongoing projects you will 'continue to build on in the same place'
- Work that is hard-to-place in standard journals ...
- ... because it is interdisciplinary (but rigorous) and requires multiple dimensions of expertise to evaluate
- or because it is more impactful and robust than it is 'novel'
(And thus we particularly value suggestions for work like this.)
The Bounty
Full details, T&C, and other considerations are here; excerpted below.
The prizes are:
1. “Piloted suggestion prize”: $250 × 1-3 prizes: one for each of the 1-3 suggested research projects that we choose as piloting/proof-of-concept examples
2. “Participation prize”: A $150-$300 prize (see below)
Drawn randomly among ...
- All people (other than those winning prize 1) who submit suggestions in a format we can consider: they must link a piece of research or a project and give at least one sentence justifying its relevance.
- Anyone who sends a posted letter (see footnote)[3]
If we use two or more suggestions for piloting/proof-of-concept, the Participation prize will be $150. If we use only one of the suggestions, it will be $250. If we use none of the suggestions, it will be $300.
3. (Potential) Additional prize qualification: We intend to have a general bounty for suggesting projects and papers that are assessed through the Unjournal. We will decide this during/after the Unjournal piloting process. Even if your suggestion is not chosen for the piloting examples, if we later choose to have it assessed through the Unjournal, you will be eligible for any associated bounty.
Timeline for bounty
We announced this bounty on 22 April 2022. We intend to choose the pilot projects within one month. I will impose a ‘final resolution’ date of July 4, 2022, at the latest (but I hope to resolve this earlier). As noted, the ‘additional prize’ bounties may carry forward after this date.
Responding
You can:
- respond below,
- DM me,
- or in THIS Airtable form.
The Airtable form is the most helpful way to respond: it is the best way to be sure you are getting at what we are asking, the easiest way to be eligible for the prize, and it ensures we have your contact info.
However, we appreciate all forms of response. Even if you fill out the form, it can be helpful to also post your suggestion below to foster conversation.
You can submit as many entries as you like.
Recognition and anonymity
We intend to publicly recognize all suggestions that we use, unless you say you want to remain anonymous. If you want to remain anonymous even to us (the Unjournal organizers), you can submit the Airtable form without leaving any contact information, but then you will not be eligible for the bounty prize (as we cannot contact you).
Some examples of the 'sort of things we might be looking for'
0. Eva Vivalt, 2020, "How Much Can We Generalize From Impact Evaluations?", JEEA
Why (not) this paper?
Firstly, I suspect this paper has already been rather thoroughly assessed, and it is published in a respected journal.[4] So I'm not suggesting this actual paper, at least not for our early stages, but papers and projects like this.
Why:
- Impact evaluations drive GiveWell, Open Philanthropy, and government aid organizations' recommendations and actions in the global health space (and beyond). This is obviously core to EA.
- This is a serious, methodologically rigorous, effortful, and well-documented quantitative assessment of how well the insights from these studies generalize.
- The meta-analytic methods used are also highly relevant for EA research organizations.
- The work is empirical, and all code and data are shared. But the journal publication formats do not make replication as easy as it should be.
- The results themselves (e.g., Figure 3, Table 4, the regression tables) could also be better presented in a dynamic document, allowing users to filter, zoom in, and choose what to look at (e.g., the Plotly tools, and all the great stuff we see at Our World in Data).
- Users/readers of this research also have a number of value-based and empirical judgments to make, and could derive all sorts of useful personalized recommendations. This is enabled by dynamic formats; in fact, this is what the author's organization/site "AidGrade" works to do.[5]
- A large part of the paper is essentially a reprise of, and tutorial on, Bayesian meta-analysis and hierarchical models. In a dynamic format this could be presented in a much more 'teachable' way, allowing expanding boxes, out-links and hover-overs, animations, etc. (See the sketch just below this list.)
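To make that last point concrete, here is a minimal sketch (in Python) of the random-effects/shrinkage logic that hierarchical meta-analysis builds on. All numbers are invented for illustration, and this is the generic DerSimonian-Laird/empirical-Bayes recipe, not Vivalt's actual code or data.

```python
import numpy as np

# Hypothetical study-level effect estimates and standard errors
# (illustrative only; not taken from Vivalt 2020).
y = np.array([0.30, 0.10, 0.45, -0.05, 0.20])  # estimated effects
s = np.array([0.10, 0.08, 0.20, 0.15, 0.12])   # standard errors

# Fixed-effect (inverse-variance) pooled estimate
w = 1 / s**2
mu_fe = np.sum(w * y) / np.sum(w)

# DerSimonian-Laird estimate of between-study variance tau^2
Q = np.sum(w * (y - mu_fe)**2)                 # heterogeneity statistic
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (Q - (len(y) - 1)) / c)

# Random-effects pooled mean, re-weighting with tau^2
w_re = 1 / (s**2 + tau2)
mu_re = np.sum(w_re * y) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))

# Empirical-Bayes shrinkage: each study's estimate is pulled toward
# the pooled mean -- the core intuition behind generalizability.
shrink = tau2 / (tau2 + s**2)
theta_hat = shrink * y + (1 - shrink) * mu_re

print(f"pooled effect {mu_re:.3f} (se {se_re:.3f}); tau^2 = {tau2:.3f}")
print("shrunken study estimates:", np.round(theta_hat, 3))
```

In a dynamic document, readers could toggle which studies enter `y` and `s` and watch the pooled estimate and shrinkage update live.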
1. Kremer et al, "Advance Market Commitments"
Kremer, M., Levin, J. and Snyder, C.M., 2020, May. Advance Market Commitments: Insights from Theory and Experience. In AEA Papers and Proceedings (Vol. 110, pp. 269-73). One-click link here
Why this paper?
- Advance market commitments for vaccines and drugs seem highly relevant to both global health and development priorities, and to reducing catastrophic and/or existential risk from future pandemics. This is also a practicable policy.
- The authors make specific empirical claims based on specific calculations that could be critically assessed[6], as well as a specific formal economic (maximization) model with empirical implications.
- The authors are well-respected in their field (obviously, this includes a Nobel laureate), but the paper may not have been as carefully reviewed and assessed as it could have been. "AEA Papers and Proceedings" does go through some selection and scrutiny but is not peer-reviewed in the same way that papers in journals like the American Economic Review are.
- The authors stand strongly behind their work and are eager to promote its impact; e.g., see this NY Times op-ed involving several of the authors.
- The calibration model and some other parts of the explanation might be better suited to interactive and linked formats, rather than PDFs, to get the maximum value (but this is not necessary).
2. Aghion, P., Jones, B.F. and Jones, C.I., 2017. Artificial Intelligence and Economic Growth. NBER working paper.
Why this paper?
- The work seems relevant to longtermism (and is cited as such in David Rhys-Bernard's syllabus). Longtermist EAs are particularly interested in economic growth but concerned about AI risk, so this may weigh on an important tradeoff.
- While the paper uses macroeconomic growth and production theory, as well as simulation and calibration (with only some broad-brush real-world data in a later section)...
  - it makes specific policy-relevant claims and uses standard tools of economics
  - the setup, structure, implications, and interpretations of these models could be carefully reviewed
- The paper's authors are prominent (even being an NBER member is rather selective), but after about 5-6 years the paper is still not published in a peer-reviewed journal.
  - This need not be because of weaknesses in the research; the authors may have abandoned the process for career/strategic reasons, or it may be seen as 'not innovative enough' or 'not interesting enough' (whatever that means) by the reviewers at the top economics journals.
  - ... that does not imply that the research is not relevant and interesting for EA; I think it is
Other examples: Animal welfare
Caviola et al; Humans First: Why people value animals less than humans
- I've been emphasizing work in economics, perhaps because of my background, but work in psychology and other social sciences will also be relevant
Van Loo et al, 2020, Consumer preferences for farm-raised meat, lab-grown meat, and plant-based meat alternatives: Does information or brand matter?; Food Policy
- Obvious relevance for animal-welfare interventions and charities
- Empirical (national discrete choice experiment/survey). Jason Lusk's work is often cited and recommended.
- Yes, it is published, but Food Policy seems like a rather specific field journal. The paper may not have been given the careful feedback and assessment it deserves, because it may have been seen as a niche issue by mainstream economists.
Other examples: Long-termism and existential risk
- ALLFED is one of the most concrete interventions and organizations associated with extinction risk and longtermism.
- The work is based on Monte-Carlo Fermi estimation, I believe (a minimal sketch of what that means is just below this list). The authors are engineers; this could probably use feedback from an economist, policy analyst, or business academic.
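For readers unfamiliar with the term, here is a minimal sketch of Monte-Carlo Fermi estimation in Python. Every input below is an invented placeholder, not a number from ALLFED's work.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # Monte-Carlo draws

# Fermi-style inputs, each sampled from a distribution expressing
# order-of-magnitude uncertainty (all values invented).
people_affected = rng.lognormal(mean=np.log(1e8), sigma=0.7, size=n)
deaths_averted_frac = rng.beta(2, 8, size=n)   # fraction of deaths averted
total_cost = rng.lognormal(mean=np.log(5e9), sigma=0.5, size=n)  # USD

# Output distribution: cost per life saved, one value per draw
cost_per_life = total_cost / (people_affected * deaths_averted_frac)

lo, med, hi = np.percentile(cost_per_life, [5, 50, 95])
print(f"cost per life saved: median ${med:,.0f} (90% interval ${lo:,.0f}-${hi:,.0f})")
```

The point is that uncertainty in each rough input propagates to an interval on the output rather than a single point estimate; a reviewer would then scrutinize the input distributions and the model structure.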
Grace et al; When will AI exceed human performance? Evidence from AI experts
I don't want the Unjournal to take on technical AI issues yet, but this is about 'aggregating uncertainty' from experts on a crucial issue, and it seems in the quant/econ/social science wheelhouse.
Other examples: Global Health and development
General equilibrium effects of cash transfers: experimental evidence from Kenya -- suggested by a forum reader.
Other examples: Improving institutions and public goods provision
Liberal radicalism: a flexible design for philanthropic matching funds - Vitalik Buterin, Zoë Hitzig, and E. Glen Weyl
Why: Quadratic voting and schemes like these are often cited/advocated as game-changers, including in EA and rationalist circles, I think. The authors are fairly prominent, but the paper doesn't seem to have 'made it through peer review'. Maybe it is seen as too much like advocacy?[7]
Update/correction: The paper was published in Management Science under another title
So: does it make sense, what are the counter-arguments and vulnerabilities, and has it worked (does it work) in practice?
Update: The authors reached out to me. Gitcoin has done some experimentation on this mechanism, and there may be future work to consider and evaluate.
explained below, with full details and T&C in this Google Doc ↩︎
Especially considering empirical, quantitative, and applied work in economics, social science and impact evaluation. ↩︎
Mentioning the “Unjournal prize”, send the letter to Rethink Priorities, 530 Divisadero St. PMB #796, San Francisco, California 94117, USA ↩︎
As it is such a big question and ambitious topic, more feedback, assessment, and public debate would be very helpful. ↩︎
But the interactive site and tool cannot be 'peer reviewed' in a standard way, and can't easily be given career rewards ... that's part of where the Unjournal comes in, hopefully. ↩︎
"Appendix A provides details behind calculations showing that if PCV coverage in GAVI countries converged to the global rate at the slower rate of the rotavirus vaccine in Figure 2, 67 million fewer children under age 1 would have been immunized, amounting to a loss of over 12 million DALYs. ↩︎
I remember being told as a PhD student something like 'Economists analyze markets and behavior, we don't propose policies'. ↩︎
brb243 @ 2022-04-27T23:22 (+8)
1) Crucial consideration: nuclear warfare existential risk. Is it the case that if all nuclear warheads were piled up and detonated at once, the global temperature would decrease by a few degrees centigrade for a few years? Did the much larger volcanic explosions before the 'Year Without a Summer' cause an 8x increase in oat prices in the US (still, crops grew)?
Added: temperature change: FHI cited paper, general impact: A Model for the Impacts of Nuclear War (also cited by FHI) (GCR Institute authors) (also cites Robock) - does not quantify
2) Fallout shelter training cost-effectiveness (e.g. implementing this into a curriculum) - how does it compare to the cost-effectiveness of other nuclear risk mitigation programs? Is it that training (especially jointly developed and run by nuclear weapons states' civil actors) would reduce the thrill of using nuclear weapons that less considerate leaders could otherwise experience, and normalize the 'shamefulness' of proliferation (especially if this is another aspect of the 'class training')?
Added: it is difficult to find an academic paper on fallout shelter training using existing buildings (there is literature (1, 2) on building special shelters, which may be highly cost-ineffective compared to the training)
3) Is the Fistula Foundation, recommended by The Life You Can Save, cost-ineffective compared to prevention? Something similar to antiretroviral therapy vs. educating high-risk groups in terms of cost-effectiveness (I estimate 35x). Should EAs maybe provide funding for one of the radio-for-healthy-behavior charities to develop and run such a show?
Added: Obstetric Fistula: A Preventable Tragedy - it does not compare the cost-effectiveness of various prevention and treatment interventions, but it specifies the options whose costs can be estimated
4) GiveWell keeps giving grants to the organization that prevents pesticide suicide by banning pesticides, which may lead to lower yields for some farmers, who may then be depressed because their families go hungry - this is hinted at on the NGO website. Can someone look into this so that GiveWell does not advance suffering (this can be intense suffering, considering that in a Kenyan slum more than a quarter of respondents seek to live 0 additional years while no one is committing suicide)?
Added: it is difficult to find literature on the impacts of pesticide restriction on willingness to attempt suicide/subjective wellbeing, but relatively ample literature studies the impact of bans on suicide rates. Maybe EA India can provide expert insight?
5) Can we effect global systemic change by giving every 12th person no new net (maybe they can repair or share)? This also concerns curricula, but in emerging economies - if students are taught how to be healthy and happy, and to share this by developing useful skills and professionalism (limited emotional appeal due to the perception of scarcity and inability to increase efficiencies), rather than how to serve elites of formerly colonial governments, this can make a large part of the world much better (while the alternative is much worse, considering the poor meat eater problem and the possible impact of aggressive advertising on trust-based societies). Should malaria funders purchase some percentage fewer nets but inform people about various health, work, family, safety, and other tips that were developed by persons with satisfied basic needs and cooperative (with others close to them) norms?
Added: I have not seen an organization that works in this area - there is TaRL that enables students to catch up with the post-colonial curricula and some organizations that focus on informing people about a single or a few aspects of the intended virtuous cycle (example, 2, 3, 4), some perspectives on improving curricula for better industrial competitiveness, some self-help resources perhaps mostly relevant to highly affluent individuals, but nothing on enabling poor persons to improve their wellbeing through gaining the information they need - testing pamphlets under bednet packaging, comprehensive radio show series, etc.
6) Some people in Africa who came across EA were not stoked to join because of the price of RCTs. Getting a 99% discount should be aspirational but not unachievable. Can there be some organization that would run these studies cheaply, ideally also making sure that smart, locally-informed solutions to complex problems (ones that impact various sub-systems) are developed so that societies' wellbeing is improved, for example by summarizing the insights of $1/hour enumerators?
Added: Here is an example of an RCT cost - you can speak with Cameron King or other EAs with experience in emerging economies regarding the possible enumeration and data management/categorization costs and Kaleem Ahmid or anyone from J-PAL/IPA regarding the non-enumeration costs and the possibility of their reduction (making a semi-automated form, such as this sample size calculator, training professors in emerging economies, ...).
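As an illustration of what a semi-automated sample-size calculator computes: a minimal sketch of the standard two-arm power calculation (normal approximation), with placeholder inputs.

```python
from scipy.stats import norm

def sample_size_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Two-arm RCT sample size under a normal approximation.

    effect: minimum detectable difference in means (placeholder input)
    sd:     outcome standard deviation, assumed equal across arms
    """
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return 2 * ((z_alpha + z_beta) * sd / effect) ** 2

# Placeholder: detect a 0.2-SD effect with 80% power at alpha = 0.05
print(round(sample_size_per_arm(effect=0.2, sd=1.0)))  # ~392 per arm
```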
7) Is it that if you give extremely poor people money, nothing changes for them in the medium term? Maybe they get a corrugated metal roof, some food, livestock (which keeps suffering; or I think one study found 56% of transfers spent on 'social' - throwing a larger wedding party, which means maybe killing three more goats?), or pay school fees early, but spousal cooperation does not improve, spending on education after a transfer increases by $1/month, and people do not start aspiring to upskill to take great care of their families and be a great joy to be with. Furthermore, because of relatively poor institutions in extremely poor areas, people can steal, bully, or attack without consequences. A few risky persons can depress the mood of an entire village. And GiveDirectly has so far served only 0.06% of the globally poor (the remaining 99.94% of the extremely poor, such as persons without income, are getting funding from other sources, such as family or neighbors). So, GiveDirectly is not solving problems. However, this funding could be used cost-effectively if it was given to at-risk groups (who could otherwise harm their communities) to run thriving programs for their communities - then, the entire village would enjoy some support rather than living in fear, for a fraction of the cost (maybe instead of 1,000 people, funding 5 - 200x lower); then, 12% of extremely poor communities could have already achieved some basic wellbeing. Is this accurate (further detail)?
Added: this is about interpreting cash-transfer research (e.g. on the GiveDirectly website) against local preferences, the wellbeing impact on stakeholders across species and times, and its counterfactuals. Possibly, there cannot be a model so complex, so a human brain or a system of these needs to be employed. Feel free to ask Joe Huston of GiveDirectly, from whom I have some of the research, as well as beneficiaries (communicate e.g. with EA Nairobi).
8) Founders Pledge knows that it is hurting animals and is doing nothing about it. It recommends Bandhan's Targeting the Hardcore Poor program, which can transfer either non-sentient assets or livestock (the non-sentient assets even perform better in poverty alleviation), along with training, to widows in India to graduate their families out of poverty. Is this accurate?
Added: see this paper, pp. 9-10, regarding the relative impact of non-sentient and livestock transfers.
9) The reason deworming is so cost-effective is that GlaxoSmithKline is providing the pills for free. But deworming seems not to work quite so well, which has been known by GiveWell since 2013. Can it be that manufacturers in nations where worm infections occur gain investments and training to make these drugs and to manufacture other medication for profit or global health reasons, while GSK is freed to research solutions to the other 19/20 neglected tropical diseases? Should donors be informed of the reasons for the cost-effectiveness of worm infection treatment (and its limitations), including the counterfactuals, such as GSK research?
Added: feel free to contact Dr. Harrison of the SCI Foundation for an understanding of the complex stakeholder relationships, or see papers on cost-effectiveness estimates for neglected tropical disease research. Caroline Fiennes critiques the programs. The Cochrane study regarding the limited impact of deworming may be central.
10) Does the pneumococcus vaccine save a life for $1-20? Can someone review this opportunity?
Added: The Disease Control Priorities research (Figure 17.2, page 323).
11) Is restoring eyesight at no cost possible, and a better alternative to the Fred Hollows Foundation, recommended by The Life You Can Save?
Added: Aravind Eye Care uses a sliding fee scheme (p. 51) to treat about half of its patients for free, and operates in only 5 locations in 2 states in India. Fred Hollows does not operate in India, so it is not targeting patients not captured by the market. I am not sure about any free government clinics in India, what the government would have spent its funding on if it did not have to subsidize eye operations, or any Aravind competitors and their pricing schemes and coverage.
12) OpenAI is not actually safe; it is more like helping the most privileged companies sell their products by highly skillful, intrusive advertising, which makes humans suffer a desired dystopia (they do not want the product but they get it, while feeling shame/fear/other negative emotions - but narrating positive emotions - that is the skill of the ad). Is this true?
13) What is the situation with ethics at DeepMind?
14) Can someone look into the impact of Holden Karnofsky's writing - does it limit people's critical thinking ability and make them act impulsively based on negative emotion, which can be suboptimal in EA?
Added: Cold Takes - an emotional analysis by someone uninfluenced by the normalization of negative-emotion advertising and impulsive behavior may be needed
15) Institutional change and longtermism: can it be that when sound institutions, based on trying to get great experiences for others and having fun with similarly skilled individuals, are developed, then there is little risk and wellbeing increases? This can somewhat unite everyone in EA: some care about (prioritize specializing in) humanity's survival, some work on implementing animal welfare, and some aspire to increase human welfare.
Added: FHI research areas - nothing on improving institutions to be generally competently caring. This is another example of a critique of an omission.
16) Do people really buy bednets after they get the first one, or not really (it seems that AMF keeps buying bednets)? Also, do only about 75% of recipients use the nets? Would it be possible to increase this, for example by implementing the importance of buying and using a net as part of a school curriculum?
Added: the linked Summary of AMF PDM results and methods [2020] (public)
17) Crucial consideration: Will EA cause a biocatastrophe because it is sharing material about how stoked people should be about biorisk - so much harm potential, becoming extremely affordable fast, and no one is prepared! Are there any ways to keep investing in preparedness programs (e.g. as a part of defense budgets); to reduce the possibility of accidental leaks without notifying under-resourced rebel groups in stressful, relatively unempathetic, globally poor environments about opportunities for developing weapons (e.g. 'we have some very boring investments in global health and healthcare equipment, which we are sharing alongside bandages and solar fridges that we particularly think are awesome'); to keep hiring and upskilling great people (e.g. through military academies); to sign agreements almost as a formality (this can be more challenging but should be a no-brainer); and to avoid involving the general public, who may think 'whoa, I'm going to research something in a garage, get a pandemic-potential pathogen, and become the boss through the threat potential', so that defenses and prevention frameworks are developed without, or before, selfish actors learning about these issues? Should the EA biosecurity Instagram that Max Dalton seems to be enthusiastic about be taken down? Should EA stop mentioning biorisk, at least in the 'wow, super stoked' register that normalizes the idea that the threat is imminent and anyone can basically do it?
Added: I think The Precipice mentioned something on increased enthusiasm regarding biorisk, but I have not seen any study on this, especially in the context of under-resourced terrorist groups. For illustration, feel free to review this article on the limited consideration and concrete demands of fighters in northeastern Nigeria, and to speak with someone from EA in a similar poor or extremely poor context (including in Africa or the Middle East) regarding risks and effective mitigation strategies, and with Jason Matheny or John Fogle regarding the possibilities of state military influence toward biological proliferation and non-state actors' increased interest in their own research, in the context of the US.
18) Is StrongMinds inappropriate, causing dystopias one cannot escape from? It may be that in industrialized nations, persons who suffer from limited competitiveness (in terms of establishing dominance by attracting the attention of persons who are not interested in interacting, in comparison with ads) benefit from speaking about their issues in groups (reduced loneliness - including loneliness with ads - and increased human interaction in an environment where no one really needs anything besides emotional support). In globally poor contexts, however, persons may be suffering from abuse and an inability to fulfill their and their close ones' basic needs, seeing no realistic way to escape this cycle, even across generations: talking about these concerns with others in a similar situation can contribute to feelings of hatred and betrayal - others are not helping and have been pretending to be OK but are actually suffering and not doing anything - and this is depressing. Metrics can be set up in a way that finds mental health improvement, but that can be experimenter bias among persons not interested in thinking about their wellbeing, and not indicative of people's subjective experiences. Can some impartial local critical thinker or a focus group be interviewed in depth about the impact of the program and alternative solutions? Also, would it make sense to treat the causes rather than the symptoms (e.g. improving spousal cooperation and reducing hunger, e.g. by improved farming practices)?
Added: here is a summary on the StrongMinds website; here is possibly the most recent report regarding their impact evaluation. This is a metrics-interpretation and appropriateness study - do people interpret 'depression free' as 'we do what the group facilitators tell us', not connecting with their subjective feeling in either case because of a possibly highly inconsiderate environment where children are unwanted? Is it more appropriate to measure willingness to be born (if one could choose), or subjective wellbeing measured in a way that motivates people to actually connect with their feelings during their usual activities?
19) Would we want to move beyond ITN and work with Bayesian/expert-intuition updating of the average impact cost, with optimal spending of resources over all impact areas (considering funded and unfunded unit-impact-cost developments - funded can be research on a global health vaccine, and unfunded can be a window of opportunity during a political campaign's animal-welfare momentum)? (See Institutional impact for more detail.)
Added: This post talks about momentary cost-effectiveness, while my post seeks to visualize average cost-effectiveness, considering its development within one area; it states that all areas should be considered, and mentions updating, but does not mention the possibility of external changes to cost-effectiveness.
20) Does the Introductory EA Program carry the legacy of 'let's convince hedge funders to give money to poor people, because there is also data'? Should it rather presume that people already care about solving important problems, so they do not need to be convinced to start thinking about it, and jumping to 'what to do' is more appropriate?
Added: I have not seen this specific critique in EA regarding how to focus on solving problems, but people talk about elitism (counterargument) and diversity (maybe people who know how to solve problems are excluded). Maybe ask someone from outside of EA whether EA comes across as people stoked about solving problems or more like outdebating you in theoretical impact - for example, the person who wrote this critique of The Precipice - because if you ask someone in EA and the problem is present, you do not advance much unless your questions are very concrete and based on an understanding of the problem.
david_reinstein @ 2022-04-28T00:16 (+8)
Thank you. This is a useful list. Some of these directly link academic work, or work that claims to find rigorous empirical results. In other cases I will have to dig into these to find 'what is the paper being cited, if any', which I will try to do.
brb243 @ 2022-04-28T17:26 (+3)
Thanks! I also added some more links. Some are issues of omission, analysis, or interpretation, so they may be especially challenging to spot and rationalize.
Denkenberger @ 2022-05-01T01:09 (+4)
Added: temperature change: FHI cited paper, general impact: A Model for the Impacts of Nuclear War (also cited by FHI) (GCR Institute authors) (also cites Robock) - does not quantify
It looks like you mean FLI, not FHI.
MichaelDickens @ 2022-04-20T19:08 (+6)
The link to "Unjournal" is broken, it goes to https://forum.effectivealtruism.org/posts/kftzYdmZf4nj2ExN7/bit.ly/eaunjournal instead of bit.ly/eaunjournal.
david_reinstein @ 2022-04-20T19:25 (+4)
thanks, fixing now ... I've made that mistake before in the forum
Michael_Wiebe @ 2022-06-13T05:34 (+4)
- Air pollution literature, relevant to OpenPhil.
- Literature on psychotherapy (vs. cash transfers)
Charles He @ 2022-04-20T19:37 (+4)
EDIT: Contrary to the text below, the study is not an RCT. Instead, there is a GiveWell-funded study collecting and exploring trend data for LLINs (malaria nets). https://www.givewell.org/research/incubation-grants/Malaria-Consortium-monitoring-Ondo-July-2021#Risks_and_reservations
An important upcoming study that I read about is a robustness study or RCT on the effectiveness of AMF, or of distributions of malaria nets.
(I think I read about this study in a post here. I am typing in mobile and can’t immediately find/post a link).
This RCT or result would be really important to EAs and maybe EA in general. Many people have donated a lot to AMF. The act of donating, the belief in AMF, as well as the method and process used, is part of EA identity.
Maybe EAs should not be worried about finding the underlying truth as a result of the study, but maybe there should be worry or attention to clunky or simplistic presentation or socialization of the results.
For example, there are many reasons we might see superficially negative results (suggesting cost-inefficiency or ineffectiveness of nets) that don't reflect a reality undermining AMF. For example, just unmeasured rainfall, or noise from any number of sources, could cause an issue in seeing an effect (issues with statistical inference or identification).
I think explaining causal inference beforehand (as opposed to after) might be useful.
A reasonable guess is that an RCT might only be 50% reliable (including claiming to be reliable but having a hidden defect). The chance of some negative event from this study (other than the consequence of showing the truth) might be ~5-15%.
david_reinstein @ 2022-04-20T19:57 (+4)
This seems very relevant; if you find the name/link, please do add it in the Airtable or just here in the comments. Thanks.
Charles He @ 2022-04-20T20:10 (+4)
Ok I think I found my “source”: https://www.givewell.org/research/incubation-grants/Malaria-Consortium-monitoring-Ondo-July-2021
It seems valuable, but it doesn’t seem to be an RCT. I can’t immediately tell what it is but it looks like collecting trend data without a control group. (To onlookers, I know that sounds frowned upon but it’s a real thing and probably judging the value of the design requires great domain knowledge.)
So it looks a lot less pivotal than an “RCT on AMF”.
So my original answer above might have been misleading.
david_reinstein @ 2022-07-01T08:50 (+2)
FYI, the bounty resolves in 3 days.
Submit ~today or tomorrow for your last chance to suggest research & be eligible for participation/piloted suggestion prizes.
Submit suggestions at bit.ly/pp_unjournal
PeterSlattery @ 2022-04-26T00:00 (+2)
A quick response that I may build on later. I only scanned your write-up of the plan, so sorry if I missed something there.
I think that inviting submissions from research in preprints or written up in EA forum posts is a good idea.
A submission would simply be giving you permission to publish/host the original document and reviews in the unjournal. Post review, authors could have the option to provide a link to their revised output, or to add a comment.
As you know, newsletters such as the EA Behavioral Science Newsletter (https://preview.mailerlite.com/m9i6r0j7h9) curate some options, so this could be an easy place to start.
I know this is your idea so maybe you win your own bounty if you like it!?
Would RP not have many research outputs that could be included? READI might also have some upcoming work on institutional decision-making or moral circle expansion that could be considered (though I'd need to talk with the team, etc).
Aside from those, reviewing relevant old but influential reports and top forum posts could also be valuable.
One that comes to mind now is How valuable is movement growth?
At least for me, The Awareness/Inclination Model (AIM) in this seemed, for a while, to be a popular theory influencing how EA people thought about movement building.
A review of it would help with understanding how confident we should be (or have been) in the empirical data presented for the arguments made, and might also throw up some ideas to build on it or test it.
Finally, I think that treating the first few rounds of doing this as being an experiment is probably a good idea. It might be the case that only a certain type of paper/output works, or that reviews are more useful than you imagined even for relatively low level research. Probably hard to tell until you do a few rounds of the process.
david_reinstein @ 2022-04-26T00:18 (+2)
I think that inviting submissions from research in preprints or written up in EA forum posts is a good idea.
Definitely the former, but which ones? As to EA forum posts, I guess they mainly (with some exceptions) don't have the sort of rigor that would permit the kind of review we want... And that would help unjournal evaluations become a replacement for academic journal publications?
A submission would simply be giving you permission to publish/host the original document and reviews in the unjournal. Post review, authors could have the option to provide a link to their revised output, or to add a comment.
Actually, I don't propose to host or publish anything. Just linking a page with a DOI ... and reviewing it based on this, no?
As you know, newsletters such as the EA Behavioral Science Newsletter (https://preview.mailerlite.com/m9i6r0j7h9) curate some options, so this could be an easy place to start.
I think I should go through this carefully, for sure.
One that comes to mind now is "How valuable is movement growth?"
I think reviewing EA forum posts is very valuable, but this is a separate thing from what I'm trying to do with Unjournal. If we include this in the 'same batch of things' it would probably drive away the academics and very serious researchers, no?
At least for me, The Awareness/Inclination Model (AIM) in this seemed, for a while, to be a popular theory influencing how EA people thought about movement building.
A review of it would help with understanding how confident we should be (or have been) in the empirical data presented for the arguments made, and might also throw up some ideas to build on it or test it.
Again, I don't think it has been written up in a way that is aiming at rigorous peer review, has it?
Finally, I think that treating the first few rounds of doing this as being an experiment is probably a good idea. It might be the case that only a certain type of paper/output works, or that reviews are more useful than you imagined even for relatively low level research. Probably hard to tell until you do a few rounds of the process.
I think you might be right. I should dive in and be less delicate with this. Partial excuse for slowness so far: I'm waiting for a grant to come through that should give me some more hours to work on this (fingers crossed)
Thanks for the responses!
PeterSlattery @ 2022-04-29T06:20 (+4)
See >PS>
I think that inviting submissions from research in preprints or written up in EA forum posts is a good idea.
Definitely the former, but which ones?
>PS> Yeah, the only easy options I can suggest now are to consider some of items in the BS newsletter
As to EA forum posts, I guess they mainly (with some exceptions) don't have the sort of rigor that would permit the kind of review we want... And that would help unjournal evaluations become a replacement for academic journal publications?
>PS> This is probably a bigger discussion, but this makes me realise that that one difference between us is that I probably want the unjournal (and social science in general) to accept a lower level of rigor than most journal (perhaps somewhere between a very detailed forum/blog post and a short journal or conference article).
One reason is that I personally think that most social science journal articles sacrifice too much speed for better quality, given heterogeneity etc. I'd prefer maybe 10x as much research, at .1x the quality. To be clear, I am keen on keeping the key parts (e.g., a good method and explanation of theory and findings), but not having so much of the fluff (e.g., summarising much prior or future potential research etc).
A second is that I expect a lot more submissions near the level of conference work or a detailed forum post than at journal level. There are probably 100x more forum posts and reports produced than journal articles. Additionally, there is a lot of competition for journal-level submissions. If you expect an article to get accepted at a journal then you will probably submit it to one. On the other hand, if you wrote up a report pretty close to journal level in some regards and have nowhere to put it, or no patience with the demands of a journal or the uncertainty, then the unjournal is relatively attractive given the lack of alternatives.
A submission would simply be giving you permission to publish/host the original document and reviews in the unjournal. Post review, authors could have the option to provide a link to their revised output, or to add a comment.
Actually, I don't propose to host or publish anything. Just linking a page with a DOI ... and reviewing it based on this, no?
>PS> Yeah, sounds good.
As you know, newsletters such as the EA Behavioral Science Newsletter (https://preview.mailerlite.com/m9i6r0j7h9) curate some options, so this could be an easy place to start.
I think I should go through this carefully, for sure.
One that comes to mind now is "How valuable is movement growth?"
I think reviewing EA forum posts is very valuable, but this is a separate thing from what I'm trying to do with Unjournal. If we include this in the 'same batch of things' it would probably drive away the academics and very serious researchers, no?
>PS> Yeah, so that's a good point. I think that it gets into the points above. Perhaps you can have different types of submissions (e.g., work in progress, opinion etc?) .You could treat it like some other journals have and scale up expectations over time once it starts getting known?
At least for me, The Awareness/Inclination Model (AIM) in this seemed, for a while, to be a popular theory influencing how EA people thought about movement building.
A review of it would help with understanding how confident we should be (or have been) in the empirical data presented for the arguments made, and might also throw up some ideas to build on it or test it.
Again, I don't think it has been written up in a way that is aiming at rigorous peer review, has it?
>PS> Perhaps not. Maybe that's something that authors need to answer. Regardless, I think that there would be a lot of value in these sort of reports getting peer reviewed by academics/experts, especially where they are influential in the EA community.
Finally, I think that treating the first few rounds of doing this as being an experiment is probably a good idea. It might be the case that only a certain type of paper/output works, or that reviews are more useful than you imagined even for relatively low level research. Probably hard to tell until you do a few rounds of the process.
I think you might be right. I should dive in and be less delicate with this. Partial excuse for slowness so far: I'm waiting for a grant to come through that should give me some more hours to work on this (fingers crossed)
>PS> I think you are doing a good job and I am not sure I am giving good advice! However, it could be the case that you want to use this process to test some assumptions and processes (e.g., about how many people will submit, what sorts of articles you will get, how long things will take, how best to show outputs) etc.
Thanks for your work on it!
david_reinstein @ 2022-04-29T14:20 (+4)
I agree on most of your counts.
Regardless, I think that there would be a lot of value in these sorts of reports getting peer reviewed by academics/experts, especially where they are influential in the EA community.
I agree, but I don't think this is what the Unjournal should handle right now. It should be done, but maybe with a different vehicle and approach.
I'd prefer maybe 10x as much research, at .1x the quality.
I tend to disagree with this. My concern is that most/much research is often 'vomited out' to satisfy tenure requirements and other needs to "publish something". There is just so much research and writing out there to wade through.
I typically question...
- Are the empirical results trustworthy? Can we have confidence in their validity and generalizability ... and the interpretation the authors give?
- Is anyone reading these papers (and if so, are they understanding them or just skimming and getting the wrong impressions)?
- Are these being combined with other work, and with replication, to give a general picture of 'what we know and with what confidence'? Is it entering into a general building of our knowledge and ability to learn more and use the work?
PeterSlattery @ 2022-05-11T05:50 (+4)
Thanks for replying.
When I say I'd prefer maybe 10x as much research at .1x the quality, I don't want to miss out on quality overall. Instead, I'd like more small-scale, incremental, and iterative research, where the rigour and the length increase in proportion to the expected ROI. For instance, this could involve a range of small studies that increase in quality as they show evidence, followed by a rigorous review and replication process.
I also think that the reason for a lot of the current research vomit is that we don't let people publish short and simple articles. I think that if you took most articles and pulled out their method, results and conclusion, you would give the reader about 95% of the value of the article in maybe 1/10th the space/words of the full article.
If a researcher just had to write these sections and a wrapper rather than plan and coordinate a whole document, they might produce and disseminate their insights in 2-5% of the time that it currently takes.
david_reinstein @ 2022-04-18T23:04 (+1)
Working on an approach here ... for some more context and ideas
david_reinstein @ 2022-04-18T21:20 (+1)
I think I should put a bounty on this question. Any suggestions for how to best implement this?
JP Addison @ 2022-04-18T23:06 (+3)
Write a bounty into the post and tag it with Bounty (Open).
david_reinstein @ 2022-04-20T00:28 (+1)
Thanks. Will do this soon. Already tagged/flagged it above, but waiting on a couple things to formalize the bounty.
david_reinstein @ 2022-04-15T16:09 (+1)
Note: I linked a form above, if you prefer to respond that way.
The form asks
The most pivotal empirical pieces of research ... you would like to see red-teamed/assessed?
Give a quick name for this research, this finding, or the title of the paper/project
A URL where we can find the research, if possible
What is the finding/research and why is it important? How is it relevant to important decisionmaking? Why do you think it needs further assessment or red-teaming?
You can identify yourself and/or your background/expertise/experience if you want, but it's not necessary
How confident are you that this research is worth assessing or red-teaming?