Announcing the EA Archive
By Aaron Bergman @ 2023-07-06T13:49 (+70)
About the EA Archive
The EA Archive is a project to preserve resources related to effective altruism in case of a sub-existential catastrophe such as nuclear war.
Its more specific, downstream motivating aim is to increase the likelihood that a movement akin to EA (i.e., one that may go by a different name and be essentially discontinuous with the current movement, but share the broad goal of using evidence and reason to do good) survives, reemerges, and/or flourishes without having to re-invent the wheel, so to speak.
It is a work in progress, and some of the subfolders at the referenced Google Drive are already slightly out of date.
Theory of Change
The theory of change is simple, if not very cheerful to describe: if copies of this information exist in many places around the world, on devices owned by many different people, it is more likely that at least one copy will remain accessible after, say, a war that kills most of the world's population.
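As a rough illustration of this reasoning, here is a minimal sketch. It assumes each copy survives independently with the same probability, which is a simplification: copies in the same region would likely share a fate, which is why geographic spread matters. The function name and the 5% figure are hypothetical, chosen only to show the shape of the argument.

```python
def prob_at_least_one_survives(p_single: float, n_copies: int) -> float:
    """Probability that at least one of n copies survives.

    Assumes (hypothetically) that each copy survives independently
    with probability p_single, so P = 1 - (1 - p_single) ** n_copies.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_copies < 0:
        raise ValueError("n_copies must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_copies


if __name__ == "__main__":
    # Illustrative only: a made-up 5% survival chance per copy.
    for n in (1, 10, 50, 100):
        print(n, round(prob_at_least_one_survives(0.05, n), 3))
```

Even with a low per-copy survival chance, the probability that something survives rises quickly with the number of copies, as long as their fates are not strongly correlated.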
Structure
As shown in the screenshot, there are three folders. The smallest one, "Main content," contains html, pdf, and other static, text-based files. It is by far the most important to download.
If for whatever reason space isn't an issue and you'd like to download the larger folders too, that would be great.
I will post a shortform quick take (at least) when there's been a major enough revision to warrant me asking for people to download a new version.
How you can help
1) Download!
This project depends on people like you downloading and storing the Archive on a computer or flash drive that you personally have physical access to, especially if you live in any of the following:
- Southeast Asia + Pacific (esp. New Zealand)
- South and Central Africa
- Northern Europe (esp. Iceland)
- Latin America, Mexico City and south (esp. Ecuador, Colombia, and Argentina)
- Any very rural area, anywhere
If you live in one of the shaded areas (green or blue), I would love to buy you a flash drive to make this less annoying and/or enable you to store copies in multiple locations, so please get in touch via the Google Form, DM, or any other method.
2) Suggest/submit, and provide feedback
Currently, the limiting factor on the Archive's contents is my ability and willingness to find and identify relevant resources and then scrape or download them (i.e., not the cost or feasibility of storage). If you notice something that ought to be in there but isn't, please use this Google Form to do any of the following...
- Let me know what it is, broadly (good)
- Send me a list of urls containing the info (better)
- Send me a Google Drive link with the files you'd like added (best)
- Provide any general feedback or suggestions
I may have to be somewhat judicious about large video and audio files, but virtually any relevant and appropriate pdf/text/web/spreadsheet content should be fine.[1]
3) Share
Send this post to friends, especially other EAs who do not regularly use the Forum!
FAQ
How sure are you that this is at all necessary/helpful?
Not super sure! All things considered I think there's like a 70% chance I'd have done at least this much if I had done a lot more research.
Wasn't there a tweet?
Yeah, from an embarrassingly long time ago.
Wasn't there a website?
Yeah, but it proved more trouble than it was worth.
Contact info
- Email: aaronb50@gmail.com
- Twitter (DM): @AaronBergman18
- Also DM or comment here, on the Forum!
[1] Fun fact: the plain text of all EA Forum posts takes up about 100MB (as of a few weeks ago), equivalent to about 2 hours of decent quality audio.
RedStateBlueState @ 2023-07-07T02:57 (+20)
This might be a dumb question, but shouldn't we be preserving more elementary resources to rebuild a flourishing society? Current EA is kind of only meaningful in a society with sufficient abundant resources to go into nonprofit work. It feels like there are bigger priorities in the case of sub-x-risk.
Aaron Bergman @ 2023-07-07T16:36 (+7)
I’ve definitely thought about this and short answer: depends on who “we” is.
A sort of made up particular case I was imagining is “New Zealand is fine, everywhere else totally destroyed” because I think it targets the general class of situation most in need of action (I can justify this on its own terms but I’ll leave it for now).
In that world, there’s a lot of information that doesn't get lost: everything stored in the laptops and servers/datacenters of New Zealand (although one big caveat and the reason I abandoned the website is that I lost confidence that info physically encoded in eg a cloud server in NZ would be de facto accessible without a lot of the internet’s infrastructure physically located elsewhere), everything in all its university libraries, etc.
That is a gigantic amount of info, and seems to pretty clearly satisfy the “general info to rebuild society” thing. FWIW I think this holds if only a medium size city were to remain intact, not certain if it’s say a single town in Northern Canada, probably not a tiny fishing village, but in the latter case it’s hard to know what a tractable intervention would be.
But what does get lost? Anything niche enough not to be downloaded on a random NZer’s computer or in a physical book in a library. Not everything I put in the archive, to be sure, but probably most of it.
Also, 21GB of the type of info I think you’re getting at is in the “non EA info for the post apocalypse” folder because why not! :)
Holly Morgan @ 2023-07-07T12:45 (+7)
That was my first thought, but I expect many other individuals/institutions have already made large efforts to preserve such info, whereas this is probably the only effort to preserve core EA ideas (at least in one place)? And it looks like the third folder - "Non-EA stuff for the post-apocalypse" - contains at least some of the elementary resources you have in mind here.
But yeah, I'm much more keen to preserve arguments for radical empathy, scout mindset, moral uncertainty etc. than, say, a write-up of the research behind HLI's charity recommendations. Maybe it would also be good to have an even smaller folder within "Main content (3GB)" with just the core ideas; the "EA Handbook" (39MB) sub-folder could perhaps serve such a purpose in the meantime.
Anyway, cool project! I've downloaded :)
RedStateBlueState @ 2023-07-07T13:27 (+4)
Yeah I guess that makes sense. But uh... have other institutions actually made large efforts to preserve such info? Which institutions? Which info?
Holly Morgan @ 2023-07-07T15:42 (+9)
Huh, maybe not.
Might be worth buying a physical copy of The Knowledge too (I just have).
And if anyone's looking for a big project...
If we take catastrophic risks seriously and want humanity to recover from a devastating shock as far and fast as possible, producing such a guide before it’s too late might be one of the higher-impact projects someone could take on.
dsj @ 2023-07-09T16:59 (+9)
Another easy thing you can do, which I did several years ago, is download Kiwix onto your phone, which allows you to save offline versions of references such as Wikipedia, WikiHow, and way, way more. Then also buy a solar-powered or hand-crank USB charger (often built into disaster radios such as this one which I purchased).
For extra credit, store this data on an old phone you no longer use, and keep that and the disaster radio in a Faraday bag.
Holly Morgan @ 2023-07-10T20:10 (+3)
All done :-) (already had a solar/crank charger+radio). Thank you!
RomanHauksson @ 2023-07-06T23:06 (+9)
Can we set up a torrent link for this?
Aaron Bergman @ 2023-07-07T02:27 (+3)
I have only a vague idea what this means but yeah, whatever facilitates access/storage. Is there anything I should do?
RomanHauksson @ 2023-07-08T04:46 (+6)
I can look into how to set up a torrent link tomorrow and let you know how it goes!
RomanHauksson @ 2023-11-14T08:03 (+1)
Sorry, I never got around to this. If someone wants to take this up, feel free!
https://www.lesswrong.com/posts/bkfgTSHhm3mqxgTmw/loudly-give-up-don-t-quietly-fade
Larks @ 2023-07-06T14:35 (+8)
Interesting idea; what is the thought process behind the map?
Aaron Bergman @ 2023-07-06T14:44 (+7)
It’s actually been a little while since I made it, but the idea was to highlight places most likely to both (1) not be direct targets of a nuclear attack and (2) be uncorrelated with the fates of major datacenters plausibly holding the information currently.
Yarrow Bouchard🔸 @ 2025-11-01T12:56 (+3)
I briefly touched on the topic of "doomsday archives" in a recent post. As far as I'm aware currently, the two most notable organizations working in this area are the Arctic World Archive, whose archive is located in Svalbard and whose headquarters is in mainland Norway, and the Arch (pronounced "ark") Mission Foundation based in the U.S. with archives in multiple locations, including on the Moon — really.
I'm personally interested in storage media that are long-lasting and resilient to major disasters and don't require a continuous electricity supply or continuous copying over to new discs or new hard drives. Paper is probably the best medium, all things considered. It's cheap, it lasts a long time, and anyone can read it without anything other than the paper itself, as long as they can read. The main disadvantage is storage density.
In the form of ebooks, you could keep 50 million books on fifty 20 TB hard drives in a closet for about $10,000. Storing 50 million paper books would cost many millions of dollars, both for the books and the building to keep them in. So, hard drives are unbelievably superior in terms of storage density. But they are incredibly short-lasting and non-resilient to major disasters.
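To make the arithmetic behind these figures explicit, here is a small sketch. It uses only the numbers quoted above (50 million books, fifty 20 TB drives, about $10,000) and assumes decimal units (1 TB = 1,000,000 MB); the function and key names are hypothetical.

```python
def ebook_archive_summary(books: int, drives: int, drive_tb: float,
                          total_cost_usd: float) -> dict:
    """Derive per-book and per-drive figures from the headline numbers.

    Assumes decimal storage units: 1 TB = 1,000,000 MB.
    """
    total_tb = drives * drive_tb
    return {
        "total_tb": total_tb,
        "mb_per_book": total_tb * 1_000_000 / books,
        "usd_per_drive": total_cost_usd / drives,
        "usd_per_tb": total_cost_usd / total_tb,
    }


if __name__ == "__main__":
    # Figures taken from the paragraph above.
    summary = ebook_archive_summary(
        books=50_000_000, drives=50, drive_tb=20, total_cost_usd=10_000
    )
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Under these assumptions the estimate works out to 1,000 TB in total, an average budget of 20 MB per ebook, and $200 per drive.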
New technologies are trying to combine the long-lastingness of paper with the storage density of hard drives. The Arctic World Archive uses piqlFilm, developed by its associated for-profit company, Piql, in Norway. piqlFilm involves printing QR codes on film, estimated to last 2,000 years under ideal conditions, along with normal text instructions for decoding the QR codes. (You can also just print text or black-and-white images on it, but that's much less information dense.) The Arch Mission Foundation uses Nanofiche, which is text and images engraved on nickel. It only requires magnification to read.
Microsoft is working on putting data into quartz glass. The research project is called Project Silica. The storage medium is a clear square piece of quartz glass about the size of a floppy disk or a glass coaster. Each one can hold 7 TB. The estimated longevity is over 10,000 years. However, it requires advanced technology to read the data. It isn't intended for a doomsday archive. Microsoft is exploring its use for cold storage for cloud data as a competitor to hard drive storage, on the thinking that rarely accessed data could be more cheaply stored on quartz glass than on idle hard drives.
Encoding data on dehydrated DNA is another idea that has received a lot of attention and is an active area of research. Dehydrated DNA can last a long time. DNA is also incredibly information dense. But it is ungodly expensive. It also requires advanced technology to read.
We are somewhere in between having no good options for storage and having the perfect, magical solution. Paper is really excellent and might be the best option overall. piqlFilm is a very intriguing option, but it costs about $30 per GB and requires cameras and computers to interpret. I don't know the cost of Nanofiche, but I imagine it's expensive.
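For a sense of scale, here is a rough sketch of what the roughly $30 per GB piqlFilm price would imply for a few sizes mentioned in this thread. The sizes come from the post and comments; the function name is hypothetical, decimal units (1 GB = 1,000 MB) are assumed, and real pricing may well differ from a flat per-GB rate.

```python
PIQLFILM_USD_PER_GB = 30  # approximate price quoted above


def piqlfilm_cost_usd(size_mb: float,
                      usd_per_gb: float = PIQLFILM_USD_PER_GB) -> float:
    """Estimated cost in USD of storing size_mb megabytes at a flat per-GB rate.

    Assumes decimal units: 1 GB = 1,000 MB.
    """
    if size_mb < 0:
        raise ValueError("size_mb must be non-negative")
    return size_mb * usd_per_gb / 1000


# Sizes (in MB) mentioned elsewhere in this thread.
EXAMPLE_SIZES_MB = {
    "EA Handbook sub-folder (39 MB)": 39,
    "EA Forum plain text (~100 MB)": 100,
    "Main content folder (3 GB)": 3_000,
    "Fifty 20 TB drives (1,000 TB)": 1_000_000_000,
}

if __name__ == "__main__":
    for label, size_mb in EXAMPLE_SIZES_MB.items():
        print(f"{label}: ${piqlfilm_cost_usd(size_mb):,.2f}")
```

At that rate, the 3 GB main folder would come to about $90, while a 1,000 TB ebook collection would run to roughly $30 million, which is why piqlFilm looks plausible for small, high-value archives but not for bulk storage.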
In a way, books and libraries are already fairly good doomsday archives. Particularly since they are everywhere. The main three ways paper could be improved upon are:
- Storing data types other than text (with a rare black-and-white photo or diagram)
- Storing a lot more information in a small physical volume for a manageable cost, e.g. storing the equivalent of 50 million ebooks in a closet for $10,000
- Extending longevity even further than the few hundred years estimated for modern acid-free paper with a small alkaline buffer (although longevity depends a lot on storage conditions, particularly humidity, temperature, and exposure to direct sunlight — cool, dry, and dark is ideal)
On the first point, I think piqlFilm might be the current best option for storing recorded music for hundreds of years (or more). Vinyl records might also be quite good, although it's hard to find any reliable, hard data on their estimated longevity. Every source seems to just say about 100 years without any explanation of how that estimate was arrived at.
Movies and other video can be printed on film. Movie studios routinely store a copy of heavily digital movies like Avengers: Endgame on analog film as part of their standard backup strategy. Modern film, which uses the same polyester base as piqlFilm, apparently has good longevity and is expected to last for centuries when stored properly.
The part of doomsday archiving that is possibly more vexing than just storing the data is figuring out how to present information to a society in ruins that has suffered potentially decades or centuries of disruption, decline, or collapse from a major disaster like a nuclear war (God forbid). This is particularly salient when the storage medium requires building a certain level of technology to read it, which is the case for every medium except paper. But even apart from that, there are questions of how to present the information in a way that is understandable, that shows people where to start, and tells them what's most important. You can't necessarily rely on people in a future scenario like that knowing what to look for and if the pile of information is extremely large — e.g. if it contains millions of books — then it could be daunting.
If anyone wants to ask me random questions about this or related topics to see if my research has already turned up an answer for you, please ask away (even if you're seeing this comment a long time from now, although you may need to send me a private message that includes your email address).
Aaron Bergman @ 2025-11-05T02:27 (+3)
Interesting, thanks! I might actually sign up for the Arctic Archive thing! I don't see you mention m-discs like this - any reason for that?
Also, do you have any takes on how many physical locations a typical X is stored in, for various X?
X could be:
- A wikipedia page
- An EA Forum post
- A YouTube video
- A book that's sold 100/1k/10k/100k/1M copies
- Etc
Yarrow Bouchard🔸 @ 2025-11-05T14:39 (+3)
M-Discs are certainly interesting. What's complicated is that the company that invented M-Discs, Millenniata, went bankrupt, and that has sort of introduced a cloud of uncertainty over the technology.
There is a manufacturer, Verbatim, with the license to manufacture discs using the M-Disc standard and the M-Disc branding. Some customers have accused Verbatim of selling regular discs with the M-Disc branding at a huge markup. This accusation could be completely wrong and baseless (Verbatim has denied it), but it's sort of hard to verify what's going on anymore.
If Millenniata were still around, they would be able to tell us for sure whether Verbatim is still complying properly with the M-Disc standard and whether we can rely on their discs. I don't understand the nuances of optical disc storage well enough to really know what's going on. I would love to see some independent third-party who has expertise in this area and who is reputable and trustworthy tell us whether the accusations against Verbatim are really just a big misunderstanding.
Millenniata's bankruptcy is an example of the unfortunate economics of archival storage media. Rather than pay more for special long-lasting media, it's far more cost-effective to use regular, short-term storage media — today, almost entirely hard drives — and periodically copy over the data to new media. This means the market for archival media is small.
As for how many physical locations digital data is kept in, that depends on what it is. The CLOCKSS academic archive keeps digital copies of 61.4 million academic papers and 550,000 books in 12 distinct physical locations. I don't know how Wikipedia does its backups, mirroring, or archiving internally, but every month an updated copy of the English Wikipedia is released that anyone can download. Given Wikipedia's openness, it is unusually well-replicated across physical locations, just considering the number of people who download copies.
I also don't know how the EA Forum manages its backups or archiving internally, but a copy of posts can be saved using the Wayback Machine, which will create at least 2 additional physical copies on the Internet Archive's servers. I don't know what Google does with YouTube videos. I think for Google Drive data they keep enough data to recover files in at least two physically separate datacentres, but those could be two datacentres in the same region. I also don't know if they do the same for YouTube data — I hope so.
I think in the event of a global catastrophe like a nuclear war, what we should think about is not whether the data would physically survive somewhere on a hard drive, but, more practically, whether it would ever actually be recovered. If society is in ruins, then it doesn't really matter if the data physically survives somewhere unless it can be accessed and continually copied over so that it's preserved. Since hard drives last for such a short time, the window of time for society to recover enough to find, access, and copy the data from hard drives is quite narrow.
I don't know if you were asking about paper books or ebooks, but for paper books, it seems clear that for any book on the New York Times bestseller list, there must be at least one copy of that book in many different libraries, bookstores, and homes in many locations. I don't know how to think about the probability of copies ending up in Argentina, Iceland, or New Zealand, but it seems like at least a lot of English bestsellers must end up in various libraries, stores, and homes in New Zealand.
Paper books printed on acid-free paper with a 2% alkaline reserve, which, as far as I understand, is the standard for paper books printed over the last 20 years or so, are expected to last over 100 years provided they are kept in reasonably cool, dry, and dark conditions. I'm not sure how exactly the longevity would be estimated to change for books kept in a tropical climate vs. a temperate one. The 2% alkaline reserve on the paper is so that as the natural acid in the cellulose in the paper is slowly released over time, the alkaline counteracts it and keeps the paper neutral. Paper is really such a fascinating technology and more miraculous than we give it credit for.
Vinyl records are more important for preserving culture — specifically music — rather than knowledge or information, but it's interesting that vinyl sales are so high and that vinyl would probably end up being the most important technology for the preservation of music in some sort of global disaster scenario. In 2024, the top ten bestselling albums on vinyl in the U.S. sold between 175,000 copies (for Olivia Rodrigo at #10) and 1,489,000 copies (for Taylor Swift at #1). The principle here is the same as for paper books. You have to imagine these records are spread out all over the United States. Given that both vinyl records and many of the same musicians are popular in other countries like Canada, the UK, Australia, and New Zealand, it seems likely there are many copies elsewhere in the world too.
Since looking into this topic, I have warmed considerably on vinyl. I didn't really get the vinyl trend before. I guess I still don't, really, but now I think vinyl is a wonderful thing, even if the reasons people are buying it are not that it makes the preservation of music more resilient to a global disaster.
I didn't need any convincing to be fond of paper books, but paper just seems more and more impressive the more I think about it.
JuanGarcia @ 2023-07-10T12:30 (+3)
May I suggest: https://allfed.info/research/publications-and-reports
Aaron Bergman @ 2023-07-10T16:51 (+3)
Done, thanks!
EdoArad @ 2023-07-10T07:52 (+3)
Note also that the Internet Archive (Wayback Machine) is working on an offline archive (which, if I understand correctly, is intended to be installable as a local server to have a copy of some parts of the web which you could "load" into a browser and navigate pages ordinarily).
I think it'd be cool to have a collection for effective altruism-related resources, which then maybe would be picked up by some people saving offline copies.
Jeroen Willems @ 2023-07-09T08:53 (+2)
What's the likelihood that even in 'incredible' places there would be electricity? For some reason I always assumed there would basically be no electricity during a major global catastrophe, which is possibly incorrect. But does it make sense to have paper copies too? What's the trade-off here?
Aaron Bergman @ 2023-07-09T14:59 (+4)
Even given no electricity, copies stored physically in e.g. a flash drive or hard drive would persist until electricity could be supplied, I'm almost certain.