Future Academy - Successes, Challenges, and Recommendations to the Community
By SebastianSchmidt, Vilhelm Skoglund, Lowe Lundin @ 2023-07-08T20:15 (+61)
Introduction
Impact Academy is a new field-building and educational institution seeking to enable people to become world-class leaders, thinkers, and doers, using their careers and character to solve the world’s most pressing problems and create the best possible future. Impact Academy was founded by Vilhelm Skoglund, Sebastian Schmidt, and Lowe Lundin. We have already secured significant funding to set up the organization and carry out ambitious projects in 2023 and beyond. Please read this document for more about Impact Academy, our Theory of Change, and our two upcoming projects.
The purpose of this document is to provide an extensive evaluation of and reflection on Future Academy - our first program (and experiment). Future Academy aimed to equip university students and early-career professionals worldwide with the thinking, skills, and resources they need to pursue ambitious and impactful careers. It was a free six-month program consisting of four in-person weekends with workshops, presentations, and socials, plus monthly digital events. Furthermore, the 21 fellows worked on an impact project with an experienced mentor and received professional coaching to empower them to increase their impact and become their best selves. Upon completion of the program, all participants went to a global impact conference (EAGx Nordics), where four fellows presented their projects. We awarded stipends totaling $20,000 to the best projects. The projects included a sentiment analysis of public perception of AI risk, a philosophy paper on AI alignment, and an organization idea for improving research talent in Tanzania. Our faculty included entrepreneurs and professors from Oxford University and UC Berkeley.
Note that this document attempts to assess to what extent we’ve served the world. This involves an assessment of the wonderful fellows who participated in Future Academy, and of our ability to help them. It is not meant as an evaluation of people’s worth, nor as a definitive score of their general abilities, but as an evaluation of our ability to help. We hope we do not offend anyone and have tried our best not to do so, but if you think we have written anything inappropriate, please let us know in the comments or by reaching out to sebastian [at] impactacademy.org.
Main results and successes
- We confirmed a key hypothesis underlying Future Academy - namely that we can attract promising and talented people who i) have no or moderate knowledge of Effective Altruism and longtermism, ii) come from underserved regions (e.g., 40% came from Southern and Eastern Europe), and iii) are more diverse (e.g., 56% were female).
- We created a bespoke primary metric called counterfactual expected career contribution (CECC), inspired by 80,000 Hours’ IASPC metric and the Centre for Effective Altruism’s HEA metric. We think the total score was 22.8, and ~four fellows accounted for the majority of that score.
- To give an understanding of the CECC metric, we’ll give an example (a small code sketch at the end of this list walks through the same arithmetic). Take an imaginary fellow, Alice. Before the intervention, based on our surveys and initial interactions, we expected that she might have an impactful career, but that she was unlikely to pursue a priority path based on IA principles. We rated her Expected Career Contribution (ECC) at 2. After the program, based on surveys and interactions, we rated her ECC at 10 because we saw that she was applying for a full-time junior role in a priority path guided by impartial altruism. We also asked her (and ourselves) to what extent that change was due to IA and estimated that to be 10%. To get our final Counterfactual Expected Career Contribution (CECC) for Alice, we subtract her initial ECC score of 2 from her final score of 10 to get 8, then multiply that difference by 0.1 to get the portion of the expected career contribution we believe we are responsible for. The final score is 0.8 CECC. As a formula: (10 (ECC after the program) - 2 (ECC before the program)) × 0.1 (our counterfactual influence) = 0.8 CECC. See here for a more detailed description of the metric.
- It seems that we played a substantial counterfactual role (~50%) in bringing about three to four people with an expected career contribution of 10. This corresponds to someone who is now actively searching for a full-time junior role, or who got an internship in a priority path and is on track to make a senior-level contribution - see an example of a case study here.
- There were 11 fellows with a CECC score of 2 (meaning they may or may not eventually become a 10), for whom we would have played a substantial counterfactual role (~50%) in their progress if that were to happen. See an example of a case study here.
- We had a control group of the top 30 rejectees (only 14/30 responded to our surveys), so this was a quasi-experiment, not a randomized controlled trial. 70% (10 of the 14 respondents) didn’t do anything similar to Future Academy. Of those who did, we think two would likely have been better off doing Future Academy, one was probably better off not doing it, and for one we are unsure. The control group’s final ECC was 40% lower than the fellows’, which might support our assumption that we can create counterfactual impact - and counterfactually engage people in EA - via programs like Future Academy.
- The fellows were very satisfied, with an average satisfaction score of 9.6/10 (86% response rate). We also asked the control group how satisfied they were with what they did instead of Future Academy; their average satisfaction was 6.7.
- Future Academy seems to have substantially increased the fellows' number of connections relevant for doing good: the average rose from 3 (before) to 17 (after). This corresponds to an increase by a factor of 4.5 (the control group's connections increased by a factor of 2).
- 92% of the fellows reported that they changed their career in ways they think will make it more impactful (compared to 71% of the control group).
- Our focus on well-being and general human development via coaching appears to have been a success. Two of the four well-being metrics showed a statistically significant increase with a moderate-to-strong effect size (the other two didn’t change in statistically significant ways). Relatedly, coaching increased overall engagement with the program (fellows reported an average engagement increase of 30.5% due to coaching) while providing career-related benefits (e.g., fellows became more ambitious and optimistic about the impact they can have).
- We were excited about awarding $20,000 to the best projects.
- The total cost (including staff salary) of Future Academy was $229,726, which corresponds to ~$11,000 per fellow and ~$10,000 per Counterfactual Expected Career Contribution (CECC). The total FTE of the core staff was ~2.
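To make the arithmetic concrete, here is a minimal sketch of the CECC calculation from the Alice example above, together with the headline cost-effectiveness figures. The function and variable names are our own illustration, not Impact Academy’s actual tooling:

```python
# A minimal sketch (ours, not Impact Academy's internal tooling) of the
# CECC arithmetic and the cost-effectiveness figures quoted above.

def cecc(ecc_before: float, ecc_after: float, counterfactual_share: float) -> float:
    """Counterfactual Expected Career Contribution: the change in expected
    career contribution, scaled by the share of that change attributed
    to the program."""
    return (ecc_after - ecc_before) * counterfactual_share

# The imaginary fellow Alice: ECC 2 before, ECC 10 after,
# with 10% of the change attributed to the program.
print(cecc(ecc_before=2, ecc_after=10, counterfactual_share=0.1))  # 0.8

# Program-level cost-effectiveness, using the figures from the post.
total_cost = 229_726  # USD, including staff salary
n_fellows = 21
total_cecc = 22.8
print(round(total_cost / n_fellows))   # 10939 -> ~$11,000 per fellow
print(round(total_cost / total_cecc))  # 10076 -> ~$10,000 per CECC
```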
Main challenges and mistakes
- Impact evaluation presents multiple challenges. We hold all of these results with substantial uncertainty and worry that our conclusions could easily change within six months. Relatedly, the assessment of our control group could have been much more rigorous.
- Three fellows reported having had difficult experiences with one of the organizers - e.g., feeling that the organizer had been too pushy with some of the EA ideas. This made them feel uncomfortable and less inclined to engage with the effective altruism community. None of the fellows are hurting in any way - they shared the information to help Future Academy thrive. We took this feedback very seriously, solicited advice from around ten people, and considered a wide range of options for moving forward with the organizer (including whether the person should continue with us). We eventually decided to continue, as we trust the person’s ability to grow and because the person created a convincing development plan for improvement.
- We should have spent more time and resources on marketing and attracting applications to the program. We weren’t particularly systematic in the recruitment process, as multiple individuals were involved, and we want to set aside more time and resources for both marketing and application processing in the future. Even with a modest budget of $2,000, at least three participants found the program through paid advertising (“eyeballs can be bought”).
- The fellows didn’t get as much out of the Impact Projects and mentoring as we would’ve liked. For example, 3/18 didn’t hand in a project, only four fellows worked in groups, and some mentor–mentee pairings never seemed to take off. That said, compared to other programs (e.g., SERI), this still seems like a high rate of project completion.
- We needed fewer traditional lectures and more time for active learning and a flipped-classroom approach. Relatedly, the exercises prepared for the workshop elements were of varying quality - e.g., they hadn’t been test-run.
Conclusions for Impact Academy
Overall, we think Future Academy was a successful experiment: we were satisfied both with how we ran the program and with its outcomes. However, there was significant room for improvement, and we don’t want to run Future Academy in its exact form again. We’ve decided to run another version of Future Academy where we will continue to primarily target people who
- Have no or moderate knowledge of and engagement with EA/longtermism and
- Are from underserved regions and groups.
We’ll also update the program to reflect best practices within education, the science of learning, and other programs we think highly of. Finally, we’ll also explore the feasibility of targeting early-mid career professionals as the wider community seems to be very interested in individuals with 3+ years of experience.
You can learn more about the other project (an AI governance fellowship) we will be running here.
Recommendations for the EA community
Based on our experience with Future Academy, we think these recommendations might provide value to the EA community as a whole:
- More focus on recruiting from underserved communities (like Eastern and Southern Europe and specific regions in Africa and Asia) - it’s possible!
- Consider adding a component around well-being and general human development via coaching. As noted above, two of our four well-being metrics showed statistically significant increases with moderate-to-strong effect sizes, and coaching increased overall engagement with the program (fellows reported an average engagement increase of 30.5%) while providing career-related benefits (e.g., fellows became more ambitious and optimistic about the impact they can have). For readers who want to run a similar pre/post analysis, see the sketch after this list.
- Learn from educational best practices. Many things related to field-building can be modeled as education, from the design of programs (e.g., backward chaining from where you want fellows to end up) to impact evaluation (which can resemble grading hard-to-assess work like essays). See this excellent blog post by Michael Noetel for more.
- Be more rigorous and systematic about impact evaluation. For example, creating a baseline estimate or a simple control group by surveying top rejectees (with a $20 Amazon gift card as an incentive) led to a surprisingly high response rate. That said, impact evaluation is hard, and we’d be interested in exploring some form of external evaluation - for instance, multiple organizations could collectively hire someone to do it for them.
- Run more ambitious experiments, including experiments that are not EA-branded.
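As a pointer for the well-being recommendation above: we haven’t described our exact statistical procedure in this post, but a standard way to run such a pre/post analysis is a paired t-test with Cohen’s d as the effect size. A minimal sketch, with made-up scores rather than our actual data:

```python
# Minimal sketch of a pre/post well-being analysis: a paired t-test for
# significance and Cohen's d for effect size. The scores are made up for
# illustration - they are not Future Academy's actual data.
import numpy as np
from scipy import stats

before = np.array([5.0, 6.0, 4.5, 7.0, 5.5, 6.5, 5.0, 6.0])  # pre-program
after = np.array([6.5, 7.0, 5.5, 7.5, 6.5, 7.5, 6.0, 7.0])   # post-program

# Paired t-test: did the same fellows' scores change significantly?
t_stat, p_value = stats.ttest_rel(after, before)

# Cohen's d for paired samples: mean difference over SD of the differences.
diff = after - before
d = diff.mean() / diff.std(ddof=1)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {d:.2f}")
```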
Acknowledgments
We are incredibly grateful to everyone who supported us in delivering Future Academy. Our funders, who believed in the idea. Our speakers, who were eager to give some of their time and travel. Our fellows, who took a chance on us by being open-minded (and crazy enough) to join a completely new program. Our mentors, who guided the fellows while they were finishing their projects. Our own mentors (very much including Michael Noetel), for providing us with ongoing feedback and guidance and helping us set a high bar. The people who provided feedback on our evaluation (Anine Andresen, Henri Thunberg, Eirik Mofoss, Emil Wasteson, Vaidehi Agarwalla, Cian Mullarkey, Jamie Harris, Varun Agrawal, Cillian Crosson, Raphaëlle Cohen, and Toby Tremlett). Finally, the wider community of do-gooders who collaboratively provided input on everything from the program design to the impact evaluation. Thank you!
rileyharris @ 2023-07-09T23:22 (+7)
Great to see attempts to measure impact in such difficult areas. I'm wondering if there's a problem of attribution that looks like this (I'm not up to date on this discussion):
- An organisation like Future Academy or 80,000 Hours says "look, we probably got this person into a career in AI safety, which has a higher impact, and it cost us $x, so our cost-effectiveness is $x per probable career moved into AI safety".
- The person then does a training program, whose organisers say "we trained this person to do good work in AI safety, which allows them to have an impact, and it only cost us $y to run the program, so our cost-effectiveness is $y per impactful career in AI safety".
- The person then goes on to work at a research organisation, which says "we spent $z including salary and overheads on this researcher, and they produced a crucial-seeming alignment paper, so our cost-effectiveness is $z per crucial-seeming alignment paper".
When you account for this properly, it's clear that each of these estimates is too high, because part of the impact and cost has to be attributed elsewhere.
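To make the double-counting concrete, here's a toy sketch (all numbers invented): if three organisations each claim full credit for the same career, the career looks several times cheaper than the combined spend that actually produced it.

```python
# Toy illustration of the attribution problem (all numbers invented).
# One career produces 1.0 "unit" of impact, but three organisations
# each claim full credit for it.
career_impact = 1.0
costs = {"field-builder": 50_000, "training program": 30_000, "research org": 120_000}

# Naive accounting: each org divides its own cost by the full impact,
# so each reports a cost per career far below the true combined spend.
naive_cost_per_impact = {org: cost / career_impact for org, cost in costs.items()}
print(sum(costs.values()))    # 200000 - what the career actually cost in total
print(naive_cost_per_impact)  # each org implies it alone "bought" the career

# One fix: attribute shares of the impact that sum to 1. Each org's cost
# per attributed unit of impact rises, and the shares reconstruct the true
# combined cost: sum(shares[org] * adjusted[org]) == sum(costs.values()).
shares = {"field-builder": 0.2, "training program": 0.3, "research org": 0.5}
adjusted = {org: costs[org] / (career_impact * shares[org]) for org in costs}
print(adjusted)
```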
A few off-the-cuff thoughts:
- It seems there should be a more complicated, discounted measure of impact here for each organisation, one that takes into account the costs incurred at the other stages.
- It could well be the case that at each stage the impact is high enough to justify the program even at the discounted rate.
- This might be a misunderstanding of what you're actually doing, in which case I would be excited to learn that you (and similar organisations) already account for this!
- I don't mean to pick on any organisation in particular if no one is doing this - it's just a thought about how these measures could be improved in general.
SebastianSchmidt @ 2023-07-12T09:27 (+3)
Hi Riley,
Thanks a lot for your comment. I'll mainly speak to our (Impact Academy's) approach to impact evaluation, but I'll also share my impressions of the general landscape.
Our primary metric (*counterfactual* expected career contributions) explicitly attempts to take this into account. To give an example of how we roughly evaluate the impact:
Take an imaginary fellow, Alice. Before the intervention, based on our surveys and initial interactions, we expected that she might have an impactful career, but that she was unlikely to pursue a priority path based on IA principles. We rated her Expected Career Contribution (ECC) at 2. After the program, based on surveys and interactions, we rated her ECC at 10 because we saw that she was applying for a full-time junior role in a priority path guided by impartial altruism. We also asked her (and ourselves) to what extent that change was due to IA and estimated that to be 10%. To get our final Counterfactual Expected Career Contribution (CECC) for Alice, we subtract her initial ECC score of 2 from her final score of 10 to get 8, then multiply that difference by 0.1 to get the portion of the expected career contribution which we believe we are responsible for. The final score is 0.8 CECC. As a formula: (10 (ECC after the program) - 2 (ECC before the program)) × 0.1 (our counterfactual influence) = 0.8 CECC.
You can read more here: https://docs.google.com/document/d/1Pb1HeD362xX8UtInJtl7gaKNKYCDsfCybcoAdrWijWM/edit#heading=h.vqlyvfwc0v22
I have the sense that other orgs are quite careful about this too. E.g., 80,000 Hours seems to think that they only caused a relatively modest number of significant career changes, because they discovered that people had often updated significantly for reasons unrelated to 80,000 Hours.