Gideon Futerman's Quick takes

By Gideon Futerman @ 2025-03-02T14:26 (+6)

Gideon Futerman @ 2025-03-02T14:26 (+18)

I wish more work on digital minds focused on answering the following questions, rather than merely investigating how plausible it is that digital minds similar to current-day AIs could be sentient:

  1. What do good sets of scenarios for post-AGI governance need to look like to create good futures and avoid terrible ones (or whatever normative focus we want), assuming digital minds are the dominant moral patients going into the future? 1a) How does this differ depending on what sorts of things can be digital minds, e.g. whether sentient AIs are likely to arise 'by accident' from creating useful AIs (including ASI systems or sub-systems) vs. whether sentient AIs would have to be deliberately built? How do we deal with this trade-off?

  2. Which of these good sets of scenarios require certain actions to be taken pre-ASI development (actions beyond simply ensuring we don't all die)? And therefore, what actions would we ideally take now to help bring about such good futures? This includes, in my view, the question of what thicker concept of alignment than 'intent alignment', if any, we ought to use.

  3. Given the strategic, political, geopolitical and technological situation we are in, how, if at all, can we make concrete progress toward this? We obviously can't just 'do research' and hope this solves everything; rather, we ought to use research to guide specific actions that can have impact. I guess this step is rather hard to do without 1 and 2, but also, as far as I can tell, no one is really doing it?

I'm sure someone has expressed this same set of questions elsewhere, but I've not seen them yet, and at least to me they seem pretty neglected and important.

Ryan Greenblatt @ 2025-03-05T18:08 (+6)

I think work of the sort you're discussing isn't typically called digital minds work. I would just describe this as "trying to ensure better futures (from a scope-sensitive longtermist perspective) other than via avoiding AI takeover, human power grabs, or extinction (from some other source)".

This just incidentally ends up being about digital entities/beings/value because that's where the vast majority of the value probably lives.


The way you phrase (1) seems to imply that you think large fractions of expected moral value (in the long run) will be in the minds of laborers (AIs we created to be useful) rather than things intentionally created to provide value/disvalue. I'm skeptical.

Bradford Saad @ 2025-03-06T01:58 (+3)

I'd also like to see more work on digital minds macrostrategy questions such as 1-3. To that end, I'll take this opportunity to mention that the Future Impact Group is accepting applications for projects on digital minds (among other topics) through EoD on March 8 for its part-time fellowship program. I'm set to be a project lead for the upcoming cohort and would welcome applications from people who'd want to work with me on a digital minds macrostrategy project. (I suggest some possible projects here but am open to others.)

I think the other project leads listed for AI sentience are all great and would highly recommend applying to work with any of them on a digital minds project (though I'm unsure if any of them are open to macrostrategy projects).