Technoliberal's Quick takes

By Technoliberal @ 2026-05-14T17:13 (+1)

Technoliberal @ 2026-05-14T17:13 (+11)

I wanted to make this poll to see how the community views the speed/x-risk tradeoff. I'm personally 99% x-risk and 1% speed, so I would hard agree. My prediction is most people will agree, maybe a 70/30 split, but I'm curious to see.

Craig Green 🔸 @ 2026-05-14T19:04 (+5)

Initially I just calculated a naive expected value function and put 100% agree, but then I realized that I don't value realizing potential lives nearly as much as I value improving existing ones. While I do value realizing potential lives, the loss of them is not experienced by anyone other than present-day people like myself who think about them abstractly, which seems to me in sum to be less bad than the suffering otherwise avertible due to technological progress in the next 100 years. But I obviously haven't thought about this enough or I wouldn't have made my initial mistake.

Craig Green 🔸 @ 2026-05-14T19:56 (+3)

One thing I didn't consider in my revised answer is that I didn't actually do the math. Taking an existential event as literally causing the end of earth-originating life, the question is whether the difference in probability multiplied by the immediate mass extinction itself would represent more death and suffering than the avertible death and suffering occurring over a 100-year period. I just don't know. It seems unlikely that the avertible death and suffering amounts to as much as the amount caused by the mass-extinction event itself, but after multiplying by the difference in probability and acknowledging the ambiguity of the timeline proposed in this question, things become less clear. However, let's say that the probability-adjusted, undetermined-timing mass-extinction event does cause more suffering and death and I change my answer to 50% agree. I don't think this is what most people would interpret 50% agree to express.

I should also be clear that I'm taking the question to mean literally ending earth-originating life in more or less one fell swoop. Obviously, traditional x-risks actually have a spectrum of severity, so this is not so straightforward to apply to real-world resource allocation.
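To make the comparison above concrete, here is a minimal sketch of the naive calculation: the reduction in extinction probability multiplied by the harm of the extinction event, weighed against the harm that progress would have averted during the delay. The function names and all numbers are hypothetical placeholders in arbitrary "harm units", not estimates, and the sketch deliberately ignores the value of potential future lives and the ambiguity about timing.

```python
def expected_harm_averted(p_before: float, p_after: float, extinction_harm: float) -> float:
    """Expected harm avoided by lowering extinction probability from p_before to p_after."""
    return (p_before - p_after) * extinction_harm


def delay_is_worth_it(p_before: float, p_after: float,
                      extinction_harm: float, avertible_harm: float) -> bool:
    """True if the expected harm avoided exceeds the harm that progress
    would otherwise have averted during the delay (all inputs are assumptions)."""
    return expected_harm_averted(p_before, p_after, extinction_harm) > avertible_harm


# Placeholder inputs: extinction counts as 100 harm units,
# and a 100-year delay forgoes 30 units of avertible harm.
large_reduction = delay_is_worth_it(0.5, 0.1, 100.0, 30.0)  # 0.4 * 100 vs 30
small_reduction = delay_is_worth_it(0.5, 0.3, 100.0, 30.0)  # 0.2 * 100 vs 30
print(large_reduction, small_reduction)
```

With these made-up inputs the answer flips depending on how large the probability reduction is, which is why the answer depends so heavily on what "significantly reduce" means.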

Technoliberal @ 2026-05-14T21:22 (+1)

If I had to be more specific I would mean "reducing the probability of all humanity (and only humanity) dying in a few short days/weeks from 50% to 10%" by "significantly reduce existential risk".

Also, I disagree with your methods. X risks aren't especially bad because of all the utility lost (and "negative utility" created); they're bad because after they happen there's never any utility again. Unless apes re-evolve into humans and reestablish all of civilization all over again, but we're getting too hypothetical. What's 100, or even 1,000, years of death and suffering compared to 10,000 years of utopia? If stalling or slowing down technological progress for 1,000 years made P(Doom) go from 50% to 1%, I would definitely take it. Unless of course you think utopia is gonna be some short-lived thing, but I seriously doubt that.

Craig Green 🔸 @ 2026-05-14T22:32 (+5)

You are rightly grasping that we disagree, but I don't think you are understanding my view (and to be clear, reasonable people can disagree about this).

My wife and I are debating whether we will have more children or not. Having another child is desirable to us. So much so that she's willing to undergo the relatively risky process of childbirth to have another one. However, failing to have another child is significantly less bad than losing one of our existing children, IMO. I'd even say that failing to have 100 more children is significantly less bad than losing one of our existing children. The reason why is that the child who never existed is not sentient and so does not experience any deprivation. They do not suffer. And my suffering of that abstract loss is not nearly as bad as the suffering I would experience losing a living child whom I know.

Now you may disagree with that, and mourn all the lost utility, and that is a reasonable perspective, but it's not mine, and as you can see, this is a deeper philosophical difference and not some sort of misunderstanding about expected utility or something like that.

FYI, about this sentence: "X risks aren't especially bad because of all the utility lost ... they're bad because after they happen there's never any utility again." I don't really see a difference between these two statements.

Michael St Jules 🔸 @ 2026-05-15T00:12 (+5)

I agree with Craig here. I've written about problems with most conceptions of utility people use and describe alternatives that I think better match what Craig is saying in this sequence.

JoelMcGuire @ 2026-05-14T18:58 (+4)

I can’t respond because I don’t know what “significantly reduce” means. 0.01%? 10%?

Technoliberal @ 2026-05-14T20:53 (+1)

I would imagine "significantly reducing" as going from 50% to 10%, but I should have been clearer.

John Salter @ 2026-05-15T09:02 (+3)

I would be willing to delay technological innovation by up to 100 years to significantly reduce existential risk

I think the question is too imprecisely phrased to be answered precisely. When would the delay start? Over what time period would it be felt? (e.g. a 100% delay for 100 years is very different from a 1% delay over 10,000 years)

I'm thus giving a directional answer, assuming we're talking about whether seeking to dramatically reduce technological progress in exchange for safety is a feasible way to make the world a better place. I don't think it is, but I'm not sure.

My biggest gripe is that any attempt to reduce technological innovation dramatically would entail a bunch of side effects that would degrade the quality of existence (e.g. requiring authoritarianism, moving power from cooperators to defectors and from people less skilled at deception to people more skilled at it, and incentivising fighting for a larger slice of the pie instead of expanding it, as expanding it is far harder without improved technology).
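As a rough illustration of the phrasing problem raised above, the two delays in the example forgo the same nominal amount of progress even though they would be felt very differently. This is a minimal sketch; `years_of_progress_lost` is a hypothetical helper, and it assumes progress is simply linear in time, which is a strong simplification.

```python
def years_of_progress_lost(slowdown_fraction: float, duration_years: float) -> float:
    """Nominal years of progress forgone, assuming progress is linear in time."""
    return slowdown_fraction * duration_years


# A full stop for 100 years vs. a 1% slowdown spread over 10,000 years.
full_stop = years_of_progress_lost(1.0, 100)
slow_drag = years_of_progress_lost(0.01, 10_000)
print(full_stop, slow_drag)
```

Both come out to roughly a century of forgone progress under this assumption, so "delay by up to 100 years" alone does not distinguish between them.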

Linch @ 2026-05-21T00:36 (+2)

I would be willing to delay technological innovation by up to 100 years to significantly reduce existential risk

Seems like an easy choice as written. The devil's in the details re: practicality (e.g. under some models this would in practice increase x-risk substantially).

Technoliberal @ 2026-05-14T17:15 (+2)

Wrote a post about it, but the TL;DR is that extinction is THE worst-case scenario. It is the end of all utility and completely irreversible, whereas progress can always be made at a later date.

dan.pandori @ 2026-05-14T19:37 (+3)

S risks are a thing. There exist fates worse than death.

Technoliberal @ 2026-05-14T21:12 (+1)

That's fair, but I imagine X risks and S risks are very heavily correlated, especially with regard to "speed of progress". Accelerationism will, in my view, obviously increase X risks (safety research takes time; the more time you have, the more research gets done, and the lower the risk) but also increase S risks (this is more personal opinion, but I don't think the current leaders of AI innovation have stuff like animal welfare in mind. If we just keep chugging along, the first ASI might not care about animals at all).

dan.pandori @ 2026-05-14T19:37 (+1)

'significantly reduce' could mean a lot of things. I'm answering as if this reduces absolute X-risk by 20% or more over the next 10 centuries.

Technoliberal @ 2026-06-10T00:40 (+3)

I had a question. Why do all the AI safety companies seem to do the opposite of AI safety? Anthropic keeps publicly releasing models (which means they can be accessed by billions of people), and so does OpenAI. While these models are unlikely to cause major problems, if you're releasing a product that is going to be used by billions of people, you should make sure the product is around 99.9999% failure-proof. Anthropic themselves have said "AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities" when referring to Mythos. Now sure, Fable is claimed to be "safe for general use", and maybe it is, but why take the risk? Especially after only around 2-3 months of safety testing? I would want a company that claims to be for AI safety to always err on the side of caution, but this frankly seems quite reckless.

Guy Raveh @ 2026-06-10T13:39 (+2)

I still maintain that publicly releasing models is the correct way to get any chance of good alignment research - you can't possibly believe that the researchers at Anthropic alone are enough to tackle the problem. It's a global problem and should have the opportunity for the global population to solve it.

Technoliberal @ 2026-06-10T14:02 (+1)

I don't know, maybe eventually it could help, but with these "cutting edge" coding models doesn't it seem irresponsible? What if the safeguards don't work? Shouldn't you release the model publicly only after you've exhaustively patched every single possible jailbreak? (Even then I would argue it's still better not to release it, since billions of people means hundreds of thousands of bad actors, and again, as an AI safety company with "cutting edge" models I wouldn't take any risks.)

Guy Raveh @ 2026-06-11T22:29 (+2)

How can you "solve every possible jailbreak"? And is it worth crippling large-scale research into safeguarding against future AI because of fears about what the current models might be capable of?

(My own answer is "maybe". It depends on how bad you think current models are for society - pretty bad in my opinion - vs. how likely you think it is an existentially-threatening AI will actually be born out of the current efforts).

Technoliberal @ 2026-06-12T02:08 (+1)

You can't solve every possible jailbreak, but you should solve every jailbreak humanly possible if you're going to release an AI that is claimed to be almost superhuman at cyber skills. I think current models are mostly bad for society, but I also think there's a possibility that current models could achieve AGI. Maybe it's only a 4% chance, but again, why take the risk? What is there to gain (other than money)?

I don't understand how publicly releasing these models will help in researching AI safety (and when I say "AI safety" I mostly mean AGI alignment). I thought the whole point of an aligned AGI is that you don't have to tell it to do stuff correctly, it already knows what's correct, even more than you, so I don't see how letting anyone use the models will help in aligning them. I'm not an AI expert or anything, but to me it seems aligning AGI is less of a "we don't have enough data" problem and more of a "we don't even know where to start" problem.

Guy Raveh @ 2026-06-12T15:32 (+2)

We don't know how to align a possible AGI yet. The best we can hope for is that current models are close enough to whatever AGI is going to be, that trying to align them will teach us about aligning an AGI. This task, of trying to align them, is something that shouldn't just be left to researchers in AI companies.

David T @ 2026-06-12T16:48 (+2)

This task, of trying to align them, is something that shouldn't just be left to researchers in AI companies

In principle I agree.

But would you say that people's suitability to align AI safely (or, more specifically, to ensure that Fable does not write nasty software exploits) is defined less by their expertise and alignment with Anthropic's stated mission and more by how much money they can spend on credits?

Because that's what Anthropic and the impending IPO marketing are asking you to believe.

(tbh I'm not concerned by Fable manipulating its way into world domination. But if I was, I'd be extremely concerned that our most dedicated defenders against manipulative AI agents might be the sort of people who still take statements put out by AI companies at face value)

Technoliberal @ 2026-06-13T07:05 (+1)

This task, of trying to align them, is something that shouldn't just be left to researchers in AI companies.

Why? I would think an AI expert is much better suited to align a potential AGI than any common person. I just don't see how the common person could contribute to alignment. If anything, I can see how they would contribute to MISalignment (engineering better jailbreaks, using the models for nefarious purposes, giving the models "bad values" (like "cause as much damage as possible"), etc.). I think I value reducing existential risk above all else, and I can't imagine that publicly releasing "almost superhuman" models can decrease it.

Guy Raveh @ 2026-06-13T22:45 (+2)

But you're not claiming that the models should only be shared with AI researchers. You're claiming they should only be shared with AI researchers specifically employed by Anthropic.

Although no, I disagree that the input from non-AI-researchers is useless here - as you need to hear both from the end users and from people affected by AI and its decisions.

Technoliberal @ 2026-06-14T11:15 (+1)

I'm thinking more of the "endgame" here, so I think the input from non-researchers is no more valuable than the input of the researchers (as in, any useful information you could obtain about AI safety can be obtained just from the researchers alone). To be specific, I believe something along the lines of AI 2027 is gonna be the somewhat-near future, so I wanna restrict access to advanced models as much as possible.

Think of it like nuclear bombs. If you had a technology that powerful, you wouldn't want to risk any bad actors getting access to it, so you would limit the number of owners as much as possible. It would be pretty ridiculous to want private companies to be able to own, or even use, nuclear weapons, and I think the case is pretty similar for current and future AI.