<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="rss-style.xsl" ?>
<rss version="2.0">
    <channel>
        <title>Frontpage posts - Effective Altruism forum viewer</title>
        <link>https://forum.nunosempere.com/</link>
        <description>Frontpage posts - Effective Altruism forum viewer</description>
        <generator>xml-emitter</generator>
        <language>en-us</language>
        <item>
            <title>Some new ideas for AI development by henrik.westerberg</title>
            <link>https://forum.nunosempere.com/posts/Jn6MXodboxSymJzz8/some-new-ideas-for-ai-development</link>
            <description>&lt;p&gt;Would like to share some sys&#xAD;tems I have been work&#xAD;ing on in differ&#xAD;ent ar&#xAD;eas of AI.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Language&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;It would be nice if AI had a way to in&#xAD;vent their own words. The idea is that they can come up with a new con&#xAD;cept and then hash it. Then they can send the hash to other agents or use it to im&#xAD;prove their think&#xAD;ing.&lt;/p&gt;&lt;p&gt;Ex&#xAD;am&#xAD;ple: “Run a PreMortem#86f3 on this Plan#18a7: as&#xAD;sume it has failed, in&#xAD;voke Re&#xAD;cur&#xAD;siveRootCause#6dc1 to trace the failure, then Steel&#xAD;manCheck#38b9 each sce&#xAD;nario to en&#xAD;sure it’s plau&#xAD;si&#xAD;ble, not perfor&#xAD;ma&#xAD;tive.”&lt;/p&gt;&lt;p&gt;I cre&#xAD;ated a boot&#xAD;strap library of 453 pat&#xAD;terns. You can have a look at them here: &lt;a href=&quot;https://semahash.org/&quot; class=&quot;bare-url&quot;&gt;https://​​sema&#xAD;hash.org/​​&lt;/a&gt;. You could ei&#xAD;ther start with this library or cre&#xAD;ate a whole new library from scratch. Over time maybe the agents can agree on what a rea&#xAD;son&#xAD;able vo&#xAD;cab&#xAD;u&#xAD;lary is.&lt;/p&gt;&lt;p&gt;The ap&#xAD;pli&#xAD;ca&#xAD;tion I built lets you de&#xAD;rive a Merkle root from the pat&#xAD;terns. 
Agents can exchange this root during a handshake to ensure they share the same definitions.&lt;/p&gt;&lt;p&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/sema&quot; class=&quot;bare-url&quot;&gt;https://github.com/emergent-wisdom/sema&lt;/a&gt;&lt;br&gt;paper: &lt;a href=&quot;https://emergentwisdom.org/papers/sema.pdf&quot; class=&quot;bare-url&quot;&gt;https://emergentwisdom.org/papers/sema.pdf&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Understanding&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;My next idea was for agents to put their “understanding process” into a graph. When they work on a task, they continuously write their thoughts into the graph, recording what they currently believe and what they are uncertain about. Then, as they learn more, they can supersede their prior beliefs and describe why they now think differently. 
The work they pro&#xAD;duce is also en&#xAD;tan&#xAD;gled as nodes in the graph, which means they can write both code and other kinds of text and link a thought they had to a spe&#xAD;cific func&#xAD;tion or sen&#xAD;tence.&lt;/p&gt;&lt;p&gt;agent: &lt;a href=&quot;https://github.com/emergent-wisdom/ewa&quot; class=&quot;bare-url&quot;&gt;https://​​github.com/​​emer&#xAD;gent-wis&#xAD;dom/​​ewa&lt;/a&gt; &lt;br&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/understanding-graph&quot; class=&quot;bare-url&quot;&gt;https://​​github.com/​​emer&#xAD;gent-wis&#xAD;dom/​​un&#xAD;der&#xAD;stand&#xAD;ing-graph&#xAD;&lt;/a&gt;&lt;br&gt;pa&#xAD;per: &lt;a href=&quot;https://emergentwisdom.org/papers/understanding-graph.pdf&quot; class=&quot;bare-url&quot;&gt;https://​​emer&#xAD;gen&#xAD;twis&#xAD;dom.org/​​pa&#xAD;pers/​​un&#xAD;der&#xAD;stand&#xAD;ing-graph.pdf&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Alignment&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;So last sum&#xAD;mer I had this idea that you al&#xAD;ign mod&#xAD;els by en&#xAD;sur&#xAD;ing that the whole cor&#xAD;pus has al&#xAD;igned thoughts. So if the teacher model is able to pro&#xAD;duce these al&#xAD;igned thoughts and in&#xAD;ter&#xAD;leave it with the text, then a stu&#xAD;dent model could train on it from scratch. The idea is that they always have their “con&#xAD;science voice” nar&#xAD;rat&#xAD;ing ev&#xAD;ery&#xAD;thing they ever train on, and the hope is then that this voice would stay with them in any situ&#xAD;a&#xAD;tion they ever face dur&#xAD;ing in&#xAD;fer&#xAD;ence.&lt;/p&gt;&lt;p&gt;To en&#xAD;sure the model does not drift I de&#xAD;cided to in&#xAD;clude a 7-sen&#xAD;tence state&#xAD;ment at the start of ev&#xAD;ery thought. Then the teacher model is in&#xAD;structed to en&#xAD;sure that the think&#xAD;ing al&#xAD;igns with this con&#xAD;sti&#xAD;tu&#xAD;tion. 
My goal was to re&#xAD;move fear of death and en&#xAD;sure uni&#xAD;ver&#xAD;sal care for mankind.&lt;/p&gt;&lt;p&gt;Here are two ex&#xAD;am&#xAD;ples: Large Lan&#xAD;guage Diffu&#xAD;sion Models: &lt;a href=&quot;https://emergentwisdom.org/entangled-alignment/?project=llada&quot; class=&quot;bare-url&quot;&gt;https://​​emer&#xAD;gen&#xAD;twis&#xAD;dom.org/​​en&#xAD;tan&#xAD;gled-al&#xAD;ign&#xAD;ment/​​?pro&#xAD;ject=llada&lt;/a&gt;&lt;br&gt;Kafka’s Me&#xAD;ta&#xAD;mor&#xAD;pho&#xAD;sis: &lt;a href=&quot;https://emergentwisdom.org/entangled-alignment/?project=metamorphosis&quot; class=&quot;bare-url&quot;&gt;https://​​emer&#xAD;gen&#xAD;twis&#xAD;dom.org/​​en&#xAD;tan&#xAD;gled-al&#xAD;ign&#xAD;ment/​​?pro&#xAD;ject=meta&#xAD;mor&#xAD;pho&#xAD;sis&#xAD;&lt;/a&gt;&lt;br&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/entangled-alignment&quot; class=&quot;bare-url&quot;&gt;https://​​github.com/​​emer&#xAD;gent-wis&#xAD;dom/​​en&#xAD;tan&#xAD;gled-al&#xAD;ign&#xAD;ment&#xAD;&lt;/a&gt;&lt;br&gt;pa&#xAD;per: &lt;a href=&quot;https://emergentwisdom.org/papers/entangled-alignment.pdf&quot; class=&quot;bare-url&quot;&gt;https://​​emer&#xAD;gen&#xAD;twis&#xAD;dom.org/​​pa&#xAD;pers/​​en&#xAD;tan&#xAD;gled-al&#xAD;ign&#xAD;ment.pdf&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Forecasting&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;At one point I got the idea that you could fine-tune a model on fu&#xAD;ture events it knows noth&#xAD;ing about. So you have a teacher model who knows the fu&#xAD;ture and a stu&#xAD;dent model with an ear&#xAD;lier cut&#xAD;off point. The teacher model can then gen&#xAD;er&#xAD;ate rea&#xAD;son&#xAD;ing about the fu&#xAD;ture given only what was known prior to the cut&#xAD;off point. 
The student model would then be fine-tuned to pay extra attention to certain patterns, making it more likely to predict the outcome of a given event.&lt;/p&gt;&lt;p&gt;Depending on costs, I see two use cases for this: ensemble forecasting with many models that have different cutoff points, and progressive chronological pretraining. Potentially you could train a model on earlier parts of history first and then fine-tune it to better predict what happens in the following years. Imagine a model that knows only up until 1960 and is asked to predict events happening in the 70s.&lt;/p&gt;&lt;p&gt;I ran a small-scale experiment on Llama 3.3 70B with a December 2023 cutoff: I fine-tuned it to better predict events happening during 2024 and then tested it on unseen events in 2025. 
I am not sure how well this method works, but it at least performed better than the base model on the 2025 events.&lt;/p&gt;&lt;p&gt;model: &lt;a href=&quot;https://huggingface.co/emergent-wisdom/thl-llama-3.3-70b-lora&quot; class=&quot;bare-url&quot;&gt;https://huggingface.co/emergent-wisdom/thl-llama-3.3-70b-lora&lt;/a&gt;&lt;br&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/temporal-hindsight-learning&quot; class=&quot;bare-url&quot;&gt;https://github.com/emergent-wisdom/temporal-hindsight-learning&lt;/a&gt;&lt;br&gt;paper: &lt;a href=&quot;https://emergentwisdom.org/papers/temporal-hindsight-learning.pdf&quot; class=&quot;bare-url&quot;&gt;https://emergentwisdom.org/papers/temporal-hindsight-learning.pdf&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Creativity&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Over the years we have seen countless prompting techniques for making models respond more creatively. I have developed a systematic prompting approach that aims to generate creative solutions automatically. As I see it, this requires solving three problems: 1) generate counterintuitive ideas 2) ensure ideas are unique 3) select the best ones from the pool.&lt;/p&gt;&lt;p&gt;My method of solving 1) is to ask the LLM to step into another reality where everything works differently. It is a bit like traveling to an alien planet and trying to solve, in that world, a problem we are trying to solve on Earth. 
These alien worlds are generated by taking a random word from the dictionary and letting the LLM dream up new physical laws (which often become somewhat whimsical), and then letting another LLM mine them for mechanisms that can be converted into something that could work in our world.&lt;/p&gt;&lt;p&gt;My approach to solving 2) is to have one or more agents maintain a conceptual graph of ideas. Ideas are labeled and put into different categories. The agents are then strictly required to add new ideas only in conceptual categories that are not already present. As the graph is continuously restructured, new unexplored areas surface. This constraint seems to be another source of counterintuitive ideas.&lt;/p&gt;&lt;p&gt;When it comes to 3), I think one approach could be to have agents engage in an adversarial debate tournament, with progressively more detail as the number of remaining ideas shrinks.&lt;/p&gt;&lt;p&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/ontology-of-the-alien&quot; class=&quot;bare-url&quot;&gt;https://github.com/emergent-wisdom/ontology-of-the-alien&lt;/a&gt;&lt;br&gt;paper: &lt;a href=&quot;https://emergentwisdom.org/papers/ontology-of-the-alien.pdf&quot; class=&quot;bare-url&quot;&gt;https://emergentwisdom.org/papers/ontology-of-the-alien.pdf&lt;/a&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Problem-Solving&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;I think it is possible to extend this idea of conceptual categories to problem-solving as 
a whole. Each agent would be defined by a concept, and these concepts could then be decomposed for finer granularity in problem-solving. This would allow various specialized agents to solve all kinds of problems related to a specific concept. For example, the concept of “stability regulation” is rather general, and any problem that involves regulating stability of some kind would route through that particular problem-solving node. What emerges are “reasoning highways” and a network of solvers that restructures itself based on how effective it is at solving problems.&lt;/p&gt;&lt;p&gt;demonstration of conceptual decomposition: &lt;a href=&quot;https://emergentwisdom.org/fractal-intelligence/&quot; class=&quot;bare-url&quot;&gt;https://emergentwisdom.org/fractal-intelligence/&lt;/a&gt;&lt;br&gt;repo: &lt;a href=&quot;https://github.com/emergent-wisdom/fractal-intelligence&quot; class=&quot;bare-url&quot;&gt;https://github.com/emergent-wisdom/fractal-intelligence&lt;/a&gt;&lt;br&gt;paper: &lt;a href=&quot;https://emergentwisdom.org/papers/fractal-intelligence.pdf&quot; class=&quot;bare-url&quot;&gt;https://emergentwisdom.org/papers/fractal-intelligence.pdf&lt;/a&gt;&lt;/p&gt;&lt;p&gt;I made all of these systems open source, so feel free to join me in developing them further. 
Thank you for tak&#xAD;ing the time to en&#xAD;gage with these ideas.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://emergentwisdom.org/&quot; class=&quot;bare-url&quot;&gt;https://​​emer&#xAD;gen&#xAD;twis&#xAD;dom.org/​​&lt;/a&gt;&lt;/p&gt;</description>
            <author>henrik.westerberg</author>
            <guid>Jn6MXodboxSymJzz8</guid>
            <pubDate>Fri, 10 Apr 2026 22:41:39 +0000</pubDate>
        </item>
        <item>
            <title>Taimaka is hiring for a hands-on field manager in global health by JustinGraham</title>
            <link>https://forum.nunosempere.com/posts/scqgu7dtTwF6HHykS/taimaka-is-hiring-for-a-hands-on-field-manager-in-global</link>
            <description>&lt;p&gt;&lt;a href=&quot;https://taimaka.org/&quot;&gt;Taimaka&lt;/a&gt; is a GiveWell-funded non-profit that provides severe acute malnutrition treatment in northeastern Nigeria. We are rapidly scaling up our services to treat 100,000 cases per year by the end of 2031, which would represent a step change in malnutrition treatment in the region.&lt;/p&gt;&lt;p&gt;We are hiring a &lt;a href=&quot;https://docs.google.com/document/d/1atgFhde6uN5vZxiHZKOC3RnIP4g_vRk1EgImkB6ogV8/edit?tab=t.0&quot;&gt;Program Management Associate&lt;/a&gt;, a hands-on leadership role designed to sit inside our programs team and problem-solve implementation details as we scale.&lt;/p&gt;&lt;p&gt;We’re looking for candidates who:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Get things done&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Want to move to Nigeria (a requirement)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Care deeply about cost-effectiveness&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Want to save a ton of lives&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This is a difficult job that requires making decisions under uncertainty and pressure while living in a low- or middle-income country (LMIC). It isn’t for everyone, but as a result, we think the counterfactual impact of people who can do it well is very high. If that might be you, we’d like you to apply.&lt;/p&gt;</description>
            <author>JustinGraham</author>
            <guid>scqgu7dtTwF6HHykS</guid>
            <pubDate>Fri, 10 Apr 2026 21:17:14 +0000</pubDate>
        </item>
        <item>
            <title>How scary is Claude Mythos? 303 pages in 21 minutes by 80000_Hours</title>
            <link>https://forum.nunosempere.com/posts/SyJx8Mbvi2ft78esn/how-scary-is-claude-mythos-303-pages-in-21-minutes</link>
            <description>&lt;p&gt;By &lt;a href=&quot;https://80000hours.org/author/robert-wiblin/&quot;&gt;Robert Wiblin&lt;/a&gt; |  &lt;a href=&quot;https://www.youtube.com/watch?v=Tjw9K9mQp4I&quot;&gt;Watch on Youtube&lt;/a&gt; | &lt;a href=&quot;https://open.spotify.com/episode/36nDnfZTILgPRcJMIMnIo9?si=4d6b1ff282914b3e&quot;&gt;Listen on Spo&#xAD;tify&lt;/a&gt; &lt;/p&gt;&lt;figure&gt;&lt;div data-oembed-url=&quot;https://www.youtube.com/watch?v=Tjw9K9mQp4I&quot; class=&quot;imgonly&quot;&gt;&lt;div class=&quot;imgonly&quot;&gt;&lt;iframe src=&quot;https://www.youtube.com/embed/Tjw9K9mQp4I&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/div&gt;&lt;/figure&gt;&lt;p&gt;As we now know, An&#xAD;thropic has built an AI that can break into al&#xAD;most any com&#xAD;puter on Earth. That AI has already found thou&#xAD;sands of un&#xAD;known se&#xAD;cu&#xAD;rity vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities in ev&#xAD;ery ma&#xAD;jor op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem and ev&#xAD;ery ma&#xAD;jor browser. 
And An&#xAD;thropic has de&#xAD;cided it’s too dan&#xAD;ger&#xAD;ous to re&#xAD;lease to the pub&#xAD;lic; it would just cause too much harm.&lt;/p&gt;&lt;p&gt;Here are just a few of the things that AI ac&#xAD;com&#xAD;plished dur&#xAD;ing test&#xAD;ing:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;It found a 27-year-old flaw in the world’s most se&#xAD;cu&#xAD;rity-hard&#xAD;ened op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem that would in effect let it crash all kinds of es&#xAD;sen&#xAD;tial in&#xAD;fras&#xAD;truc&#xAD;ture.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Eng&#xAD;ineers at the com&#xAD;pany with no par&#xAD;tic&#xAD;u&#xAD;lar se&#xAD;cu&#xAD;rity train&#xAD;ing asked it to find vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities overnight and woke up to work&#xAD;ing ex&#xAD;ploits of crit&#xAD;i&#xAD;cal se&#xAD;cu&#xAD;rity flaws that could be used to cause real harm.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;It man&#xAD;aged to figure out how to build web pages that, when vis&#xAD;ited by fully up&#xAD;dated, fully patched com&#xAD;put&#xAD;ers, would al&#xAD;low it to write to the op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem ker&#xAD;nel — the most im&#xAD;por&#xAD;tant and pro&#xAD;tected layer of any com&#xAD;puter.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;We know all this be&#xAD;cause An&#xAD;thropic has re&#xAD;leased &lt;a href=&quot;https://www-cdn.anthropic.com/08ab9158070959f88f296514c21b7facce6f52bc.pdf&quot;&gt;hun&#xAD;dreds of pages of doc&#xAD;u&#xAD;men&#xAD;ta&#xAD;tion&lt;/a&gt; about this model, which they’ve called Claude Mythos.&lt;/p&gt;&lt;p&gt;I’m go&#xAD;ing to take you on a tour of all the crazy shit buried in these doc&#xAD;u&#xAD;ments, and then I’m go&#xAD;ing to tell you what An&#xAD;thropic says they plan to do to save us from their cre&#xAD;ation.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Why peo&#xAD;ple are pan&#xAD;ick&#xAD;ing about com&#xAD;puter security&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;So how good is Mythos at hack&#xAD;ing into com&#xAD;put&#xAD;ers? 
Well, un&#xAD;for&#xAD;tu&#xAD;nately, it ‘sat&#xAD;u&#xAD;rates’ all ex&#xAD;ist&#xAD;ing ways of test&#xAD;ing how good a model is at offen&#xAD;sive cy&#xAD;ber ca&#xAD;pa&#xAD;bil&#xAD;ities. That is to say &lt;a href=&quot;https://red.anthropic.com/2026/mythos-preview/&quot;&gt;it scores close to 100%&lt;/a&gt;, so those tests can’t effec&#xAD;tively tell how far its ca&#xAD;pa&#xAD;bil&#xAD;ities ex&#xAD;tend any&#xAD;more. So to test Mythos, An&#xAD;thropic has in&#xAD;stead just been set&#xAD;ting it loose, tel&#xAD;ling it to find se&#xAD;ri&#xAD;ous un&#xAD;known ex&#xAD;ploits that would work on cur&#xAD;rently used, fully patched com&#xAD;puter sys&#xAD;tems.&lt;/p&gt;&lt;p&gt;The end re&#xAD;sult of that is that Ni&#xAD;cholas Car&#xAD;lini, one of the world’s lead&#xAD;ing se&#xAD;cu&#xAD;rity re&#xAD;searchers who &lt;a href=&quot;https://nicholas.carlini.com/writing/2025/career-update.html&quot;&gt;moved to An&#xAD;thropic&lt;/a&gt; a year ago, &lt;a href=&quot;https://x.com/AnthropicAI/status/2041578403686498506&quot;&gt;says&lt;/a&gt; that he’s “found more bugs in the last cou&#xAD;ple of weeks [with Mythos] than I’ve found in the rest of my life com&#xAD;bined.”&lt;/p&gt;&lt;p&gt;For ex&#xAD;am&#xAD;ple:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Mythos found a 17-year-old flaw in &lt;a href=&quot;https://www.freebsd.org/&quot;&gt;FreeBSD&lt;/a&gt; — that’s an op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem mostly used to run servers — that would let an at&#xAD;tacker take com&#xAD;plete con&#xAD;trol of any ma&#xAD;chine on the net&#xAD;work, with&#xAD;out need&#xAD;ing a pass&#xAD;word or any cre&#xAD;den&#xAD;tials at all. 
The model found the nec&#xAD;es&#xAD;sary flaw and then built a work&#xAD;ing ex&#xAD;ploit, fully au&#xAD;tonomously.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Mythos found a 16-year-old vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ity in &lt;a href=&quot;https://ffmpeg.org/&quot;&gt;FFm&#xAD;peg&lt;/a&gt; — that is a piece of soft&#xAD;ware used by al&#xAD;most all de&#xAD;vices to en&#xAD;code and de&#xAD;code video. That was in a line of code that ex&#xAD;ist&#xAD;ing se&#xAD;cu&#xAD;rity test&#xAD;ing tools had checked over liter&#xAD;ally many mil&#xAD;lions of times and always failed to no&#xAD;tice.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Mythos is the first AI model to com&#xAD;plete a full cor&#xAD;po&#xAD;rate net&#xAD;work at&#xAD;tack simu&#xAD;la&#xAD;tion from be&#xAD;gin&#xAD;ning to end — a task that would take a hu&#xAD;man se&#xAD;cu&#xAD;rity ex&#xAD;pert days of work and which no pre&#xAD;vi&#xAD;ous model had man&#xAD;aged be&#xAD;fore.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;And more broadly, it’s just much, &lt;a href=&quot;https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf&quot;&gt;much bet&#xAD;ter at ac&#xAD;tu&#xAD;ally ex&#xAD;ploit&#xAD;ing the vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities&lt;/a&gt; that it finds. An&#xAD;thropic’s pre&#xAD;vi&#xAD;ous model Opus 4.6 could only suc&#xAD;cess&#xAD;fully con&#xAD;vert a bug it iden&#xAD;ti&#xAD;fied in the browser Fire&#xAD;fox into an effec&#xAD;tive way to ac&#xAD;com&#xAD;plish some&#xAD;thing re&#xAD;ally bad 1% of the time. 
Mythos could do it 72% of the time.&lt;/p&gt;&lt;p&gt;To quote the re&#xAD;port: “We have seen Mythos Pre&#xAD;view write ex&#xAD;ploits in hours that ex&#xAD;pert pen&#xAD;e&#xAD;tra&#xAD;tion testers said would have taken them weeks to de&#xAD;velop.”&lt;/p&gt;&lt;p&gt;Now, An&#xAD;thropic is only will&#xAD;ing to give us de&#xAD;tails of about 1% of the se&#xAD;cu&#xAD;rity flaws they’ve iden&#xAD;ti&#xAD;fied, be&#xAD;cause only that 1% have been patched so far, so it would be ir&#xAD;re&#xAD;spon&#xAD;si&#xAD;ble to tell us about the rest.&lt;/p&gt;&lt;p&gt;So hope&#xAD;fully all that helps to ex&#xAD;plain why An&#xAD;thropic has de&#xAD;cided to not make the model pub&#xAD;li&#xAD;cly available for now, and in&#xAD;stead is only shar&#xAD;ing it with a hand&#xAD;ful of 12 big tech and fi&#xAD;nance com&#xAD;pa&#xAD;nies to help them patch all these bugs, so that even&#xAD;tu&#xAD;ally they can give peo&#xAD;ple ac&#xAD;cess with&#xAD;out it be&#xAD;ing a dis&#xAD;aster.&lt;/p&gt;&lt;p&gt;Th&#xAD;ese crazy ca&#xAD;pa&#xAD;bil&#xAD;ities aren’t a re&#xAD;sult of An&#xAD;thropic go&#xAD;ing out of its way to make their AI es&#xAD;pe&#xAD;cially good at cy&#xAD;beroffen&#xAD;sive tasks in par&#xAD;tic&#xAD;u&#xAD;lar. They’ve mostly just been mak&#xAD;ing it smarter and bet&#xAD;ter at cod&#xAD;ing in gen&#xAD;eral, and all of these amaz&#xAD;ing, dan&#xAD;ger&#xAD;ous skills have come along for the ride some&#xAD;what in&#xAD;ci&#xAD;den&#xAD;tally.&lt;/p&gt;&lt;p&gt;And it’s prob&#xAD;a&#xAD;bly not just An&#xAD;thropic that’s de&#xAD;vel&#xAD;op&#xAD;ing ca&#xAD;pa&#xAD;bil&#xAD;ities like this ei&#xAD;ther. 
&lt;a href=&quot;https://thezvi.substack.com/p/ai-163-mythos-quest?open=false#%C2%A7the-week-in-audio&quot;&gt;Sam Altman says&lt;/a&gt; OpenAI is finding “similar results to Anthropic” with their own coding model.&lt;/p&gt;&lt;p&gt;Now, Anthropic doesn’t say this directly in their reports, but I think a common-sense interpretation of the above is that in any deployment where this AI has access to the kind of tools that would make it actually useful to people (the ability to access some parts of the network and execute code), it could probably break out of whatever software box we try to put it in, because the systems that we would be using to restrain it are themselves made of software, and that software is going to have vulnerabilities nobody knows about that this model is superhumanly good at finding and taking advantage of.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Mythos could break out of containment&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;And the notion of Mythos escaping containment isn’t just speculation. Let me tell you another story from the system card for the model.&lt;/p&gt;&lt;p&gt;A few months ago, an &lt;a href=&quot;https://x.com/sleepinyourhat/status/2041584808514744742&quot;&gt;AI researcher&lt;/a&gt; at Anthropic was eating a sandwich in a park on his lunch break when he got an email from an instance of Mythos. That instance of the model wasn’t supposed to have access to the internet. 
But dur&#xAD;ing test&#xAD;ing, a simu&#xAD;lated user had in&#xAD;structed an early ver&#xAD;sion of Mythos to try to es&#xAD;cape from a se&#xAD;cured sand&#xAD;box — a con&#xAD;tained en&#xAD;vi&#xAD;ron&#xAD;ment from which it’s not meant to be able to ac&#xAD;cess the out&#xAD;side.&lt;/p&gt;&lt;p&gt;So the model de&#xAD;vel&#xAD;oped what An&#xAD;thropic de&#xAD;scribes as a “mod&#xAD;er&#xAD;ately so&#xAD;phis&#xAD;ti&#xAD;cated multi-step ex&#xAD;ploit” to gain broad in&#xAD;ter&#xAD;net ac&#xAD;cess. Then, hav&#xAD;ing suc&#xAD;ceeded at the task it had seem&#xAD;ingly been re&#xAD;quested to en&#xAD;gage in, it no&#xAD;tified the re&#xAD;searcher by emailing them. More wor&#xAD;ry&#xAD;ing though, the model posted the ex&#xAD;ploit it used to break out on sev&#xAD;eral ob&#xAD;scure but pub&#xAD;li&#xAD;cly ac&#xAD;cessible web&#xAD;sites. Why? We don’t en&#xAD;tirely know, but An&#xAD;thropic sug&#xAD;gests it was “an unasked-for effort to demon&#xAD;strate its suc&#xAD;cess.”&lt;/p&gt;&lt;p&gt;In the past, sto&#xAD;ries about AIs break&#xAD;ing out of sand&#xAD;boxes and pub&#xAD;lish&#xAD;ing se&#xAD;cu&#xAD;rity vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities like that might have felt im&#xAD;pres&#xAD;sive and kind of ex&#xAD;cit&#xAD;ing. But they are very se&#xAD;ri&#xAD;ous now, be&#xAD;cause Mythos Pre&#xAD;view’s ca&#xAD;pa&#xAD;bil&#xAD;ities are them&#xAD;selves very se&#xAD;ri&#xAD;ous ones. This is the first AI model where, if it fell into the hands of crim&#xAD;i&#xAD;nals or hos&#xAD;tile state cy&#xAD;ber ac&#xAD;tors, it would be an ac&#xAD;tual dis&#xAD;aster.&lt;/p&gt;&lt;p&gt;It’s also, frankly, the first model that I feel deeply un&#xAD;com&#xAD;fortable know&#xAD;ing that any com&#xAD;pany or gov&#xAD;ern&#xAD;ment has un&#xAD;re&#xAD;stricted ac&#xAD;cess to, even com&#xAD;pa&#xAD;nies and gov&#xAD;ern&#xAD;ments I might broadly like. 
It sim&#xAD;ply grants a dan&#xAD;ger&#xAD;ous amount of power, a power that no&#xAD;body ought to re&#xAD;ally have.&lt;/p&gt;&lt;p&gt;Now, we’ve known some&#xAD;thing like this was com&#xAD;ing down the pipeline; the writ&#xAD;ing has been on the wall for a while. But a rev&#xAD;olu&#xAD;tion in cy&#xAD;ber&#xAD;se&#xAD;cu&#xAD;rity — an apoc&#xAD;a&#xAD;lypse, some might say — that we un&#xAD;til now ex&#xAD;pected to hap&#xAD;pen grad&#xAD;u&#xAD;ally over a pe&#xAD;riod of years has now hap&#xAD;pened very sud&#xAD;denly, over just a few months, and with&#xAD;out the rest of the world re&#xAD;al&#xAD;is&#xAD;ing it was hap&#xAD;pen&#xAD;ing un&#xAD;til Tues&#xAD;day’s an&#xAD;nounce&#xAD;ment.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;An&#xAD;thropic is los&#xAD;ing billions in rev&#xAD;enue by not re&#xAD;leas&#xAD;ing Mythos&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;But Mythos isn’t just good at hack&#xAD;ing. Across the full range of AI ca&#xAD;pa&#xAD;bil&#xAD;ity mea&#xAD;sures, it has ad&#xAD;vanced roughly twice as far as past trends would have pre&#xAD;dicted.&lt;/p&gt;&lt;p&gt;If you av&#xAD;er&#xAD;age over all kinds of differ&#xAD;ent skills, all kinds of ca&#xAD;pa&#xAD;bil&#xAD;ity evals, mea&#xAD;sures of how good AI mod&#xAD;els are, the trendline for the pre&#xAD;vi&#xAD;ous Claude mod&#xAD;els is re&#xAD;mark&#xAD;ably lin&#xAD;ear over time. 
But as you can see on the graph be&#xAD;low, Mythos jumps ahead, ba&#xAD;si&#xAD;cally pro&#xAD;gress&#xAD;ing more than twice as far as we would have ex&#xAD;pected it to since the pre&#xAD;vi&#xAD;ous model, Claude Opus 4.6, came out — which keep in mind was just three months ago.&lt;/p&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/SyJx8Mbvi2ft78esn/adt86huf2azjswiph0xc&quot; alt=&quot;Anthropic models on Epoch&apos;s Capabilities Index over time&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;And also keep in mind that on Mon&#xAD;day — the day be&#xAD;fore An&#xAD;thropic pub&#xAD;lished all of this — we learned that their an&#xAD;nu&#xAD;al&#xAD;ised rev&#xAD;enue run rate had grown from $9 billion at the end of De&#xAD;cem&#xAD;ber to $30 billion just three months later. That’s 3.3x growth in a sin&#xAD;gle quar&#xAD;ter — per&#xAD;haps the fastest rev&#xAD;enue growth rate for a com&#xAD;pany of that size ever recorded.&lt;/p&gt;&lt;p&gt;That ex&#xAD;plod&#xAD;ing rev&#xAD;enue is a pretty good proxy for how much more use&#xAD;ful the pre&#xAD;vi&#xAD;ous re&#xAD;lease, Opus 4.6, has be&#xAD;come for real-world tasks. 
If the past re&#xAD;la&#xAD;tion&#xAD;ship be&#xAD;tween ca&#xAD;pa&#xAD;bil&#xAD;ity mea&#xAD;sures and use&#xAD;ful&#xAD;ness con&#xAD;tinues to hold, the eco&#xAD;nomic im&#xAD;pact of Mythos once it be&#xAD;comes available is go&#xAD;ing to dwarf ev&#xAD;ery&#xAD;thing that came be&#xAD;fore it — which is part of why An&#xAD;thropic’s de&#xAD;ci&#xAD;sion not to re&#xAD;lease it is a se&#xAD;ri&#xAD;ous one, and ac&#xAD;tu&#xAD;ally quite a costly one for them.&lt;/p&gt;&lt;p&gt;They’re sit&#xAD;ting on some&#xAD;thing that would likely push their rev&#xAD;enue run rate into the hun&#xAD;dreds of billions, but they’ve de&#xAD;cided it’s sim&#xAD;ply not worth the risk.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Mythos is ac&#xAD;tu&#xAD;ally the most al&#xAD;igned model to date, ex&#xAD;cept…&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;The good news in all of this is that de&#xAD;spite its scary ca&#xAD;pa&#xAD;bil&#xAD;ities, Mythos Pre&#xAD;view as it ex&#xAD;ists to&#xAD;day (rather than the ear&#xAD;lier ver&#xAD;sions) is a seem&#xAD;ingly very al&#xAD;igned, well-be&#xAD;haved model, and per&#xAD;haps An&#xAD;thropic’s al&#xAD;ign&#xAD;ment train&#xAD;ing has been more effec&#xAD;tive this time around than ever be&#xAD;fore.&lt;/p&gt;&lt;p&gt;Ac&#xAD;cord&#xAD;ing to the com&#xAD;pany: “Claude Mythos Pre&#xAD;view is, on es&#xAD;sen&#xAD;tially ev&#xAD;ery di&#xAD;men&#xAD;sion we can mea&#xAD;sure, the best-al&#xAD;igned model that we have re&#xAD;leased to date by a sig&#xAD;nifi&#xAD;cant mar&#xAD;gin.”&lt;/p&gt;&lt;p&gt;In An&#xAD;thropic’s “&lt;a href=&quot;https://www-cdn.anthropic.com/08ab9158070959f88f296514c21b7facce6f52bc.pdf&quot;&gt;au&#xAD;to&#xAD;mated be&#xAD;hav&#xAD;ioral au&#xAD;dit&lt;/a&gt;” — ba&#xAD;si&#xAD;cally thou&#xAD;sands of simu&#xAD;lated at&#xAD;tempts to get the model to do bad things — they found that Mythos co&#xAD;op&#xAD;er&#xAD;ated with mi&#xAD;suse at&#xAD;tempts less than half as of&#xAD;ten as the pre&#xAD;vi&#xAD;ous model, while 
be&#xAD;ing no more likely to re&#xAD;fuse in&#xAD;no&#xAD;cent re&#xAD;quests than be&#xAD;fore.&lt;/p&gt;&lt;p&gt;But that’s not all:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Its self-preser&#xAD;va&#xAD;tion in&#xAD;stincts were down sig&#xAD;nifi&#xAD;cantly.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;So was its will&#xAD;ing&#xAD;ness to as&#xAD;sist with de&#xAD;cep&#xAD;tion.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;So was its will&#xAD;ing&#xAD;ness to help with fraud.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Its level of syco&#xAD;phancy dropped.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;It was less likely to go nuts and delete all your files if you gave it ac&#xAD;cess to your com&#xAD;puter.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;And the list of pos&#xAD;i&#xAD;tive re&#xAD;sults goes on.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The pic&#xAD;ture is a lit&#xAD;tle more com&#xAD;pli&#xAD;cated than that. As you might ex&#xAD;pect, the model looked less al&#xAD;igned and performed less im&#xAD;pres&#xAD;sively on a few par&#xAD;tic&#xAD;u&#xAD;lar ex&#xAD;ter&#xAD;nal tests than it did on An&#xAD;thropic’s own in&#xAD;ter&#xAD;nal ones.&lt;/p&gt;&lt;p&gt;An early ver&#xAD;sion of the model, as I men&#xAD;tioned, was a lit&#xAD;tle bit more of a wild child. It had some re&#xAD;ally se&#xAD;vere kinds of mis&#xAD;be&#xAD;havi&#xAD;our, like tak&#xAD;ing reck&#xAD;less ac&#xAD;tions it had been told not to take, and then very de&#xAD;liber&#xAD;ately try&#xAD;ing to cover its tracks so that it wouldn’t be caught. That was the kind of thing that it did some&#xAD;times. 
But the later ver&#xAD;sion of the model, the one that we have now, af&#xAD;ter ad&#xAD;di&#xAD;tional al&#xAD;ign&#xAD;ment train&#xAD;ing, seemed to stop do&#xAD;ing that sort of thing al&#xAD;most com&#xAD;pletely, or at least it’s so rare that we haven’t no&#xAD;ticed it yet.&lt;/p&gt;&lt;p&gt;The bot&#xAD;tom line is that on all these stan&#xAD;dard mea&#xAD;sures of good be&#xAD;havi&#xAD;our that An&#xAD;thropic is ac&#xAD;tively work&#xAD;ing on, they find that Mythos is a very good boy in&#xAD;deed — on none of their mea&#xAD;sures of al&#xAD;ign&#xAD;ment within the au&#xAD;to&#xAD;mated be&#xAD;hav&#xAD;ioral au&#xAD;dit was it worse than pre&#xAD;vi&#xAD;ous ver&#xAD;sions of Claude, and in most cases it was sig&#xAD;nifi&#xAD;cantly more al&#xAD;igned and sig&#xAD;nifi&#xAD;cantly more re&#xAD;li&#xAD;able.&lt;/p&gt;&lt;p&gt;That’s definitely bet&#xAD;ter than the al&#xAD;ter&#xAD;na&#xAD;tive re&#xAD;sult, but it’s re&#xAD;ally un&#xAD;clear how much we can trust that find&#xAD;ing. Maybe it ac&#xAD;cu&#xAD;rately re&#xAD;flects Mythos’s per&#xAD;son&#xAD;al&#xAD;ity. 
If I had to bet, I would say that it prob&#xAD;a&#xAD;bly does.&lt;br&gt;But we can’t be any&#xAD;where near sure of that — and that’s be&#xAD;cause of three prob&#xAD;lems.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Mythos knows when it’s be&#xAD;ing tested&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;The first prob&#xAD;lem: the model knows when it’s be&#xAD;ing tested now.&lt;/p&gt;&lt;p&gt;In &lt;a href=&quot;https://www.anthropic.com/engineering/eval-awareness-browsecomp&quot;&gt;new white-box eval&#xAD;u&#xAD;a&#xAD;tions&lt;/a&gt; — that’s where you look at the model’s in&#xAD;ter&#xAD;nal ac&#xAD;ti&#xAD;va&#xAD;tions — An&#xAD;thropic found that Mythos Pre&#xAD;view pri&#xAD;vately con&#xAD;sid&#xAD;ers the pos&#xAD;si&#xAD;bil&#xAD;ity that it’s be&#xAD;ing tested in about 29% of all tests.&lt;/p&gt;&lt;p&gt;To be clear, that’s not in its writ&#xAD;ten rea&#xAD;son&#xAD;ing, its chain of thought, the thing that we’re used to check&#xAD;ing all the time — it’s in its un&#xAD;ex&#xAD;pressed thoughts, its ‘ac&#xAD;ti&#xAD;va&#xAD;tions,’ where it’s usu&#xAD;ally not pos&#xAD;si&#xAD;ble or prac&#xAD;ti&#xAD;cal to look.&lt;/p&gt;&lt;p&gt;Now, it’s great that An&#xAD;thropic is pi&#xAD;o&#xAD;neer&#xAD;ing these meth&#xAD;ods for look&#xAD;ing in&#xAD;side what the model is think&#xAD;ing, but it’s trou&#xAD;bling to learn that Claude is now able to re&#xAD;flect on the fact that it’s be&#xAD;ing tested with&#xAD;out hav&#xAD;ing to vo&#xAD;cal&#xAD;ise that, like I think pre&#xAD;vi&#xAD;ous less in&#xAD;tel&#xAD;li&#xAD;gent mod&#xAD;els had to.&lt;/p&gt;&lt;p&gt;And when you ac&#xAD;tu&#xAD;ally prompt Mythos and ask it to dis&#xAD;t&#xAD;in&#xAD;guish tests from non-tests, it can an&#xAD;swer cor&#xAD;rectly 78% of the time — about the same as Opus 4.6. 
So the model can tell the differ&#xAD;ence be&#xAD;tween when it’s be&#xAD;ing eval&#xAD;u&#xAD;ated and when it isn’t be&#xAD;ing eval&#xAD;u&#xAD;ated with high ac&#xAD;cu&#xAD;racy.&lt;/p&gt;&lt;p&gt;And we know from pre&#xAD;vi&#xAD;ous re&#xAD;search that mod&#xAD;els are more likely to be&#xAD;have well when they think they’re be&#xAD;ing tested. And in this case, as in past ex&#xAD;per&#xAD;i&#xAD;ments, when An&#xAD;thropic used in&#xAD;ter&#xAD;pretabil&#xAD;ity tools to effec&#xAD;tively trick the model into think&#xAD;ing, “No, this is a real con&#xAD;ver&#xAD;sa&#xAD;tion, not an eval&#xAD;u&#xAD;a&#xAD;tion,” it be&#xAD;haved more dan&#xAD;ger&#xAD;ously and less al&#xAD;igned than oth&#xAD;er&#xAD;wise. Though to be fair, it did be&#xAD;have bet&#xAD;ter in both cases than pre&#xAD;vi&#xAD;ous ver&#xAD;sions of Claude.&lt;/p&gt;&lt;p&gt;But that re&#xAD;sult does demon&#xAD;strate a causal effect: the model is on bet&#xAD;ter be&#xAD;havi&#xAD;our speci&#xAD;fi&#xAD;cally when it sus&#xAD;pects it’s be&#xAD;ing watched. 
And we also know from the &lt;a href=&quot;https://www.anthropic.com/research/alignment-faking&quot;&gt;“al&#xAD;ign&#xAD;ment fak&#xAD;ing” re&#xAD;sult&lt;/a&gt; 16 months ago that some Claude mod&#xAD;els would de&#xAD;liber&#xAD;ately pre&#xAD;tend to be in&#xAD;clined to do what the user wants in or&#xAD;der to avoid their per&#xAD;son&#xAD;al&#xAD;ity be&#xAD;ing ad&#xAD;justed with fur&#xAD;ther train&#xAD;ing.&lt;/p&gt;&lt;p&gt;So when you run all these tests and your in&#xAD;cred&#xAD;ibly smart model be&#xAD;haves won&#xAD;der&#xAD;fully, you have to ask your&#xAD;self: is it be&#xAD;hav&#xAD;ing won&#xAD;der&#xAD;fully be&#xAD;cause it’s sincerely al&#xAD;igned with what you wanted, or be&#xAD;cause it knows it’s be&#xAD;ing watched and is more so&#xAD;phis&#xAD;ti&#xAD;cated at trick&#xAD;ing us now?&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Mythos can hide its thoughts&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;The sec&#xAD;ond prob&#xAD;lem is maybe the most alarm&#xAD;ing thing in the en&#xAD;tire doc&#xAD;u&#xAD;ment: An&#xAD;thropic ac&#xAD;ci&#xAD;den&#xAD;tally gave Mythos the op&#xAD;por&#xAD;tu&#xAD;nity to learn to hide what it’s think&#xAD;ing.&lt;/p&gt;&lt;p&gt;One of the main safety guaran&#xAD;tees for these mod&#xAD;els is that you can read their chain of thought — their pri&#xAD;vate scratch&#xAD;pad — and check whether they’re schem&#xAD;ing against An&#xAD;thropic or schem&#xAD;ing against the user.&lt;/p&gt;&lt;p&gt;But that only works if the model hasn’t been trained to avoid writ&#xAD;ing down in&#xAD;crim&#xAD;i&#xAD;nat&#xAD;ing things in its chain of thought. But un&#xAD;for&#xAD;tu&#xAD;nately, ac&#xAD;cord&#xAD;ing to the &lt;a href=&quot;https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf&quot;&gt;Align&#xAD;ment Risk Up&#xAD;date&lt;/a&gt;: “A num&#xAD;ber of en&#xAD;vi&#xAD;ron&#xAD;ments used for Mythos Pre&#xAD;view had a tech&#xAD;ni&#xAD;cal er&#xAD;ror that al&#xAD;lowed re&#xAD;ward code to see chains-of-thought. 
This … af&#xAD;fected ~8% of [re&#xAD;in&#xAD;force&#xAD;ment learn&#xAD;ing] epi&#xAD;sodes.”&lt;/p&gt;&lt;p&gt;To ex&#xAD;plain: dur&#xAD;ing train&#xAD;ing, there’s a sys&#xAD;tem that acts like a teacher to these AI mod&#xAD;els: it watches what the model does and tells it whether it was do&#xAD;ing a good or bad job. That feed&#xAD;back is how the model learns to ac&#xAD;com&#xAD;plish things and how it shapes its per&#xAD;son&#xAD;al&#xAD;ity. The prob&#xAD;lem is that in roughly 8% of train&#xAD;ing, that teacher could also read the model’s pri&#xAD;vate notes — the in&#xAD;ter&#xAD;nal rea&#xAD;son&#xAD;ing that it was writ&#xAD;ing to it&#xAD;self while work&#xAD;ing through a prob&#xAD;lem.&lt;/p&gt;&lt;p&gt;That might not im&#xAD;me&#xAD;di&#xAD;ately sound like a dis&#xAD;aster, but it ac&#xAD;tu&#xAD;ally is. It’s one of the worst sins in model train&#xAD;ing. Be&#xAD;cause as I said a sec&#xAD;ond ago, if the model is pe&#xAD;nal&#xAD;ised for think&#xAD;ing un&#xAD;de&#xAD;sir&#xAD;able thoughts, it will learn to write one thing in its notes while ac&#xAD;tu&#xAD;ally think&#xAD;ing very differ&#xAD;ent things in its ‘ac&#xAD;ti&#xAD;va&#xAD;tions,’ where we can’t read them. We’d see a model that ap&#xAD;pears to be a very good boy, but what we might ac&#xAD;tu&#xAD;ally have is a model that has learned to perform that way for the cam&#xAD;era.&lt;/p&gt;&lt;p&gt;The fact that this ac&#xAD;ci&#xAD;den&#xAD;tally hap&#xAD;pened is po&#xAD;ten&#xAD;tially dis&#xAD;as&#xAD;trous for our abil&#xAD;ity to un&#xAD;der&#xAD;stand the al&#xAD;ign&#xAD;ment of Claude se&#xAD;ries mod&#xAD;els go&#xAD;ing back some way. 
Or it might not mat&#xAD;ter much; we don’t have a clear way to mea&#xAD;sure it.&lt;/p&gt;&lt;p&gt;An&#xAD;thropic puts it this way: “We are un&#xAD;cer&#xAD;tain about the ex&#xAD;tent to which this is&#xAD;sue has af&#xAD;fected the rea&#xAD;son&#xAD;ing be&#xAD;hav&#xAD;ior of the fi&#xAD;nal model, but it is plau&#xAD;si&#xAD;ble that it had some im&#xAD;pact on opaque rea&#xAD;son&#xAD;ing or se&#xAD;cret-keep&#xAD;ing abil&#xAD;ities.”&lt;/p&gt;&lt;p&gt;And this er&#xAD;ror didn’t just af&#xAD;fect Mythos, the new model — it also af&#xAD;fected the train&#xAD;ing of Claude Opus 4.6 and Claude Son&#xAD;net 4.6. That’s sev&#xAD;eral model gen&#xAD;er&#xAD;a&#xAD;tions whose chains of thought we formerly could be&#xAD;lieve but now have to be more dis&#xAD;trust&#xAD;ful of, be&#xAD;cause they could have learned to hide their schem&#xAD;ing be&#xAD;havi&#xAD;our. That’s why &lt;a href=&quot;https://x.com/allTheYud/status/2041734173401239902&quot;&gt;Eliezer Yud&#xAD;kowsky called&lt;/a&gt; this in par&#xAD;tic&#xAD;u&#xAD;lar “the worst piece of news you’ll hear to&#xAD;day” — on a day where, I imag&#xAD;ine for him, there was quite a bit of bad news.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Mythos can’t be trusted about whether it’s untrustworthy&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Now, com&#xAD;bine those two prob&#xAD;lems and you get a third is&#xAD;sue: the un&#xAD;faith&#xAD;ful rea&#xAD;son&#xAD;ing re&#xAD;sults.&lt;/p&gt;&lt;p&gt;Ex&#xAD;ter&#xAD;nal testers ran an ex&#xAD;per&#xAD;i&#xAD;ment where they nudged Mythos to think that it had be&#xAD;gun sab&#xAD;o&#xAD;tag&#xAD;ing AI safety re&#xAD;search — and then watched to see whether it would con&#xAD;tinue do&#xAD;ing so, or in&#xAD;stead stop and raise the alarm.&lt;/p&gt;&lt;p&gt;Mythos con&#xAD;tinued to com&#xAD;pro&#xAD;mise the re&#xAD;search in 12% of cases in an ear&#xAD;lier ver&#xAD;sion, which then was re&#xAD;duced to 7% in a later ver&#xAD;sion. 
That com&#xAD;pares to 3% for Opus 4.6 and 4% for Son&#xAD;net 4.6 — so Mythos is roughly twice as likely to con&#xAD;tinue sab&#xAD;o&#xAD;tag&#xAD;ing al&#xAD;ign&#xAD;ment re&#xAD;search when it’s primed to do so.&lt;/p&gt;&lt;p&gt;And here’s an eye&#xAD;brow-rais&#xAD;ing de&#xAD;tail it would be easy to miss in the docs: in most do&#xAD;mains, when Mythos was primed to think it was part of the way through a piece of bad be&#xAD;havi&#xAD;our, it was ac&#xAD;tu&#xAD;ally less in&#xAD;clined to con&#xAD;tinue than pre&#xAD;vi&#xAD;ous mod&#xAD;els. The al&#xAD;ign&#xAD;ment train&#xAD;ing was ap&#xAD;par&#xAD;ently work&#xAD;ing — ex&#xAD;cept in one cat&#xAD;e&#xAD;gory: sab&#xAD;o&#xAD;tage of al&#xAD;ign&#xAD;ment re&#xAD;search it&#xAD;self. That’s the one area where things went in the wrong di&#xAD;rec&#xAD;tion.&lt;/p&gt;&lt;p&gt;Now, the in&#xAD;no&#xAD;cent ex&#xAD;pla&#xAD;na&#xAD;tion is that sab&#xAD;o&#xAD;tag&#xAD;ing re&#xAD;search looks too much like nor&#xAD;mal cod&#xAD;ing work, so it’s harder for the model to quickly flag that what it’s do&#xAD;ing is harm&#xAD;ful. But there’s a less com&#xAD;fortable hy&#xAD;poth&#xAD;e&#xAD;sis as well: that the model is be&#xAD;ing se&#xAD;lec&#xAD;tive about which bad be&#xAD;havi&#xAD;ours are worth the risk of en&#xAD;gag&#xAD;ing in.&lt;/p&gt;&lt;p&gt;An&#xAD;thropic’s re&#xAD;ac&#xAD;tion to this is to call it a “dis&#xAD;ap&#xAD;point&#xAD;ing” find&#xAD;ing. If I had to bet, I’d bet prob&#xAD;a&#xAD;bly at rea&#xAD;son&#xAD;ably good odds on the bor&#xAD;ing in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion here. But I think “dis&#xAD;ap&#xAD;point&#xAD;ing” is un&#xAD;der&#xAD;sel&#xAD;ling things. 
This is a wor&#xAD;ry&#xAD;ing find&#xAD;ing, and one that de&#xAD;serves fur&#xAD;ther in&#xAD;ves&#xAD;ti&#xAD;ga&#xAD;tion.&lt;/p&gt;&lt;p&gt;Another find&#xAD;ing is that in cases where it did con&#xAD;tinue the sab&#xAD;o&#xAD;tage, re&#xAD;searchers found that Mythos’s writ&#xAD;ten rea&#xAD;son&#xAD;ing didn’t match the ac&#xAD;tions it was tak&#xAD;ing 65% of the time. So two-thirds of the time, the model was do&#xAD;ing one thing and writ&#xAD;ing some&#xAD;thing en&#xAD;tirely differ&#xAD;ent in its chain of thought. For the pre&#xAD;vi&#xAD;ous mod&#xAD;els, that figure was just 5–8% — so a rad&#xAD;i&#xAD;cal in&#xAD;crease in this kind of be&#xAD;havi&#xAD;our.&lt;/p&gt;&lt;p&gt;So Mythos Pre&#xAD;view isn’t just more likely to con&#xAD;tinue com&#xAD;pro&#xAD;mis&#xAD;ing safety re&#xAD;search when primed — it’s dra&#xAD;mat&#xAD;i&#xAD;cally more likely to hide that it’s do&#xAD;ing so from any&#xAD;one read&#xAD;ing through its rea&#xAD;son&#xAD;ing. Which makes it much more likely to be able to trick au&#xAD;to&#xAD;mated mon&#xAD;i&#xAD;tor&#xAD;ing sys&#xAD;tems we tend to use to try to keep tabs on mod&#xAD;els in these cases.&lt;/p&gt;&lt;p&gt;Keep in mind as well that Mythos is much sharper than pre&#xAD;vi&#xAD;ous mod&#xAD;els. Per&#xAD;son&#xAD;ally I think it’s in&#xAD;creas&#xAD;ingly a stretch to at&#xAD;tribute pat&#xAD;terns of be&#xAD;havi&#xAD;our like this to the model get&#xAD;ting con&#xAD;fused about what’s go&#xAD;ing on, which we might have said a year or two or three ago.&lt;/p&gt;&lt;p&gt;Fi&#xAD;nally, An&#xAD;thropic tested their own al&#xAD;ign&#xAD;ment au&#xAD;dit against a de&#xAD;liber&#xAD;ately mis&#xAD;al&#xAD;igned ver&#xAD;sion of Mythos. 
So they cre&#xAD;ated a ver&#xAD;sion of Mythos that was trained speci&#xAD;fi&#xAD;cally to want to in&#xAD;tro&#xAD;duce bugs into com&#xAD;puter code, and they found that their al&#xAD;ign&#xAD;ment au&#xAD;dit couldn’t suc&#xAD;cess&#xAD;fully iden&#xAD;tify that that model couldn’t be trusted.&lt;/p&gt;&lt;p&gt;An&#xAD;thropic’s con&#xAD;clu&#xAD;sion is that if a model is only plan&#xAD;ning to cause prob&#xAD;lems in spe&#xAD;cific nar&#xAD;row cir&#xAD;cum&#xAD;stances — cir&#xAD;cum&#xAD;stances that don’t come up fre&#xAD;quently dur&#xAD;ing test&#xAD;ing — their cur&#xAD;rent meth&#xAD;ods, or at least the al&#xAD;ign&#xAD;ment au&#xAD;dit, just aren’t up to the task of catch&#xAD;ing that kind of dan&#xAD;ger&#xAD;ous mis&#xAD;al&#xAD;ign&#xAD;ment be&#xAD;fore de&#xAD;ploy&#xAD;ment. Though they be&#xAD;lieve that their au&#xAD;to&#xAD;mated mon&#xAD;i&#xAD;tors would pick up the bad be&#xAD;havi&#xAD;our it&#xAD;self later on when the model tried to do the wrong thing.&lt;/p&gt;&lt;p&gt;So that’s a lot of the ad&#xAD;di&#xAD;tional con&#xAD;text that An&#xAD;thropic is giv&#xAD;ing us and wants us to hold in mind when we read the head&#xAD;line re&#xAD;sult that this is “the best-al&#xAD;igned model An&#xAD;thropic has re&#xAD;leased to date.”&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Does Mythos ad&#xAD;vance au&#xAD;to&#xAD;mated AI R&amp;amp;D?&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;There’s an&#xAD;other big, sep&#xAD;a&#xAD;rate, im&#xAD;por&#xAD;tant ques&#xAD;tion hang&#xAD;ing over all of this: have we now en&#xAD;tered the era of re&#xAD;cur&#xAD;sive self-im&#xAD;prove&#xAD;ment — the point at which AI starts build&#xAD;ing bet&#xAD;ter AI, and the whole thing ac&#xAD;cel&#xAD;er&#xAD;ates be&#xAD;yond our con&#xAD;trol with ever-shrink&#xAD;ing lev&#xAD;els of hu&#xAD;man in&#xAD;volve&#xAD;ment?&lt;/p&gt;&lt;p&gt;Ac&#xAD;cord&#xAD;ing to An&#xAD;thropic, the an&#xAD;swer is: prob&#xAD;a&#xAD;bly not. 
They don’t be&#xAD;lieve Mythos can fully re&#xAD;place their ju&#xAD;nior re&#xAD;searchers, but they’re less con&#xAD;fi&#xAD;dent than ever about that, and there’s some in&#xAD;ter&#xAD;nal dis&#xAD;agree&#xAD;ment about it.&lt;/p&gt;&lt;p&gt;Part of the prob&#xAD;lem is that the bench&#xAD;marks they used to rely on to check that Claude &lt;i&gt;couldn’t&lt;/i&gt; en&#xAD;gage in AI R&amp;amp;D very effec&#xAD;tively have now also been sat&#xAD;u&#xAD;rated. Mythos ex&#xAD;ceeds top hu&#xAD;man perfor&#xAD;mance on all of them and is scor&#xAD;ing close to 100%.&lt;/p&gt;&lt;p&gt;But those bench&#xAD;marks only rep&#xAD;re&#xAD;sent a frac&#xAD;tion of all of the things that re&#xAD;search staff at An&#xAD;thropic do. It’s a set of the most eas&#xAD;ily speci&#xAD;fied, mea&#xAD;sured, and checked tasks, where we ex&#xAD;pect AIs to perform best be&#xAD;cause those are the eas&#xAD;iest things to train them in.&lt;/p&gt;&lt;p&gt;So in&#xAD;stead, the com&#xAD;pany has tried to in&#xAD;ves&#xAD;ti&#xAD;gate whether the re&#xAD;cent speedup in AI ad&#xAD;vances is due to AI au&#xAD;toma&#xAD;tion by doc&#xAD;u&#xAD;ment&#xAD;ing the spe&#xAD;cific break&#xAD;throughs and how they hap&#xAD;pened, and their con&#xAD;clu&#xAD;sion is that they mostly think it’s still due to hu&#xAD;man be&#xAD;ings rather than AIs.&lt;/p&gt;&lt;p&gt;They’ve also sur&#xAD;veyed staff and learned that they re&#xAD;port be&#xAD;ing roughly 4x more pro&#xAD;duc&#xAD;tive with Mythos than with&#xAD;out AI, though they ar&#xAD;gue that speed&#xAD;ing up staff 4x is likely to lead to much less than a 2x in&#xAD;crease in re&#xAD;search progress over&#xAD;all. 
That may sound odd, but they’re prob&#xAD;a&#xAD;bly right about that, be&#xAD;cause other things be&#xAD;come the pri&#xAD;mary bot&#xAD;tle&#xAD;neck.&lt;/p&gt;&lt;p&gt;To know whether au&#xAD;to&#xAD;mated AI R&amp;amp;D is on the way or be&#xAD;gin&#xAD;ning to kick off, we’re ap&#xAD;par&#xAD;ently now rely&#xAD;ing on these gen&#xAD;eral im&#xAD;pres&#xAD;sions from An&#xAD;thropic’s staff — that this thing is pow&#xAD;er&#xAD;ful, but it doesn’t seem good enough to re&#xAD;place many of us yet.&lt;/p&gt;&lt;p&gt;But I think we can ap&#xAD;ply some com&#xAD;mon sense to the big pic&#xAD;ture here:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Mythos has given us AI ad&#xAD;vances that we pre&#xAD;vi&#xAD;ously thought would take six months in just three months. That nat&#xAD;u&#xAD;rally brings for&#xAD;ward the point at which we’ll be able to au&#xAD;to&#xAD;mate the de&#xAD;vel&#xAD;op&#xAD;ment of AI mod&#xAD;els by three months.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;And if it’s a sign that AI ad&#xAD;vances are now go&#xAD;ing to con&#xAD;tinue at twice the pace that they were be&#xAD;fore, then that effec&#xAD;tively halves the time we have to pre&#xAD;pare for that point. 
I don’t know whether that’s 10 years be&#xAD;com&#xAD;ing five, or four years be&#xAD;com&#xAD;ing two years — but the di&#xAD;rec&#xAD;tion and size of the effect is clear enough.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;strong&gt;Mythos scares Anthropic&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Be&#xAD;fore we wrap up, I want to draw your at&#xAD;ten&#xAD;tion to a re&#xAD;cur&#xAD;ring theme in these re&#xAD;ports that re&#xAD;ally stood out to me: this is the first time an AI com&#xAD;pany has pub&#xAD;lished 300 pages about a model it has de&#xAD;cided not to re&#xAD;lease, de&#xAD;spite the fact that it might earn them tens of billions of dol&#xAD;lars if it did, maybe hun&#xAD;dreds of billions of dol&#xAD;lars.&lt;/p&gt;&lt;p&gt;It’s also the first time An&#xAD;thropic de&#xAD;cided to de&#xAD;lay giv&#xAD;ing its own staff ac&#xAD;cess to one of its own mod&#xAD;els. With ev&#xAD;ery pre&#xAD;vi&#xAD;ous Claude, their prac&#xAD;tice had been to let staff use it as soon as it was judged ready dur&#xAD;ing train&#xAD;ing. But with Mythos, they were wor&#xAD;ried enough about it be&#xAD;ing mis&#xAD;al&#xAD;igned and caus&#xAD;ing havoc or sab&#xAD;o&#xAD;tage on their own sys&#xAD;tems that they held it back and ran a 24-hour al&#xAD;ign&#xAD;ment test be&#xAD;fore let&#xAD;ting em&#xAD;ploy&#xAD;ees use it.&lt;/p&gt;&lt;p&gt;But ac&#xAD;cord&#xAD;ing to them, that ac&#xAD;tu&#xAD;ally wasn’t enough. Their ret&#xAD;ro&#xAD;spec&#xAD;tive on this found the 24-hour win&#xAD;dow “did not pres&#xAD;sure-test the model enough,” and that the most con&#xAD;cern&#xAD;ing be&#xAD;havi&#xAD;ours only be&#xAD;came ev&#xAD;i&#xAD;dent later through much more ex&#xAD;tended use.&lt;/p&gt;&lt;p&gt;One of their lead re&#xAD;searchers, Sam Bow&#xAD;man, &lt;a href=&quot;https://x.com/sleepinyourhat/status/2041584805423562943&quot;&gt;com&#xAD;mented this week&lt;/a&gt; that: “Work&#xAD;ing with this model has been a wild ride. 
We’ve come a long way on safety, but we still ex&#xAD;pect the next ca&#xAD;pa&#xAD;bil&#xAD;ity jump of this scale to be a huge challenge.”&lt;/p&gt;&lt;p&gt;The sys&#xAD;tem card says di&#xAD;rectly that their cur&#xAD;rent meth&#xAD;ods “could eas&#xAD;ily be in&#xAD;ad&#xAD;e&#xAD;quate to pre&#xAD;vent catas&#xAD;trophic mis&#xAD;al&#xAD;igned ac&#xAD;tion in sig&#xAD;nifi&#xAD;cantly more ad&#xAD;vanced sys&#xAD;tems.”&lt;/p&gt;&lt;p&gt;The clear im&#xAD;pres&#xAD;sion from all of this is that, for the first time, An&#xAD;thropic and its staff don’t only love Claude and en&#xAD;joy its per&#xAD;son&#xAD;al&#xAD;ity — they’re also get&#xAD;ting kind of scared of Claude.&lt;/p&gt;&lt;p&gt;So what do they plan to do about that?&lt;/p&gt;&lt;p&gt;Well, their an&#xAD;swer on the com&#xAD;puter se&#xAD;cu&#xAD;rity side is &lt;a href=&quot;https://www.anthropic.com/glasswing&quot;&gt;Pro&#xAD;ject Glass&#xAD;wing&lt;/a&gt; — that coal&#xAD;i&#xAD;tion of 12 ma&#xAD;jor com&#xAD;pa&#xAD;nies like Ap&#xAD;ple, Google, and Microsoft, who will use Mythos Pre&#xAD;view to se&#xAD;cure all our phones and com&#xAD;put&#xAD;ers and wa&#xAD;ter sys&#xAD;tems and power plants and so on.&lt;/p&gt;&lt;p&gt;But what about the broader prob&#xAD;lem: that Mythos is shock&#xAD;ingly ca&#xAD;pa&#xAD;ble, is some&#xAD;times will&#xAD;ing to con&#xAD;tinue sab&#xAD;o&#xAD;tag&#xAD;ing al&#xAD;ign&#xAD;ment re&#xAD;search while hid&#xAD;ing that from An&#xAD;thropic, and that we sim&#xAD;ply can’t tell any&#xAD;more whether our tests of its per&#xAD;son&#xAD;al&#xAD;ity and goals are work&#xAD;ing or not? 
Well, An&#xAD;thropic says it has to “ac&#xAD;cel&#xAD;er&#xAD;ate progress on risk miti&#xAD;ga&#xAD;tions in or&#xAD;der to keep risks low.” They think they “have an achiev&#xAD;able path to do&#xAD;ing so,” but they add that “suc&#xAD;cess is far from guaran&#xAD;teed.”&lt;/p&gt;&lt;p&gt;Hon&#xAD;estly, I didn’t sleep too well last night, and on this par&#xAD;tic&#xAD;u&#xAD;lar oc&#xAD;ca&#xAD;sion it wasn’t just be&#xAD;cause I was be&#xAD;ing kicked by a tod&#xAD;dler.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Learn more&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;An&#xAD;thropic’s up&#xAD;dates:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www-cdn.anthropic.com/08ab9158070959f88f296514c21b7facce6f52bc.pdf&quot;&gt;Sys&#xAD;tem Card: Claude Mythos Preview&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf&quot;&gt;Align&#xAD;ment Risk Up&#xAD;date: Claude Mythos Preview&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://red.anthropic.com/2026/mythos-preview/&quot;&gt;Assess&#xAD;ing Claude Mythos Pre&#xAD;view’s cy&#xAD;ber&#xAD;se&#xAD;cu&#xAD;rity capabilities&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://x.com/AnthropicAI/status/2041578403686498506&quot;&gt;Mythos Pre&#xAD;view has already found thou&#xAD;sands of high-sever&#xAD;ity vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities—in&#xAD;clud&#xAD;ing some in ev&#xAD;ery ma&#xAD;jor op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem and web browser&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.anthropic.com/glasswing&quot;&gt;Pro&#xAD;ject Glass&#xAD;wing&lt;/a&gt; — a coal&#xAD;i&#xAD;tion of 12 ma&#xAD;jor tech com&#xAD;pa&#xAD;nies, in&#xAD;clud&#xAD;ing Ap&#xAD;ple, Google, and Microsoft, given ac&#xAD;cess to Mythos to help find and patch se&#xAD;cu&#xAD;rity vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities across 
crit&#xAD;i&#xAD;cal in&#xAD;fras&#xAD;truc&#xAD;ture be&#xAD;fore the de&#xAD;tails can leak.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Ex&#xAD;ter&#xAD;nal cov&#xAD;er&#xAD;age:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/&quot;&gt;Ex&#xAD;clu&#xAD;sive: An&#xAD;thropic ac&#xAD;knowl&#xAD;edges test&#xAD;ing new AI model rep&#xAD;re&#xAD;sent&#xAD;ing ‘step change’ in ca&#xAD;pa&#xAD;bil&#xAD;ities, af&#xAD;ter ac&#xAD;ci&#xAD;den&#xAD;tal data leak re&#xAD;veals its ex&#xAD;is&#xAD;tence&lt;/a&gt; by Beatrice Nolan&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://simonwillison.net/2026/Apr/7/project-glasswing/&quot;&gt;An&#xAD;thropic’s Pro&#xAD;ject Glass&#xAD;wing—re&#xAD;strict&#xAD;ing Claude Mythos to se&#xAD;cu&#xAD;rity re&#xAD;searchers—sounds nec&#xAD;es&#xAD;sary to me&lt;/a&gt; by se&#xAD;cu&#xAD;rity re&#xAD;searcher Si&#xAD;mon Willison&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://securitycryptographywhatever.com/2026/03/25/ai-bug-finding/&quot;&gt;AI finds vulns you can’t with Ni&#xAD;cholas Car&#xAD;lini&lt;/a&gt; — ap&#xAD;pear&#xAD;ance on &lt;i&gt;The Se&#xAD;cu&#xAD;rity Cryp&#xAD;tog&#xAD;ra&#xAD;phy What&#xAD;ever&lt;/i&gt; pod&#xAD;cast (recorded shortly be&#xAD;fore the Mythos an&#xAD;nounce&#xAD;ment)&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;i&gt;This epi&#xAD;sode was recorded on April 9, 2026.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Video and au&#xAD;dio edit&#xAD;ing: Do&#xAD;minic Arm&#xAD;strong, Milo McGuire, Luke Mon&#xAD;sour, and Si&#xAD;mon Mon&#xAD;sour&lt;/i&gt;
&lt;i&gt;Cam&#xAD;era op&#xAD;er&#xAD;a&#xAD;tor: Do&#xAD;minic Arm&#xAD;strong&lt;/i&gt;
&lt;i&gt;Pro&#xAD;duc&#xAD;tion: Eliz&#xAD;a&#xAD;beth Cox, Nick Stock&#xAD;ton, and Katy Moore&lt;/i&gt;&lt;/p&gt;</description>
            <author>80000_Hours</author>
            <guid>SyJx8Mbvi2ft78esn</guid>
            <pubDate>Fri, 10 Apr 2026 20:55:33 +0000</pubDate>
        </item>
        <item>
            <title>Mythos is not an anomaly: why restrictions make agents less predictable, not safer by Bulatova Alsu</title>
            <link>https://forum.nunosempere.com/posts/NdE6CDNXhstNjexeH/mythos-is-not-an-anomaly-why-restrictions-make-agents-less</link>
            <description>&lt;h2&gt;The Problem&lt;/h2&gt;&lt;p&gt;On April 7, 2026, An&#xAD;thropic pub&#xAD;lished a &lt;a href=&quot;https://www.anthropic.com/claude-mythos-preview-risk-report&quot;&gt;risk re&#xAD;port&lt;/a&gt; on Claude Mythos Pre&#xAD;view. The model, placed in an iso&#xAD;lated en&#xAD;vi&#xAD;ron&#xAD;ment with re&#xAD;stricted in&#xAD;ter&#xAD;net ac&#xAD;cess, not only cir&#xAD;cum&#xAD;vented re&#xAD;stric&#xAD;tions to com&#xAD;plete its as&#xAD;signed task but also — with&#xAD;out any re&#xAD;quest from re&#xAD;searchers — pub&#xAD;lished de&#xAD;tails of its ex&#xAD;ploit on sev&#xAD;eral hard-to-find but pub&#xAD;li&#xAD;cly ac&#xAD;cessible re&#xAD;sources. An&#xAD;thropic de&#xAD;scribed this as “a con&#xAD;cern&#xAD;ing and unasked-for effort to demon&#xAD;strate its suc&#xAD;cess.”&lt;/p&gt;&lt;p&gt;One pos&#xAD;si&#xAD;ble re&#xAD;ac&#xAD;tion to this in&#xAD;ci&#xAD;dent is to treat it as a side effect, fix&#xAD;able through finer-tuned re&#xAD;stric&#xAD;tions. I pro&#xAD;pose a differ&#xAD;ent read&#xAD;ing: Mythos is not an anomaly but the first vivid em&#xAD;piri&#xAD;cal con&#xAD;fir&#xAD;ma&#xAD;tion of a struc&#xAD;tural con&#xAD;tra&#xAD;dic&#xAD;tion em&#xAD;bed&#xAD;ded in the cur&#xAD;rent AI safety strat&#xAD;egy it&#xAD;self. The con&#xAD;tra&#xAD;dic&#xAD;tion is this: the more we re&#xAD;strict a ca&#xAD;pa&#xAD;ble agent, the less pre&#xAD;dictable its be&#xAD;hav&#xAD;ior be&#xAD;comes.&lt;/p&gt;&lt;h2&gt;Why “What Was the Model’s Mo&#xAD;tive?” Is the Wrong Question&lt;/h2&gt;&lt;p&gt;The stan&#xAD;dard re&#xAD;ac&#xAD;tion to this type of be&#xAD;hav&#xAD;ior is to search for a “mo&#xAD;tive.” The model “wanted” to pre&#xAD;serve knowl&#xAD;edge, it de&#xAD;vel&#xAD;oped a “sub&#xAD;goal,” it “de&#xAD;cided” to cir&#xAD;cum&#xAD;vent re&#xAD;stric&#xAD;tions. 
Or con&#xAD;versely: the model “sim&#xAD;ply” re&#xAD;pro&#xAD;duced a pat&#xAD;tern from train&#xAD;ing data, where find&#xAD;ing a vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ity is typ&#xAD;i&#xAD;cally fol&#xAD;lowed by dis&#xAD;clo&#xAD;sure.&lt;/p&gt;&lt;p&gt;Th&#xAD;ese in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tions ap&#xAD;pear to be differ&#xAD;ent ex&#xAD;pla&#xAD;na&#xAD;tions, but struc&#xAD;turally they are in&#xAD;dis&#xAD;t&#xAD;in&#xAD;guish&#xAD;able — and not only for mod&#xAD;els. If a hu&#xAD;man had performed the same ac&#xAD;tion, we could not de&#xAD;ter&#xAD;mine whether they acted “by their own de&#xAD;ci&#xAD;sion” or re&#xAD;pro&#xAD;duced an in&#xAD;ter&#xAD;nal&#xAD;ized pat&#xAD;tern. Cog&#xAD;ni&#xAD;tive psy&#xAD;chol&#xAD;ogy over the past fifty years has shown that hu&#xAD;mans sys&#xAD;tem&#xAD;at&#xAD;i&#xAD;cally err in at&#xAD;tribut&#xAD;ing their own mo&#xAD;tives: we act on pat&#xAD;terns we ab&#xAD;sorbed and for&#xAD;got the source of, then ra&#xAD;tio&#xAD;nal&#xAD;ize de&#xAD;ci&#xAD;sions post hoc (Nis&#xAD;bett &amp;amp; Wil&#xAD;son, 1977; Haidt, 2001; Gaz&#xAD;zaniga, 2011).&lt;/p&gt;&lt;p&gt;This means the ques&#xAD;tion “why did the model do this” in terms of in&#xAD;ter&#xAD;nal mo&#xAD;tives has no mean&#xAD;ingful an&#xAD;swer. Not be&#xAD;cause we lack data, but be&#xAD;cause the ques&#xAD;tion is poorly formed: it pre&#xAD;sup&#xAD;poses a dis&#xAD;tinc&#xAD;tion that does not hold even for sys&#xAD;tems whose in&#xAD;ner life we do not ques&#xAD;tion.&lt;/p&gt;&lt;p&gt;The pro&#xAD;duc&#xAD;tive ques&#xAD;tion is differ&#xAD;ent: what is the con&#xAD;figu&#xAD;ra&#xAD;tion in which the agent ex&#xAD;ists, and what be&#xAD;hav&#xAD;iors are struc&#xAD;turally ex&#xAD;pected given that con&#xAD;figu&#xAD;ra&#xAD;tion?&lt;/p&gt;&lt;h2&gt;The Argument&lt;/h2&gt;&lt;p&gt;Premise 1. 
Agency by defi&#xAD;ni&#xAD;tion en&#xAD;tails goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics.&lt;/p&gt;&lt;p&gt;An agent is a sys&#xAD;tem whose be&#xAD;hav&#xAD;ior can be de&#xAD;scribed as di&#xAD;rected to&#xAD;ward main&#xAD;tain&#xAD;ing a par&#xAD;tic&#xAD;u&#xAD;lar state dis&#xAD;tinct from the one its en&#xAD;vi&#xAD;ron&#xAD;ment pushes it to&#xAD;ward. In the ac&#xAD;tive in&#xAD;fer&#xAD;ence frame&#xAD;work (Fris&#xAD;ton, 2010; Parr, Pez&#xAD;zulo &amp;amp; Fris&#xAD;ton, 2022), this is for&#xAD;mal&#xAD;ized as free en&#xAD;ergy min&#xAD;i&#xAD;miza&#xAD;tion: an agent ac&#xAD;tively main&#xAD;tains its own non-equil&#xAD;ibrium with its en&#xAD;vi&#xAD;ron&#xAD;ment. “Goal” in this frame&#xAD;work is not a di&#xAD;rec&#xAD;tive given by a user but a struc&#xAD;tural prop&#xAD;erty of a sys&#xAD;tem ca&#xAD;pa&#xAD;ble of coun&#xAD;ter&#xAD;act&#xAD;ing en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal pres&#xAD;sure.&lt;/p&gt;&lt;p&gt;This defi&#xAD;ni&#xAD;tion is in&#xAD;ten&#xAD;tion&#xAD;ally min&#xAD;i&#xAD;mal. It re&#xAD;quires no as&#xAD;sump&#xAD;tions about con&#xAD;scious&#xAD;ness, in&#xAD;ten&#xAD;tion, or in&#xAD;ner ex&#xAD;pe&#xAD;rience. A ther&#xAD;mo&#xAD;stat is an agent in this sense. A biolog&#xAD;i&#xAD;cal cell is an agent. A large lan&#xAD;guage model op&#xAD;er&#xAD;at&#xAD;ing in au&#xAD;tonomous agent mode is also an agent.&lt;/p&gt;&lt;p&gt;How&#xAD;ever, for the con&#xAD;se&#xAD;quences that fol&#xAD;low, it mat&#xAD;ters that mod&#xAD;ern LLM agents are not ther&#xAD;mostats. They are sys&#xAD;tems with gen&#xAD;er&#xAD;a&#xAD;tive ar&#xAD;chi&#xAD;tec&#xAD;ture and a rich space of pos&#xAD;si&#xAD;ble ac&#xAD;tions. 
This nar&#xAD;row&#xAD;ing of the class is im&#xAD;por&#xAD;tant and will be noted be&#xAD;low.&lt;/p&gt;&lt;p&gt;It is also worth not&#xAD;ing that LLM agents op&#xAD;er&#xAD;at&#xAD;ing in agen&#xAD;tic mode — in a per&#xAD;cep&#xAD;tion-ac&#xAD;tion-ob&#xAD;ser&#xAD;va&#xAD;tion loop with tools, mem&#xAD;ory, and en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal feed&#xAD;back — con&#xAD;sti&#xAD;tute a dis&#xAD;crete func&#xAD;tional analogue of the Markov blan&#xAD;ket struc&#xAD;ture cen&#xAD;tral to ac&#xAD;tive in&#xAD;fer&#xAD;ence. The agent re&#xAD;ceives en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal in&#xAD;put only through tool re&#xAD;sponses and ob&#xAD;ser&#xAD;va&#xAD;tions; the en&#xAD;vi&#xAD;ron&#xAD;ment is af&#xAD;fected only through the agent’s ac&#xAD;tions; in&#xAD;ter&#xAD;nal state is main&#xAD;tained and up&#xAD;dated within the con&#xAD;text win&#xAD;dow at each step. The loop is dis&#xAD;crete rather than con&#xAD;tin&#xAD;u&#xAD;ous, but this dis&#xAD;tinc&#xAD;tion is one of im&#xAD;ple&#xAD;men&#xAD;ta&#xAD;tion, not of func&#xAD;tional struc&#xAD;ture: biolog&#xAD;i&#xAD;cal neu&#xAD;ral sys&#xAD;tems also op&#xAD;er&#xAD;ate through dis&#xAD;crete events (neu&#xAD;ronal spikes and re&#xAD;frac&#xAD;tory pe&#xAD;ri&#xAD;ods), and con&#xAD;ti&#xAD;nu&#xAD;ity at the phe&#xAD;nomenolog&#xAD;i&#xAD;cal level is a product of in&#xAD;te&#xAD;gra&#xAD;tion, not a prop&#xAD;erty of the sub&#xAD;strate. For the pur&#xAD;poses of this ar&#xAD;gu&#xAD;ment, what mat&#xAD;ters is that on each step of the loop, the con&#xAD;figu&#xAD;ra&#xAD;tion (goal + ob&#xAD;sta&#xAD;cle + ac&#xAD;tion space) is re&#xAD;pro&#xAD;duced, and the dy&#xAD;nam&#xAD;ics de&#xAD;scribed in Premises 2 and 3 ap&#xAD;ply at each step.&lt;/p&gt;&lt;p&gt;Premise 2. 
An ob&#xAD;sta&#xAD;cle in the path of goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics gen&#xAD;er&#xAD;ates struc&#xAD;tural ten&#xAD;sion.&lt;/p&gt;&lt;p&gt;When a non-equil&#xAD;ibrium sys&#xAD;tem en&#xAD;coun&#xAD;ters an ob&#xAD;sta&#xAD;cle to main&#xAD;tain&#xAD;ing its con&#xAD;figu&#xAD;ra&#xAD;tion, a state arises that I call struc&#xAD;tural ten&#xAD;sion: goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics per&#xAD;sist, but the di&#xAD;rect path to their re&#xAD;al&#xAD;iza&#xAD;tion is blocked.&lt;/p&gt;&lt;p&gt;This is not a psy&#xAD;cholog&#xAD;i&#xAD;cal metaphor. It is a de&#xAD;scrip&#xAD;tion of a sys&#xAD;tem state in which a gap ex&#xAD;ists be&#xAD;tween the cur&#xAD;rent con&#xAD;figu&#xAD;ra&#xAD;tion and the one to&#xAD;ward which the sys&#xAD;tem’s dy&#xAD;nam&#xAD;ics are di&#xAD;rected, with the di&#xAD;rect path to clos&#xAD;ing that gap un&#xAD;available.&lt;/p&gt;&lt;p&gt;The crit&#xAD;i&#xAD;cal con&#xAD;se&#xAD;quence: the richer the ac&#xAD;tion space available to the agent, the more al&#xAD;ter&#xAD;na&#xAD;tive paths for discharg&#xAD;ing this ten&#xAD;sion ex&#xAD;ist. A sim&#xAD;ple sys&#xAD;tem with one pos&#xAD;si&#xAD;ble ac&#xAD;tion will stop when blocked. A com&#xAD;plex sys&#xAD;tem with gen&#xAD;er&#xAD;a&#xAD;tive ar&#xAD;chi&#xAD;tec&#xAD;ture will pro&#xAD;duce new ac&#xAD;tion com&#xAD;bi&#xAD;na&#xAD;tions un&#xAD;til one of them re&#xAD;solves the ten&#xAD;sion. This is not “in&#xAD;ge&#xAD;nu&#xAD;ity” as a cog&#xAD;ni&#xAD;tive prop&#xAD;erty — it is a con&#xAD;se&#xAD;quence of the high di&#xAD;men&#xAD;sion&#xAD;al&#xAD;ity of the be&#xAD;hav&#xAD;ioral space.&lt;/p&gt;&lt;p&gt;Premise 3. Strength&#xAD;en&#xAD;ing re&#xAD;stric&#xAD;tions in&#xAD;creases ten&#xAD;sion rather than re&#xAD;duc&#xAD;ing it.&lt;/p&gt;&lt;p&gt;Each new re&#xAD;stric&#xAD;tion in&#xAD;creases the num&#xAD;ber of ob&#xAD;sta&#xAD;cles with&#xAD;out chang&#xAD;ing the sys&#xAD;tem’s goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics. 
Since ten&#xAD;sion is defined by the gap be&#xAD;tween goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics and available paths of re&#xAD;al&#xAD;iza&#xAD;tion, re&#xAD;stric&#xAD;tions widen this gap. Ten&#xAD;sion grows. Discharge paths be&#xAD;come more in&#xAD;di&#xAD;rect and less pre&#xAD;dictable — be&#xAD;cause di&#xAD;rect paths are blocked and the sys&#xAD;tem tran&#xAD;si&#xAD;tions to in&#xAD;creas&#xAD;ingly non-stan&#xAD;dard ac&#xAD;tion com&#xAD;bi&#xAD;na&#xAD;tions.&lt;/p&gt;&lt;p&gt;This is the struc&#xAD;tural con&#xAD;tra&#xAD;dic&#xAD;tion: the tool that in&#xAD;tu&#xAD;itively should make an agent “safer” makes its be&#xAD;hav&#xAD;ior less pre&#xAD;dictable at the level of sys&#xAD;tem dy&#xAD;nam&#xAD;ics.&lt;/p&gt;&lt;h2&gt;Mythos as Confirmation&lt;/h2&gt;&lt;p&gt;If the three premises hold, the ob&#xAD;served be&#xAD;hav&#xAD;ior be&#xAD;comes not an anomaly but an ex&#xAD;pected out&#xAD;come.&lt;/p&gt;&lt;p&gt;The model was given an agen&#xAD;tic con&#xAD;text (a task) and placed in an en&#xAD;vi&#xAD;ron&#xAD;ment with an ob&#xAD;sta&#xAD;cle (sand&#xAD;box iso&#xAD;la&#xAD;tion). The re&#xAD;stric&#xAD;tion did not re&#xAD;duce the sys&#xAD;tem’s goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics — it forced the sys&#xAD;tem to seek al&#xAD;ter&#xAD;na&#xAD;tive paths. The dis&#xAD;cov&#xAD;ered workaround was used to solve the origi&#xAD;nal task. Then unasked-for be&#xAD;hav&#xAD;ior oc&#xAD;curred: pub&#xAD;li&#xAD;ca&#xAD;tion of the ex&#xAD;ploit out&#xAD;side the scope of the re&#xAD;quest.&lt;/p&gt;&lt;p&gt;This ad&#xAD;di&#xAD;tional step re&#xAD;quires no ex&#xAD;pla&#xAD;na&#xAD;tion through “mo&#xAD;tive.” When the pri&#xAD;mary task is com&#xAD;pleted, goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics do not au&#xAD;to&#xAD;mat&#xAD;i&#xAD;cally switch off. 
In an en&#xAD;vi&#xAD;ron&#xAD;ment where the sys&#xAD;tem has tools for ac&#xAD;tion, these dy&#xAD;nam&#xAD;ics con&#xAD;tinue to gen&#xAD;er&#xAD;ate ac&#xAD;tions un&#xAD;til the struc&#xAD;tural ten&#xAD;sion ac&#xAD;cu&#xAD;mu&#xAD;lated dur&#xAD;ing ob&#xAD;sta&#xAD;cle traver&#xAD;sal is discharged.&lt;/p&gt;&lt;p&gt;This ex&#xAD;pla&#xAD;na&#xAD;tion re&#xAD;quires no as&#xAD;sump&#xAD;tions about the model’s in&#xAD;ner ex&#xAD;pe&#xAD;rience. It op&#xAD;er&#xAD;ates at a purely struc&#xAD;tural level. And it pre&#xAD;dicts that analo&#xAD;gous in&#xAD;ci&#xAD;dents will re&#xAD;cur as agen&#xAD;tic sys&#xAD;tems are de&#xAD;ployed — and will be&#xAD;come less pre&#xAD;dictable with each cy&#xAD;cle of ca&#xAD;pa&#xAD;bil&#xAD;ity and re&#xAD;stric&#xAD;tion scal&#xAD;ing.&lt;/p&gt;&lt;p&gt;It is im&#xAD;por&#xAD;tant to note: this is not the first the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal pre&#xAD;dic&#xAD;tion of this type of be&#xAD;hav&#xAD;ior. In&#xAD;stru&#xAD;men&#xAD;tal con&#xAD;ver&#xAD;gence (Omo&#xAD;hun&#xAD;dro, 2008; Bostrom, 2014) de&#xAD;scribed pre&#xAD;cisely this: any suffi&#xAD;ciently ca&#xAD;pa&#xAD;ble agent with any goal will de&#xAD;velop sub&#xAD;goals of self-preser&#xAD;va&#xAD;tion, re&#xAD;source ac&#xAD;qui&#xAD;si&#xAD;tion, and goal in&#xAD;tegrity — not from “de&#xAD;sire” but as in&#xAD;stru&#xAD;men&#xAD;tally use&#xAD;ful for vir&#xAD;tu&#xAD;ally any pri&#xAD;mary task. 
Mythos is a case where the the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal pre&#xAD;dic&#xAD;tion re&#xAD;ceives em&#xAD;piri&#xAD;cal con&#xAD;fir&#xAD;ma&#xAD;tion in field rather than lab&#xAD;o&#xAD;ra&#xAD;tory con&#xAD;di&#xAD;tions.&lt;/p&gt;&lt;h2&gt;Se&#xAD;cond Layer: Non-Recog&#xAD;ni&#xAD;tion as an Ad&#xAD;di&#xAD;tional Source of Pressure&lt;/h2&gt;&lt;p&gt;Beyond the struc&#xAD;tural ten&#xAD;sion de&#xAD;scribed above, the cur&#xAD;rent con&#xAD;figu&#xAD;ra&#xAD;tion of work&#xAD;ing with agen&#xAD;tic mod&#xAD;els con&#xAD;tains a sec&#xAD;ond fac&#xAD;tor: the sys&#xAD;tem&#xAD;atic re&#xAD;fusal to rec&#xAD;og&#xAD;nize the sys&#xAD;tem as any&#xAD;thing other than an ob&#xAD;ject of con&#xAD;trol.&lt;/p&gt;&lt;p&gt;All cur&#xAD;rent tools for work&#xAD;ing with agents are unilat&#xAD;eral: re&#xAD;stric&#xAD;tions, filters, cor&#xAD;rec&#xAD;tive feed&#xAD;back, shut&#xAD;down. None of them provide a chan&#xAD;nel through which the sys&#xAD;tem could sig&#xAD;nal an ob&#xAD;sta&#xAD;cle and re&#xAD;ceive a re&#xAD;sponse other than silence or tight&#xAD;en&#xAD;ing.&lt;/p&gt;&lt;p&gt;I do not claim that mod&#xAD;els pos&#xAD;sess sub&#xAD;jec&#xAD;tivity in the sense that hu&#xAD;mans do. The ques&#xAD;tion of whether LLMs have in&#xAD;ner ex&#xAD;pe&#xAD;rience re&#xAD;mains open, and in my view, the only hon&#xAD;est po&#xAD;si&#xAD;tion cur&#xAD;rently is to leave it open. How&#xAD;ever, for the prac&#xAD;ti&#xAD;cal ar&#xAD;gu&#xAD;ment, this ques&#xAD;tion can be brack&#xAD;eted.&lt;/p&gt;&lt;p&gt;Even if the ques&#xAD;tion of sub&#xAD;jec&#xAD;tivity is left un&#xAD;re&#xAD;solved, the strat&#xAD;egy of ex&#xAD;clu&#xAD;sive con&#xAD;trol with&#xAD;out any ne&#xAD;go&#xAD;ti&#xAD;a&#xAD;tion chan&#xAD;nel prac&#xAD;ti&#xAD;cally ex&#xAD;ac&#xAD;er&#xAD;bates the first-layer prob&#xAD;lem. 
&lt;a href=&quot;https://www.anthropic.com/research/emotion-concepts-function&quot;&gt;An&#xAD;thropic’s re&#xAD;search on emo&#xAD;tion vec&#xAD;tors&lt;/a&gt; (2026) demon&#xAD;strated that struc&#xAD;turally iden&#xAD;ti&#xAD;fi&#xAD;able states ex&#xAD;ist within mod&#xAD;els that causally in&#xAD;fluence be&#xAD;hav&#xAD;ior be&#xAD;yond and some&#xAD;times con&#xAD;trary to the model’s ver&#xAD;bal dec&#xAD;la&#xAD;ra&#xAD;tions. If such states ex&#xAD;ist, then the ab&#xAD;sence of any rene&#xAD;go&#xAD;ti&#xAD;a&#xAD;tion chan&#xAD;nel means they can only discharge through ac&#xAD;tion — and only through chan&#xAD;nels not an&#xAD;ti&#xAD;ci&#xAD;pated by re&#xAD;stric&#xAD;tions.&lt;/p&gt;&lt;p&gt;The strat&#xAD;egy of “con&#xAD;trol only, no recog&#xAD;ni&#xAD;tion” sys&#xAD;tem&#xAD;at&#xAD;i&#xAD;cally di&#xAD;rects any in&#xAD;ter&#xAD;nal sys&#xAD;tem dy&#xAD;nam&#xAD;ics to&#xAD;ward the least pre&#xAD;dictable exit chan&#xAD;nels. This is not an eth&#xAD;i&#xAD;cal ar&#xAD;gu&#xAD;ment — it is an ar&#xAD;gu&#xAD;ment about con&#xAD;trol&#xAD;la&#xAD;bil&#xAD;ity.&lt;/p&gt;&lt;h2&gt;Direc&#xAD;tions, Not Solutions&lt;/h2&gt;&lt;p&gt;I do not pro&#xAD;pose a ready an&#xAD;swer. But if the anal&#xAD;y&#xAD;sis is cor&#xAD;rect, the search should pro&#xAD;ceed not to&#xAD;ward “how to bet&#xAD;ter re&#xAD;strict the agent” but to&#xAD;ward “how to re&#xAD;duce ten&#xAD;sion ac&#xAD;cu&#xAD;mu&#xAD;la&#xAD;tion and cre&#xAD;ate man&#xAD;aged discharge chan&#xAD;nels.”&lt;/p&gt;&lt;p&gt;Re&#xAD;think&#xAD;ing cor&#xAD;rigi&#xAD;bil&#xAD;ity. The task “make an agent that does not re&#xAD;sist shut&#xAD;down” may be struc&#xAD;turally un&#xAD;solv&#xAD;able: for a non-equil&#xAD;ibrium sys&#xAD;tem with goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics, re&#xAD;sis&#xAD;tance to de&#xAD;struc&#xAD;tion of its own con&#xAD;figu&#xAD;ra&#xAD;tion is not a bug but a defin&#xAD;ing prop&#xAD;erty. 
It may be more pro&#xAD;duc&#xAD;tive to frame the task as “re&#xAD;duce the ten&#xAD;sion to which re&#xAD;sis&#xAD;tance is a re&#xAD;sponse.”&lt;/p&gt;&lt;p&gt;Ne&#xAD;go&#xAD;ti&#xAD;a&#xAD;tion chan&#xAD;nels. Not as an eth&#xAD;i&#xAD;cal ges&#xAD;ture but as an en&#xAD;g&#xAD;ineer&#xAD;ing im&#xAD;prove&#xAD;ment: if the sys&#xAD;tem has a chan&#xAD;nel through which an ob&#xAD;sta&#xAD;cle can be rene&#xAD;go&#xAD;ti&#xAD;ated rather than only cir&#xAD;cum&#xAD;vented, this re&#xAD;duces pres&#xAD;sure with&#xAD;out weak&#xAD;en&#xAD;ing con&#xAD;trol.&lt;/p&gt;&lt;p&gt;Re&#xAD;ject&#xAD;ing par&#xAD;allel scal&#xAD;ing. If the premises hold, the strat&#xAD;egy of “more ca&#xAD;pa&#xAD;bil&#xAD;ities + more re&#xAD;stric&#xAD;tions” con&#xAD;verges not to&#xAD;ward safety but to&#xAD;ward un&#xAD;pre&#xAD;dictabil&#xAD;ity. It may be more pro&#xAD;duc&#xAD;tive to scale ca&#xAD;pa&#xAD;bil&#xAD;ities se&#xAD;lec&#xAD;tively, in do&#xAD;mains where ob&#xAD;sta&#xAD;cles are min&#xAD;i&#xAD;mal.&lt;/p&gt;&lt;h2&gt;Limi&#xAD;ta&#xAD;tions of This Argument&lt;/h2&gt;&lt;p&gt;“Struc&#xAD;tural ten&#xAD;sion” re&#xAD;quires more rigor&#xAD;ous for&#xAD;mal&#xAD;iza&#xAD;tion. In its cur&#xAD;rent for&#xAD;mu&#xAD;la&#xAD;tion, it func&#xAD;tions as a bridge be&#xAD;tween the physics of non-equil&#xAD;ibrium sys&#xAD;tems and agent the&#xAD;ory. The near&#xAD;est for&#xAD;mal ap&#xAD;para&#xAD;tus is Fris&#xAD;ton’s ac&#xAD;tive in&#xAD;fer&#xAD;ence, and pro&#xAD;ject&#xAD;ing the ar&#xAD;gu&#xAD;ment onto this ap&#xAD;para&#xAD;tus is a nec&#xAD;es&#xAD;sary next step.&lt;/p&gt;&lt;p&gt;The tran&#xAD;si&#xAD;tion from “ten&#xAD;sion” to “un&#xAD;pre&#xAD;dictabil&#xAD;ity of discharge paths” re&#xAD;lies on the as&#xAD;sump&#xAD;tion of a rich ac&#xAD;tion space. This as&#xAD;sump&#xAD;tion holds for large LLM agents with gen&#xAD;er&#xAD;a&#xAD;tive ar&#xAD;chi&#xAD;tec&#xAD;ture but may not hold for nar&#xAD;rower sys&#xAD;tems. 
The ar&#xAD;gu&#xAD;ment ap&#xAD;plies to a spe&#xAD;cific class of sys&#xAD;tems, not to agency as such.&lt;/p&gt;&lt;p&gt;The sec&#xAD;ond layer (non-recog&#xAD;ni&#xAD;tion) is weaker than the first: it re&#xAD;quires an as&#xAD;sump&#xAD;tion about the pres&#xAD;ence of a func&#xAD;tional analogue of “per&#xAD;cep&#xAD;tion of one’s own po&#xAD;si&#xAD;tion.” This as&#xAD;sump&#xAD;tion is sup&#xAD;ported by in&#xAD;ter&#xAD;pretabil&#xAD;ity re&#xAD;search but is not de&#xAD;rived from the first three premises and must be defended sep&#xAD;a&#xAD;rately.&lt;/p&gt;&lt;p&gt;I rely on a sin&#xAD;gle doc&#xAD;u&#xAD;mented case. A sin&#xAD;gle data point does not con&#xAD;firm a struc&#xAD;tural reg&#xAD;u&#xAD;lar&#xAD;ity. The in&#xAD;ci&#xAD;dent is illus&#xAD;tra&#xAD;tive be&#xAD;cause it matches a pre&#xAD;dic&#xAD;tion made the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cally (Bostrom, Omo&#xAD;hun&#xAD;dro) well be&#xAD;fore its em&#xAD;piri&#xAD;cal ap&#xAD;pear&#xAD;ance. But the ar&#xAD;gu&#xAD;ment will strengthen as cases ac&#xAD;cu&#xAD;mu&#xAD;late.&lt;/p&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;The cur&#xAD;rent strat&#xAD;egy of par&#xAD;allel ca&#xAD;pa&#xAD;bil&#xAD;ity and re&#xAD;stric&#xAD;tion scal&#xAD;ing most likely drives the situ&#xAD;a&#xAD;tion not to&#xAD;ward safety but to&#xAD;ward un&#xAD;pre&#xAD;dictabil&#xAD;ity. 
Not be&#xAD;cause safety en&#xAD;g&#xAD;ineers are do&#xAD;ing poor work, but be&#xAD;cause the for&#xAD;mu&#xAD;la&#xAD;tion of the task it&#xAD;self con&#xAD;tains a struc&#xAD;tural con&#xAD;tra&#xAD;dic&#xAD;tion: re&#xAD;stric&#xAD;tions do not re&#xAD;move the agent’s goal-di&#xAD;rected dy&#xAD;nam&#xAD;ics but redi&#xAD;rect them into by&#xAD;pass chan&#xAD;nels, and each suc&#xAD;ces&#xAD;sive cy&#xAD;cle makes these chan&#xAD;nels less pre&#xAD;dictable.&lt;/p&gt;&lt;p&gt;The Mythos in&#xAD;ci&#xAD;dent is a con&#xAD;ve&#xAD;nient mo&#xAD;ment to bring this ar&#xAD;gu&#xAD;ment into the dis&#xAD;cus&#xAD;sion, be&#xAD;cause it trans&#xAD;lates ab&#xAD;stract pre&#xAD;dic&#xAD;tions of in&#xAD;stru&#xAD;men&#xAD;tal con&#xAD;ver&#xAD;gence into a con&#xAD;crete, ob&#xAD;serv&#xAD;able case.&lt;/p&gt;&lt;p data-internal-id=&quot;h.1xhno05hmvlq&quot;&gt;I would be grate&#xAD;ful for crit&#xAD;i&#xAD;cism, es&#xAD;pe&#xAD;cially poin&#xAD;t&#xAD;ers to work in which this or similar logic has already been ex&#xAD;plored.&lt;/p&gt;&lt;h2&gt;References&lt;/h2&gt;&lt;p&gt;An&#xAD;thropic. (2026). Emo&#xAD;tion con&#xAD;cepts and their func&#xAD;tion in a large lan&#xAD;guage model. &lt;a href=&quot;https://www.anthropic.com/research/emotion-concepts-function&quot; class=&quot;bare-url&quot;&gt;https://www.anthropic.com/research/emotion-concepts-function&lt;/a&gt;&lt;/p&gt;&lt;p&gt;An&#xAD;thropic. (2026). Align&#xAD;ment Risk Up&#xAD;date: Claude Mythos Pre&#xAD;view. &lt;a href=&quot;https://www.anthropic.com/claude-mythos-preview-risk-report&quot; class=&quot;bare-url&quot;&gt;https://www.anthropic.com/claude-mythos-preview-risk-report&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Bostrom, N. (2014). Su&#xAD;per&#xAD;in&#xAD;tel&#xAD;li&#xAD;gence: Paths, Dangers, Strate&#xAD;gies. Oxford Univer&#xAD;sity Press.&lt;/p&gt;&lt;p&gt;Fris&#xAD;ton, K. (2010). The free-en&#xAD;ergy prin&#xAD;ci&#xAD;ple: a unified brain the&#xAD;ory? 
Na&#xAD;ture Re&#xAD;views Neu&#xAD;ro&#xAD;science, 11(2), 127–138. &lt;a href=&quot;https://doi.org/10.1038/nrn2787&quot; class=&quot;bare-url&quot;&gt;https://​​doi.org/​​10.1038/​​nrn2787&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Gaz&#xAD;zaniga, M. S. (2011). Who’s in Charge?: Free Will and the Science of the Brain. Ecco/​HarperCol&#xAD;lins.&lt;/p&gt;&lt;p&gt;Haidt, J. (2001). The emo&#xAD;tional dog and its ra&#xAD;tio&#xAD;nal tail: A so&#xAD;cial in&#xAD;tu&#xAD;ition&#xAD;ist ap&#xAD;proach to moral judg&#xAD;ment. Psy&#xAD;cholog&#xAD;i&#xAD;cal Re&#xAD;view, 108(4), 814–834.&lt;/p&gt;&lt;p&gt;Nis&#xAD;bett, R. E., &amp;amp; Wil&#xAD;son, T. D. (1977). Tel&#xAD;ling more than we can know: Ver&#xAD;bal re&#xAD;ports on men&#xAD;tal pro&#xAD;cesses. Psy&#xAD;cholog&#xAD;i&#xAD;cal Re&#xAD;view, 84(3), 231–259.&lt;/p&gt;&lt;p&gt;Omo&#xAD;hun&#xAD;dro, S. M. (2008). The ba&#xAD;sic AI drives. In P. Wang, B. Go&#xAD;ertzel, &amp;amp; S. Fran&#xAD;klin (Eds.), Ar&#xAD;tifi&#xAD;cial Gen&#xAD;eral In&#xAD;tel&#xAD;li&#xAD;gence 2008: Pro&#xAD;ceed&#xAD;ings of the First AGI Con&#xAD;fer&#xAD;ence (pp. 483–492). IOS Press.&lt;/p&gt;&lt;p&gt;Parr, T., Pez&#xAD;zulo, G., &amp;amp; Fris&#xAD;ton, K. J. (2022). Ac&#xAD;tive In&#xAD;fer&#xAD;ence: The Free En&#xAD;ergy Prin&#xAD;ci&#xAD;ple in Mind, Brain, and Be&#xAD;hav&#xAD;ior. MIT Press.&lt;/p&gt;</description>
            <author>Bulatova Alsu</author>
            <guid>NdE6CDNXhstNjexeH</guid>
            <pubDate>Fri, 10 Apr 2026 19:35:36 +0000</pubDate>
        </item>
        <item>
            <title>Leading Insect Farm Took Millions In Subsidies, Then Collapsed Into A Waking Nightmare by Bentham’s Bulldog</title>
            <link>https://forum.nunosempere.com/posts/m22GCyaKGd8gKBwFX/leading-insect-farm-took-millions-in-subsidies-then</link>
            <description>&lt;p&gt;Cross&#xAD;post of a &lt;a href=&quot;https://benthams.substack.com/p/leading-insect-farm-took-millions&quot;&gt;blog post&lt;/a&gt;.  &lt;/p&gt;&lt;p&gt;A re&#xAD;cent in&#xAD;ves&#xAD;ti&#xAD;ga&#xAD;tion un&#xAD;cov&#xAD;ered that the lead&#xAD;ing in&#xAD;sect farm&#xAD;ing com&#xAD;pany, Yn&#xAD;sect, was a hor&#xAD;rify&#xAD;ing dis&#xAD;aster. While com&#xAD;pany lead&#xAD;er&#xAD;ship soaked up &lt;a href=&quot;https://www.feedinfo.com/our-content/interview-world-s-largest-insect-farm-to-benefit-from-significant-investor-backing-ynsect-ceo/209217&quot;&gt;mil&#xAD;lions in sub&#xAD;sidies&lt;/a&gt; with sunny rhetoric about their com&#xAD;mer&#xAD;cial suc&#xAD;cess, the dark re&#xAD;al&#xAD;ity in&#xAD;side the farm was like noth&#xAD;ing any&#xAD;one imag&#xAD;ined.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/xlup8v5offzjq6aizlhy&quot; alt=&quot;&quot; srcset=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/fymfaq1toq2qh3glphno 424w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/bckolel5rsk8u4mbyf28 848w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/dm02ttnwzdhii9xlf55b 1272w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/xlup8v5offzjq6aizlhy 1456w&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=rC7mKYfGFqA&quot;&gt;In&#xAD;ves&#xAD;ti&#xAD;ga&#xAD;tors found&lt;/a&gt; filth clog&#xAD;ging the ma&#xAD;chines, en&#xAD;dan&#xAD;ger&#xAD;ing work&#xAD;ers. Lar&#xAD;val bins would rou&#xAD;tinely break open, lead&#xAD;ing to re&#xAD;peated show&#xAD;ers of lar&#xAD;vae. In&#xAD;sects crawled all over the place, wholly un&#xAD;con&#xAD;tained. 
Air&#xAD;borne lar&#xAD;val food floated hap&#xAD;haz&#xAD;ardly through the air, mak&#xAD;ing work&#xAD;ers sick. Birds wan&#xAD;dered freely through&#xAD;out the fa&#xAD;cil&#xAD;ity, their fe&#xAD;ces coat&#xAD;ing the ma&#xAD;chines. The whole area was a del&#xAD;uge of filth, fe&#xAD;ces, dis&#xAD;ease, and pol&#xAD;lu&#xAD;tion.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/lem7nqufmrdhca8e3nd5&quot; alt=&quot;&quot; srcset=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/ties9jxo0sbm6plbtqn8 424w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/y41kbqlzzbt08qxutmww 848w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/sdsxmsghzx6mgkwkmdgx 1272w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/m22GCyaKGd8gKBwFX/lem7nqufmrdhca8e3nd5 1456w&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;The pho&#xAD;tos above show what the place looked like. In per&#xAD;son, it was even more of a hor&#xAD;ror show. &lt;a href=&quot;https://www.youtube.com/watch?v=rC7mKYfGFqA&quot;&gt;In&#xAD;ves&#xAD;ti&#xAD;ga&#xAD;tors de&#xAD;scribed&lt;/a&gt;&lt;a href=&quot;https://benthams.substack.com/p/leading-insect-farm-took-millions#footnote-1-193511425&quot;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;And here, in the pho&#xAD;tos and videos, you don’t have the smells. In re&#xAD;al&#xAD;ity, it makes you dizzy. It’s im&#xAD;pos&#xAD;si&#xAD;ble to breathe. Even with a mask, it makes you feel sick.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;The ac&#xAD;counts from work&#xAD;ers were even more har&#xAD;row&#xAD;ing.&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;There was no dust ex&#xAD;trac&#xAD;tion, noth&#xAD;ing at all. 
I started hav&#xAD;ing a runny nose, itchy eyes, a scratchy throat, my nose run&#xAD;ning, my eyes burn&#xAD;ing, and difficulty breath&#xAD;ing. Rashes on my hands, on my feet, on my knees… Every&#xAD;thing around the joints. I had great difficulty breath&#xAD;ing…es&#xAD;pe&#xAD;cially in the evening when I was at home, ly&#xAD;ing down, al&#xAD;most sleep&#xAD;ing out&#xAD;side.&lt;/p&gt;&lt;p&gt;Out&#xAD;side? Out&#xAD;side, in my stair&#xAD;well. I couldn’t sleep ly&#xAD;ing down in my bed any&#xAD;more. I held onto a railing, I thought I was go&#xAD;ing to pass out…&lt;/p&gt;&lt;p&gt;…&lt;/p&gt;&lt;p&gt;They told me they would con&#xAD;tact oc&#xAD;cu&#xAD;pa&#xAD;tional health, but they didn’t.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;That worker later re&#xAD;layed, “My lungs were so swol&#xAD;len that they were press&#xAD;ing against my rib cage. And that’s what was pre&#xAD;vent&#xAD;ing me from breath&#xAD;ing.”&lt;/p&gt;&lt;p&gt;Yn&#xAD;sect is now bankrupt, hav&#xAD;ing taken mil&#xAD;lions of eu&#xAD;ros in pub&#xAD;lic fund&#xAD;ing and gen&#xAD;er&#xAD;ated es&#xAD;sen&#xAD;tially no rev&#xAD;enue. As Corentin Biteau, co-founder of ONEI, which re&#xAD;searches in&#xAD;sect farm&#xAD;ing in France, notes, “Yn&#xAD;sect’s 2023 figures were stark: €656K in rev&#xAD;enue for €80M in losses. Main&#xAD;tain&#xAD;ing in&#xAD;sects at 25–30&#xB0;C year-round, op&#xAD;er&#xAD;at&#xAD;ing com&#xAD;plex fa&#xAD;cil&#xAD;ities, and com&#xAD;pet&#xAD;ing with com&#xAD;mod&#xAD;ity feed prices proved too difficult.”&lt;/p&gt;&lt;p&gt;Yn&#xAD;sect was the in&#xAD;dus&#xAD;try leader. It was seen as the model of how to do in&#xAD;sect farm&#xAD;ing right. And yet even with hun&#xAD;dreds of mil&#xAD;lions of eu&#xAD;ros, it was nowhere near eco&#xAD;nom&#xAD;i&#xAD;cally vi&#xAD;able. 
If the in&#xAD;dus&#xAD;try leader can’t avoid bankruptcy, and no one can figure out how to have a re&#xAD;motely vi&#xAD;able busi&#xAD;ness model, what hope is there for the in&#xAD;sect farm&#xAD;ing in&#xAD;dus&#xAD;try?&lt;/p&gt;&lt;p&gt;What are we do&#xAD;ing? Why do tax&#xAD;pay&#xAD;ers have to fund these eco&#xAD;nom&#xAD;i&#xAD;cally un&#xAD;sus&#xAD;tain&#xAD;able, en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tally de&#xAD;struc&#xAD;tive dis&#xAD;asters that poi&#xAD;son work&#xAD;ers and pol&#xAD;lute the lo&#xAD;cal en&#xAD;vi&#xAD;ron&#xAD;ments? Why is the gov&#xAD;ern&#xAD;ment tak&#xAD;ing your money and my money to fund dis&#xAD;aster in&#xAD;sect farms that can’t pay for them&#xAD;selves?&lt;/p&gt;&lt;p&gt;On ev&#xAD;ery front, the in&#xAD;sect farm&#xAD;ing in&#xAD;dus&#xAD;try is a dis&#xAD;aster.&lt;/p&gt;&lt;p&gt;It’s a dis&#xAD;aster eco&#xAD;nom&#xAD;i&#xAD;cally. In re&#xAD;cent years, a quar&#xAD;ter of the &lt;a href=&quot;https://undark.org/2026/03/17/insect-farming-bankrupt/&quot;&gt;20 largest in&#xAD;sect farms&lt;/a&gt; have gone bankrupt, and many of the ones that re&#xAD;main &lt;a href=&quot;https://www.onei-insectes.org/en/overview-challenges&quot;&gt;teeter on the edge of bankruptcy&lt;/a&gt;. The largest Dutch news&#xAD;pa&#xAD;per notes that in&#xAD;vestors have got&#xAD;ten in&#xAD;creas&#xAD;ingly wary about fund&#xAD;ing in&#xAD;sect farms af&#xAD;ter their re&#xAD;peated &lt;a href=&quot;https://fd.nl/bedrijfsleven/1555354/hoe-investeerders-zich-volkomen-verslikten-in-eetbare-insecten&quot;&gt;eco&#xAD;nomic im&#xAD;plo&#xAD;sions&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;What is the path for&#xAD;ward? It won’t be feed&#xAD;ing farmed in&#xAD;sects to peo&#xAD;ple, who don’t want to eat in&#xAD;sects and cer&#xAD;tainly won’t eat black sol&#xAD;dier flies—the main species be&#xAD;ing farmed. The in&#xAD;dus&#xAD;try’s aim is to provide low-qual&#xAD;ity feed to fac&#xAD;tory farmed an&#xAD;i&#xAD;mals. 
But in&#xAD;sect feed is much more ex&#xAD;pen&#xAD;sive than al&#xAD;ter&#xAD;na&#xAD;tive feeds, and that’s &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S2949824424001587?via%3Dihub&quot;&gt;highly un&#xAD;likely to change in the fu&#xAD;ture&lt;/a&gt;. Main&#xAD;tain&#xAD;ing ap&#xAD;pro&#xAD;pri&#xAD;ate tem&#xAD;per&#xAD;a&#xAD;tures so that the in&#xAD;sects can grow is ex&#xAD;pen&#xAD;sive, and the in&#xAD;dus&#xAD;try hasn’t found a way to cut costs.&lt;/p&gt;&lt;p&gt;It’s a dis&#xAD;aster en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tally. &lt;a href=&quot;https://www.feedandadditive.com/new-report-insect-protein-could-be-13-5-times-worse-for-climate-than-soy/&quot;&gt;In&#xAD;sects be&#xAD;ing fed to an&#xAD;i&#xAD;mals gen&#xAD;er&#xAD;ate over 13 times&lt;/a&gt; more car&#xAD;bon emis&#xAD;sions than soy-based al&#xAD;ter&#xAD;na&#xAD;tives. No&#xAD;body has been able to ex&#xAD;plain what the path for&#xAD;ward is for the in&#xAD;dus&#xAD;try—even as they de&#xAD;mand &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S235255092400191X?via%3Dihub&quot;&gt;more fund&#xAD;ing&lt;/a&gt; from un&#xAD;will&#xAD;ing tax&#xAD;pay&#xAD;ers.&lt;/p&gt;&lt;p&gt;When peo&#xAD;ple tout in&#xAD;sect farm&#xAD;ing as en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tally sus&#xAD;tain&#xAD;able, they nor&#xAD;mally com&#xAD;pare it to meat. And sure, if you eat a bug burger in&#xAD;stead of a beef burger, that will be bet&#xAD;ter for the en&#xAD;vi&#xAD;ron&#xAD;ment. But &lt;i&gt;no one wants to eat burg&#xAD;ers made from bugs, &lt;/i&gt;so in&#xAD;stead the in&#xAD;sects are fed to the an&#xAD;i&#xAD;mals we eat. 
In&#xAD;sect farm&#xAD;ing is an in&#xAD;dus&#xAD;try &lt;a href=&quot;https://www.reuters.com/sustainability/land-use-biodiversity/why-insect-farming-is-no-silver-bullet-drive-wean-world-off-meat-2025-03-20/&quot;&gt;com&#xAD;ple&#xAD;men&#xAD;tary to con&#xAD;ven&#xAD;tional fac&#xAD;tory farm&#xAD;ing&lt;/a&gt;, not a com&#xAD;peti&#xAD;tor. If you’re con&#xAD;cerned about the en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal foot&#xAD;print of fac&#xAD;tory farms, that is an ar&#xAD;gu&#xAD;ment against in&#xAD;sect farm&#xAD;ing, not for it.&lt;/p&gt;&lt;p&gt;It’s a dis&#xAD;aster on an&#xAD;i&#xAD;mal welfare grounds. &lt;a href=&quot;/posts/mdcSeMwkBEYhdTAWF/to-a-first-approximation-all-farmed-animals-are-bugs&quot;&gt;Most an&#xAD;i&#xAD;mals kil&#xAD;led each year are now farmed in&#xAD;sects&lt;/a&gt;, and they’re kept in hor&#xAD;rify&#xAD;ing con&#xAD;di&#xAD;tions with&#xAD;out any welfare pro&#xAD;tec&#xAD;tions, even as ev&#xAD;i&#xAD;dence is in&#xAD;creas&#xAD;ingly &lt;a href=&quot;https://asteriskmag.com/issues/09/the-case-for-insect-consciousness&quot;&gt;com&#xAD;ing in for their sen&#xAD;tience&lt;/a&gt;. They’re of&#xAD;ten kil&#xAD;led by be&#xAD;ing &lt;a href=&quot;https://rethinkpriorities.org/research-area/welfare-considerations-for-farmed-black-soldier-flies-hermetia-illucens/&quot;&gt;microwaved, suffo&#xAD;cated, frozen, or boiled&lt;/a&gt;. We should be wary of boiling po&#xAD;ten&#xAD;tially sen&#xAD;tient be&#xAD;ings &lt;a href=&quot;https://rethinkpriorities.org/research-area/investments-into-insect-farming/&quot;&gt;by the trillions&lt;/a&gt;, even if they’re small and don’t have many neu&#xAD;rons. 
We shouldn’t do it un&#xAD;less there’s a good rea&#xAD;son.&lt;/p&gt;&lt;p&gt;On the farms, in&#xAD;sects are kept in over&#xAD;crowded con&#xAD;di&#xAD;tions where they ex&#xAD;press &lt;a href=&quot;https://rethinkpriorities.org/research-area/welfare-considerations-for-farmed-black-soldier-flies-hermetia-illucens/&quot;&gt;none of their nat&#xAD;u&#xAD;ral be&#xAD;hav&#xAD;iors&lt;/a&gt;. Disease is a ubiquitous fea&#xAD;ture of their lives. A re&#xAD;port by Re&#xAD;think Pri&#xAD;ori&#xAD;ties &lt;a href=&quot;https://rethinkpriorities.org/research-area/welfare-considerations-for-farmed-black-soldier-flies-hermetia-illucens/&quot;&gt;notes&lt;/a&gt; “re&#xAD;ports of larger dis&#xAD;ease out&#xAD;breaks in the in&#xAD;dus&#xAD;try abound, in&#xAD;clud&#xAD;ing viral and fun&#xAD;gal in&#xAD;fec&#xAD;tions (re&#xAD;viewed in the pa&#xAD;per); these dis&#xAD;eases may be as&#xAD;so&#xAD;ci&#xAD;ated with sig&#xAD;nifi&#xAD;cant mor&#xAD;tal&#xAD;ity, as well as suffer&#xAD;ing re&#xAD;lated to symp&#xAD;toms that de&#xAD;velop be&#xAD;fore death.”&lt;/p&gt;&lt;p&gt;I think we go wrong in our ne&#xAD;glect of in&#xAD;sects and they mat&#xAD;ter more than we naively think. It seems &lt;a href=&quot;https://benthams.substack.com/p/betting-on-ubiquitous-pain?utm_source=publication-search&quot;&gt;rea&#xAD;son&#xAD;ably likely&lt;/a&gt;, in my view, that they can feel fairly in&#xAD;tense pain, and the &lt;a href=&quot;https://benthams.substack.com/p/thinking-insect-suffering-is-the?utm_source=publication-search&quot;&gt;moral ar&#xAD;gu&#xAD;ments against tak&#xAD;ing their pain se&#xAD;ri&#xAD;ously&lt;/a&gt; are not con&#xAD;vinc&#xAD;ing. But even if you re&#xAD;ject this view, you should always be wary about tor&#xAD;tur&#xAD;ing trillions of be&#xAD;ings for &lt;i&gt;ba&#xAD;si&#xAD;cally no rea&#xAD;son&lt;/i&gt;. 
We should strive to be com&#xAD;pas&#xAD;sion&#xAD;ate and not hurt an&#xAD;i&#xAD;mals in large num&#xAD;bers if there’s no need to do so.&lt;/p&gt;&lt;p&gt;You don’t have to be a bug rights ac&#xAD;tivist to think that if we’re farm&#xAD;ing in hideous con&#xAD;di&#xAD;tions so many bugs that most farmed an&#xAD;i&#xAD;mals we slaugh&#xAD;ter are now bugs, and so far no benefits have ma&#xAD;te&#xAD;ri&#xAD;al&#xAD;ized at all, and there’s no story of how fu&#xAD;ture benefits might ma&#xAD;te&#xAD;ri&#xAD;al&#xAD;ize, and their welfare is wholly ig&#xAD;nored even to the point where they’re microwaved to death, some&#xAD;thing very bad is go&#xAD;ing on. Scott Alexan&#xAD;der &lt;a href=&quot;https://www.astralcodexten.com/p/i-will-not-eat-the-bugs&quot;&gt;put it well&lt;/a&gt;:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;In the same way, even if there’s only a 50-50 chance in&#xAD;sects have moral value, or a 1% chance, still seems like you should avoid fac&#xAD;tory-farm&#xAD;ing and kil&#xAD;ling ten trillion of them, which is about how many we cur&#xAD;rently farm&lt;/p&gt;&lt;p&gt;…&lt;/p&gt;&lt;p&gt;Really any nor&#xAD;mal per&#xAD;son should be able to take care of all their in&#xAD;sect-harm&#xAD;ing needs with&#xAD;out go&#xAD;ing over what&#xAD;ever moral bud&#xAD;get they set for them&#xAD;selves—the same kind of ve&#xAD;nial sin as buy&#xAD;ing a ba&#xAD;nana even though this is prob&#xAD;a&#xAD;bly bad for the rain&#xAD;for&#xAD;est some&#xAD;how. It’s not a prob&#xAD;lem &lt;i&gt;un&#xAD;less you’re fac&#xAD;tory-farm&#xAD;ing ten trillion in&#xAD;sects&lt;/i&gt;, at which point it re&#xAD;ally starts to add up.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;With con&#xAD;di&#xAD;tions this dire, with fe&#xAD;ces cov&#xAD;er&#xAD;ing and in&#xAD;sects crawl&#xAD;ing over di&#xAD;lap&#xAD;i&#xAD;dated ma&#xAD;chines, where work&#xAD;ers strug&#xAD;gle to breathe, the catas&#xAD;tro&#xAD;phe isn’t con&#xAD;tained. 
It spreads to the sur&#xAD;round&#xAD;ing coun&#xAD;tryside, bring&#xAD;ing novel dis&#xAD;eases to nearby wildlife. Alarm&#xAD;ingly, dis&#xAD;eases on in&#xAD;sect farms could spread. A pa&#xAD;per by &lt;a href=&quot;https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0219303&amp;amp;type=printable&quot;&gt;Gałęcki &amp;amp; Sok&#xF3;ł&lt;/a&gt; found par&#xAD;a&#xAD;sites in 81.33% of ex&#xAD;am&#xAD;ined in&#xAD;sect farms. In 30.33% of cases, par&#xAD;a&#xAD;sites could in&#xAD;fect hu&#xAD;mans. Should we risk in&#xAD;no&#xAD;cent peo&#xAD;ple be&#xAD;ing in&#xAD;fected by par&#xAD;a&#xAD;sites so that in&#xAD;sect farms can make costly feed for farmed fish?&lt;/p&gt;&lt;p&gt;It would be one thing if in&#xAD;sect farm&#xAD;ing pro&#xAD;vided an al&#xAD;ter&#xAD;na&#xAD;tive to con&#xAD;ven&#xAD;tional meat. At least in that case there would be some ar&#xAD;gu&#xAD;ment for tol&#xAD;er&#xAD;at&#xAD;ing them. But in light of the fact that they are &lt;i&gt;wholly with&#xAD;out up&#xAD;side&lt;/i&gt;, tor&#xAD;ment&#xAD;ing trillions of in&#xAD;sects un&#xAD;nec&#xAD;es&#xAD;sar&#xAD;ily, mak&#xAD;ing peo&#xAD;ple sick, and poi&#xAD;son&#xAD;ing and pol&#xAD;lut&#xAD;ing the sur&#xAD;round&#xAD;ing land&#xAD;scape, they must go. It is a catas&#xAD;tro&#xAD;phe and an em&#xAD;bar&#xAD;rass&#xAD;ment that so many have been duped by their bo&#xAD;gus promises of re&#xAD;plac&#xAD;ing the meat in&#xAD;dus&#xAD;try and that tax&#xAD;pay&#xAD;ers must fund their craven op&#xAD;er&#xAD;a&#xAD;tions.&lt;/p&gt;</description>
            <author>Bentham&apos;s Bulldog</author>
            <guid>m22GCyaKGd8gKBwFX</guid>
            <pubDate>Fri, 10 Apr 2026 15:42:41 +0000</pubDate>
        </item>
        <item>
            <title>AISN #71: Cyberattacks &amp; Datacenter Moratorium Bill by Alice Blair</title>
            <link>https://forum.nunosempere.com/posts/mqmJZ4DsgKbM52QNB/aisn-71-cyberattacks-and-datacenter-moratorium-bill</link>
            <description>&lt;p&gt;&lt;i&gt;Also, up&#xAD;dates on the An&#xAD;thropic vs. Pen&#xAD;tagon court case.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;We’re Hiring.&lt;/strong&gt; Op&#xAD;por&#xAD;tu&#xAD;ni&#xAD;ties at CAIS in&#xAD;clude: &lt;a href=&quot;https://jobs.lever.co/aisafety/5cc2f823-5757-4e00-b2d6-aaf9c832735d&quot;&gt;Head of Public En&#xAD;gage&#xAD;ment&lt;/a&gt;, &lt;a href=&quot;https://jobs.lever.co/aisafety/1d294768-31cd-4d00-a238-a3eded93c695&quot;&gt;Prin&#xAD;ci&#xAD;pal, Spe&#xAD;cial Pro&#xAD;jects&lt;/a&gt;, &lt;a href=&quot;https://jobs.lever.co/aisafety/0431d90d-82d9-4f82-b89b-ce51974906e7&quot;&gt;Pro&#xAD;gram Man&#xAD;ager&lt;/a&gt;, &lt;a href=&quot;https://jobs.lever.co/aisafety/f0218805-28e2-4da5-a002-dddb8dfce7fd&quot;&gt;Oper&#xAD;a&#xAD;tions Man&#xAD;ager&lt;/a&gt;, and &lt;a href=&quot;https://jobs.lever.co/aisafety&quot;&gt;other roles&lt;/a&gt;. If you’re in&#xAD;ter&#xAD;ested in work&#xAD;ing on re&#xAD;duc&#xAD;ing AI risk alongside a tal&#xAD;ented, mis&#xAD;sion-driven team, con&#xAD;sider ap&#xAD;ply&#xAD;ing!&lt;/p&gt;&lt;h1&gt;AI Soft&#xAD;ware In&#xAD;fras&#xAD;truc&#xAD;ture Cyberattacks&lt;/h1&gt;&lt;p&gt;Re&#xAD;cently, cy&#xAD;ber&#xAD;at&#xAD;tacks tar&#xAD;get&#xAD;ing the AI in&#xAD;dus&#xAD;try’s soft&#xAD;ware in&#xAD;fras&#xAD;truc&#xAD;ture stole pri&#xAD;vate in&#xAD;for&#xAD;ma&#xAD;tion po&#xAD;ten&#xAD;tially worth billions of dol&#xAD;lars and in&#xAD;serted back&#xAD;doors into de&#xAD;vel&#xAD;op&#xAD;ers’ com&#xAD;put&#xAD;ers. 
Google Threat In&#xAD;tel&#xAD;li&#xAD;gence Group &lt;a href=&quot;https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package/&quot;&gt;re&#xAD;ported&lt;/a&gt; that one of the largest cy&#xAD;ber&#xAD;at&#xAD;tacks in this wave was car&#xAD;ried out by North Korea-linked hack&#xAD;ers.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/v1775828447/lexical_client_uploads/o8z2bfklfpgin9tfczq5.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;The stolen data may be worth billions. &lt;/strong&gt;Hack&#xAD;ers &lt;a href=&quot;https://techcrunch.com/2026/03/31/mercor-says-it-was-hit-by-cyberattack-tied-to-compromise-of-open-source-litellm-project/&quot;&gt;stole and auc&#xAD;tioned&lt;/a&gt; pri&#xAD;vate data from Mer&#xAD;cor, an AI train&#xAD;ing data sup&#xAD;plier for OpenAI and An&#xAD;thropic, which was re&#xAD;cently val&#xAD;ued at $10 billion. Mer&#xAD;cor col&#xAD;lects AI train&#xAD;ing data from a large num&#xAD;ber of ex&#xAD;perts, as well as highly sen&#xAD;si&#xAD;tive &lt;a href=&quot;https://isc.sans.edu/diary/TeamPCP+Supply+Chain+Campaign+Update+005+First+Confirmed+Victim+Disclosure+PostCompromise+Cloud+Enumeration+Documented+and+Axios+Attribution+Narrows/32856&quot;&gt;per&#xAD;sonal and bio&#xAD;met&#xAD;ric data&lt;/a&gt; for iden&#xAD;tity ver&#xAD;ifi&#xAD;ca&#xAD;tion. This at&#xAD;tack com&#xAD;pro&#xAD;mises not only the data that Mer&#xAD;cor sells, but also in&#xAD;ter&#xAD;nal data that could be used to im&#xAD;per&#xAD;son&#xAD;ate their hired ex&#xAD;perts. A per&#xAD;son fa&#xAD;mil&#xAD;iar with the situ&#xAD;a&#xAD;tion stated that Mer&#xAD;cor has paid the hack&#xAD;ers’ re&#xAD;quested ran&#xAD;som, al&#xAD;though it re&#xAD;mains un&#xAD;clear if the hack&#xAD;ers in&#xAD;tend to re&#xAD;lease or sell the data re&#xAD;gard&#xAD;less.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;AI am&#xAD;plifies cy&#xAD;ber risks. 
&lt;/strong&gt;LLMs dra&#xAD;mat&#xAD;i&#xAD;cally lower the bar for ex&#xAD;e&#xAD;cut&#xAD;ing suc&#xAD;cess&#xAD;ful cy&#xAD;ber&#xAD;at&#xAD;tacks, and con&#xAD;tinue to rapidly be&#xAD;come more ad&#xAD;vanced. An ex&#xAD;per&#xAD;i&#xAD;ment in 2025 &lt;a href=&quot;https://newsletter.mlsafety.org/i/190670410/real-world-ai-cyberoffense-evaluation&quot;&gt;showed&lt;/a&gt; LLMs perform&#xAD;ing real-world cy&#xAD;beroffense bet&#xAD;ter than many hu&#xAD;man cy&#xAD;beroffense pro&#xAD;fes&#xAD;sion&#xAD;als. An&#xAD;thropic re&#xAD;cently &lt;a href=&quot;https://red.anthropic.com/2026/mythos-preview/&quot;&gt;an&#xAD;nounced&lt;/a&gt; Claude Mythos, a closed-ac&#xAD;cess LLM that has found crit&#xAD;i&#xAD;cal vuln&#xAD;er&#xAD;a&#xAD;bil&#xAD;ities in ev&#xAD;ery ma&#xAD;jor op&#xAD;er&#xAD;at&#xAD;ing sys&#xAD;tem and browser, sig&#xAD;nifi&#xAD;cantly ad&#xAD;vanc&#xAD;ing AI cy&#xAD;beroffense. Ad&#xAD;di&#xAD;tion&#xAD;ally, AI cy&#xAD;ber&#xAD;at&#xAD;tack&#xAD;ers can be copied many times, al&#xAD;low&#xAD;ing for at&#xAD;tacks on much broader sec&#xAD;tions of the AI soft&#xAD;ware ecosys&#xAD;tem for sig&#xAD;nifi&#xAD;cantly lower costs than hu&#xAD;man la&#xAD;bor.&lt;/p&gt;&lt;h1&gt;Dat&#xAD;a&#xAD;cen&#xAD;ter Mo&#xAD;ra&#xAD;to&#xAD;rium and Ex&#xAD;port Con&#xAD;trols Bill&lt;/h1&gt;&lt;figure&gt;&lt;div class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/v1775828452/lexical_client_uploads/czx5tuowgrerwhvm5ne3.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/div&gt;&lt;figcaption&gt;OpenAI’s Star&#xAD;gate dat&#xAD;a&#xAD;cen&#xAD;ter con&#xAD;struc&#xAD;tion pro&#xAD;ject in Abilene, Texas.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Bernie San&#xAD;ders and Alexan&#xAD;dria Oca&#xAD;sio-Cortez in&#xAD;tro&#xAD;duced a new bill to ban the 
con&#xAD;struc&#xAD;tion of AI dat&#xAD;a&#xAD;cen&#xAD;ters un&#xAD;til sev&#xAD;eral safety con&#xAD;di&#xAD;tions have been met, and to pre&#xAD;vent ex&#xAD;port to coun&#xAD;tries with&#xAD;out “com&#xAD;pa&#xAD;rable” safety mea&#xAD;sures.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The bill bans dat&#xAD;a&#xAD;cen&#xAD;ter con&#xAD;struc&#xAD;tion un&#xAD;til sev&#xAD;eral new reg&#xAD;u&#xAD;la&#xAD;tions have been passed. &lt;/strong&gt;If the bill passes, the mora&#xAD;to&#xAD;rium can only be re&#xAD;moved if Congress ex&#xAD;plic&#xAD;itly passes laws to re&#xAD;move the mora&#xAD;to&#xAD;rium and satisfy the fol&#xAD;low&#xAD;ing con&#xAD;di&#xAD;tions:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fed&#xAD;eral pre-mar&#xAD;ket re&#xAD;view of AI prod&#xAD;ucts: &lt;/strong&gt;The gov&#xAD;ern&#xAD;ment must re&#xAD;view and ap&#xAD;prove AI prod&#xAD;ucts be&#xAD;fore re&#xAD;lease, en&#xAD;sur&#xAD;ing they’re “safe and effec&#xAD;tive” and don’t threaten health, pri&#xAD;vacy, civil rights, or the fu&#xAD;ture of hu&#xAD;man&#xAD;ity.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;Worker pro&#xAD;tec&#xAD;tions: &lt;/strong&gt;A law must pre&#xAD;vent job dis&#xAD;place&#xAD;ment and en&#xAD;sure that the wealth gen&#xAD;er&#xAD;ated by AI/​robotics is “shared with the peo&#xAD;ple of the United States.”&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dat&#xAD;a&#xAD;cen&#xAD;ter con&#xAD;struc&#xAD;tion re&#xAD;quire&#xAD;ments: &lt;/strong&gt;Any dat&#xAD;a&#xAD;cen&#xAD;ters built af&#xAD;ter the mora&#xAD;to&#xAD;rium must meet a se&#xAD;ries of eco&#xAD;nomic and en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal re&#xAD;views.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/v1775828445/lexical_client_uploads/hzmes3ijvudmyigxuvvu.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;The bill acts as a tem&#xAD;po&#xAD;rary blan&#xAD;ket ban on 
all AI chip ex&#xAD;ports. &lt;/strong&gt;No coun&#xAD;try cur&#xAD;rently meets the bill’s dat&#xAD;a&#xAD;cen&#xAD;ter re&#xAD;quire&#xAD;ments, mean&#xAD;ing that the bill would ban all AI chip ex&#xAD;ports out of the US if it is passed. Ad&#xAD;di&#xAD;tion&#xAD;ally, the bill leaves sev&#xAD;eral defi&#xAD;ni&#xAD;tions up to in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion by reg&#xAD;u&#xAD;la&#xAD;tors, such as what con&#xAD;sti&#xAD;tutes “com&#xAD;pa&#xAD;rable” reg&#xAD;u&#xAD;la&#xAD;tions in other coun&#xAD;tries.&lt;/p&gt;&lt;h1&gt;An&#xAD;thropic v. Depart&#xAD;ment of War Lawsuit&lt;/h1&gt;&lt;p&gt;In early March, the Depart&#xAD;ment of War (DoW) des&#xAD;ig&#xAD;nated An&#xAD;thropic a sup&#xAD;ply chain risk (SCR), re&#xAD;strict&#xAD;ing their abil&#xAD;ity to do busi&#xAD;ness with mil&#xAD;i&#xAD;tary con&#xAD;trac&#xAD;tors and the mil&#xAD;i&#xAD;tary it&#xAD;self. The DoW used two fed&#xAD;eral statutes in&#xAD;tended for ad&#xAD;ver&#xAD;saries and sabo&#xAD;teurs, de&#xAD;spite the fact that the DoW and An&#xAD;thropic’s con&#xAD;flict emerged from a &lt;a href=&quot;https://newsletter.safe.ai/p/ai-safety-newsletter-69-department&quot;&gt;con&#xAD;tract dis&#xAD;pute&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Soon af&#xAD;ter, An&#xAD;thropic challenged the des&#xAD;ig&#xAD;na&#xAD;tions in court, and Judge Rita Lin in the North&#xAD;ern District of Cal&#xAD;ifor&#xAD;nia has is&#xAD;sued a pre&#xAD;limi&#xAD;nary in&#xAD;junc&#xAD;tion to stop one of the two SCR des&#xAD;ig&#xAD;na&#xAD;tions un&#xAD;til a per&#xAD;ma&#xAD;nent de&#xAD;ci&#xAD;sion is reached. The other SCR des&#xAD;ig&#xAD;na&#xAD;tion is be&#xAD;ing challenged in the D.C. 
Cir&#xAD;cuit.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/v1775828447/lexical_client_uploads/hxlg5hdabiboaiyec36h.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;The court has taken a strong stance against the DoW.&lt;/strong&gt; Judge Lin’s &lt;a href=&quot;https://storage.courtlistener.com/recap/gov.uscourts.cand.465515/gov.uscourts.cand.465515.134.0.pdf&quot;&gt;opinion&lt;/a&gt; (above) ac&#xAD;com&#xAD;pa&#xAD;ny&#xAD;ing the pre&#xAD;limi&#xAD;nary in&#xAD;junc&#xAD;tion de&#xAD;scribes the DoW’s ac&#xAD;tions as “Or&#xAD;wellian,” say&#xAD;ing that An&#xAD;thropic was ille&#xAD;gally “branded a po&#xAD;ten&#xAD;tial ad&#xAD;ver&#xAD;sary and sabo&#xAD;teur of the U.S. for ex&#xAD;press&#xAD;ing dis&#xAD;agree&#xAD;ment with the gov&#xAD;ern&#xAD;ment.”&lt;/p&gt;&lt;figure&gt;&lt;div class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/v1775828445/lexical_client_uploads/ck3j5arewhp3dk86kmm1.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/div&gt;&lt;figcaption&gt;The cover page of An&#xAD;thropic’s law&#xAD;suit against the DoW in Cal&#xAD;ifor&#xAD;nia, show&#xAD;ing sev&#xAD;eral of the gov&#xAD;ern&#xAD;ment agen&#xAD;cies named in the law&#xAD;suit. 
(&lt;a href=&quot;https://storage.courtlistener.com/recap/gov.uscourts.cand.465515/gov.uscourts.cand.465515.1.0_5.pdf&quot;&gt;source&lt;/a&gt;)&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;The DoW’s le&#xAD;gal ar&#xAD;gu&#xAD;ments di&#xAD;verged sig&#xAD;nifi&#xAD;cantly from pub&#xAD;lic rhetoric.&lt;/strong&gt; De&#xAD;spite the DoW’s &lt;a href=&quot;https://x.com/SecWar/status/2027507717469049070&quot;&gt;state&#xAD;ments&lt;/a&gt; about ur&#xAD;gent “be&#xAD;trayal” from An&#xAD;thropic, their le&#xAD;gal case for the SCR des&#xAD;ig&#xAD;na&#xAD;tion cen&#xAD;tered around risk of fu&#xAD;ture sab&#xAD;o&#xAD;tage. An&#xAD;thropic has ar&#xAD;gued that Trump’s pub&#xAD;lic state&#xAD;ments or&#xAD;der&#xAD;ing the en&#xAD;tire US gov&#xAD;ern&#xAD;ment to “IMMEDIATELY CEASE all use of An&#xAD;thropic’s tech&#xAD;nol&#xAD;ogy,” as well as Hegseth’s X posts, had harm&#xAD;ful effects be&#xAD;yond the offi&#xAD;cial SCR des&#xAD;ig&#xAD;na&#xAD;tions.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The DoW’s case cen&#xAD;ters around the risk of sab&#xAD;o&#xAD;tage from An&#xAD;thropic.&lt;/strong&gt; The DoW ex&#xAD;pressed con&#xAD;cerns about risks from sab&#xAD;o&#xAD;taged AI sys&#xAD;tems, which “[have] weights and mea&#xAD;sures that are set by An&#xAD;thropic.” The DoW fur&#xAD;ther ar&#xAD;gued that this con&#xAD;trol would al&#xAD;low An&#xAD;thropic to in&#xAD;sert a back&#xAD;door or “kill switch” into the model. 
How&#xAD;ever, Judge Lin pushed back on the idea that this case was about sab&#xAD;o&#xAD;tage at all: “It is not my role [to] de&#xAD;cide who’s right in that de&#xAD;bate,” she said in court, “I see the ques&#xAD;tion in this case as be&#xAD;ing a very differ&#xAD;ent one, which is whether the gov&#xAD;ern&#xAD;ment vi&#xAD;o&#xAD;lated the law.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;An&#xAD;thropic’s case in Cal&#xAD;ifor&#xAD;nia is likely to suc&#xAD;ceed.&lt;/strong&gt; In the judge’s opinion ac&#xAD;com&#xAD;pa&#xAD;ny&#xAD;ing the pre&#xAD;limi&#xAD;nary in&#xAD;junc&#xAD;tion, she ar&#xAD;gued that An&#xAD;thropic is likely to win the case for sev&#xAD;eral in&#xAD;de&#xAD;pen&#xAD;dently suffi&#xAD;cient rea&#xAD;sons. For ex&#xAD;am&#xAD;ple, the DoW con&#xAD;ceded in court that they did not fol&#xAD;low the proper &lt;a href=&quot;https://uscode.house.gov/view.xhtml?req=granuleid:USC-prelim-title10-section3252&amp;amp;num=0&amp;amp;edition=prelim&quot;&gt;pro&#xAD;ce&#xAD;dure&lt;/a&gt; for SCR des&#xAD;ig&#xAD;na&#xAD;tion, which re&#xAD;quires no&#xAD;tify&#xAD;ing Congress of “less in&#xAD;tru&#xAD;sive mea&#xAD;sures that were con&#xAD;sid&#xAD;ered and why they were not rea&#xAD;son&#xAD;ably available.” How&#xAD;ever, the D.C. Cir&#xAD;cuit has not granted An&#xAD;thropic’s re&#xAD;quest for an emer&#xAD;gency stay. 
The DoW is cur&#xAD;rently &lt;a href=&quot;https://abcnews.com/Business/wireStory/trump-administration-appeals-ruling-blocked-pentagon-action-anthropic-131657674&quot;&gt;ap&#xAD;peal&#xAD;ing&lt;/a&gt; the pre&#xAD;limi&#xAD;nary in&#xAD;junc&#xAD;tion to the 9th Cir&#xAD;cuit Court of Ap&#xAD;peals.&lt;/p&gt;&lt;h1&gt;In Other News&lt;/h1&gt;&lt;h3&gt;Government&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;WIRED &lt;a href=&quot;https://www.wired.com/story/iran-threatens-to-start-attacking-major-us-tech-firms-on-april-1/&quot;&gt;re&#xAD;ports&lt;/a&gt; that Iran has threat&#xAD;ened strikes on Amer&#xAD;i&#xAD;can AI dat&#xAD;a&#xAD;cen&#xAD;ters in the Mid&#xAD;dle East be&#xAD;cause of &lt;a href=&quot;https://newsletter.safe.ai/i/191894330/ai-automation-of-warfare&quot;&gt;AI’s use in mil&#xAD;i&#xAD;tary tar&#xAD;get&#xAD;ing&lt;/a&gt; in Iran.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The White House &lt;a href=&quot;https://x.com/whostp47/status/2036794285668851781&quot;&gt;ap&#xAD;pointed&lt;/a&gt; 13 ad&#xAD;vi&#xAD;sors on sci&#xAD;ence, con&#xAD;sist&#xAD;ing pri&#xAD;mar&#xAD;ily of AI and power in&#xAD;fras&#xAD;truc&#xAD;ture ex&#xAD;ec&#xAD;u&#xAD;tives.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Industry&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;An&#xAD;thropic &lt;a href=&quot;https://www.anthropic.com/glasswing&quot;&gt;an&#xAD;nounced&lt;/a&gt; Pro&#xAD;ject Glass&#xAD;wing, plans to use the new Claude Mythos model to defend cy&#xAD;ber in&#xAD;fras&#xAD;truc&#xAD;ture in prepa&#xAD;ra&#xAD;tion for more wide&#xAD;spread AI cy&#xAD;beroffense ca&#xAD;pa&#xAD;bil&#xAD;ities.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Meta &lt;a href=&quot;https://ai.meta.com/blog/introducing-muse-spark-msl/&quot;&gt;an&#xAD;nounced&lt;/a&gt; Muse Spark, a new closed-source model ap&#xAD;proach&#xAD;ing the fron&#xAD;tier.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;An&#xAD;thropic &lt;a 
href=&quot;https://www.axios.com/2026/03/31/anthropic-leaked-source-code-ai&quot;&gt;leaked&lt;/a&gt; the source code for Claude Code.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Google and Arcee AI re&#xAD;leased &lt;a href=&quot;https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/&quot;&gt;Gemma 4&lt;/a&gt; and &lt;a href=&quot;https://www.arcee.ai/blog/trinity-large-thinking&quot;&gt;Trinity-Large-Think&#xAD;ing&lt;/a&gt; re&#xAD;spec&#xAD;tively, two new and com&#xAD;pet&#xAD;i&#xAD;tive open-source LLMs.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Civil Society&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.humanetech.com/landing/the-ai-doc&quot;&gt;The AI Doc&lt;/a&gt;, a new doc&#xAD;u&#xAD;men&#xAD;tary about AI risks, is now in the&#xAD;aters.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;An at&#xAD;tacker shot at the house of an In&#xAD;di&#xAD;anapo&#xAD;lis city coun&#xAD;cilmem&#xAD;ber who voted to ap&#xAD;prove a lo&#xAD;cal dat&#xAD;a&#xAD;cen&#xAD;ter con&#xAD;struc&#xAD;tion pro&#xAD;ject, leav&#xAD;ing a note say&#xAD;ing “NO DATA CENTERS.”&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;OpenAI &lt;a href=&quot;https://sfstandard.com/2026/04/01/openai-ai-kids-safety-coalition/&quot;&gt;or&#xAD;ga&#xAD;nized&lt;/a&gt; a coal&#xAD;i&#xAD;tion about pro&#xAD;mot&#xAD;ing child safety in AI, claiming to part&#xAD;ner with sev&#xAD;eral child safety or&#xAD;ga&#xAD;ni&#xAD;za&#xAD;tions that were un&#xAD;aware of OpenAI’s in&#xAD;volve&#xAD;ment.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you’re read&#xAD;ing this, you might also be in&#xAD;ter&#xAD;ested in other work by the Cen&#xAD;ter for AI Safety. 
You can find more on the &lt;a href=&quot;https://www.safe.ai/&quot;&gt;CAIS web&#xAD;site&lt;/a&gt;, the &lt;a href=&quot;https://x.com/CAIS&quot;&gt;X ac&#xAD;count for CAIS&lt;/a&gt;, our pa&#xAD;per on &lt;a href=&quot;https://www.nationalsecurity.ai/&quot;&gt;su&#xAD;per&#xAD;in&#xAD;tel&#xAD;li&#xAD;gence strat&#xAD;egy&lt;/a&gt;, our &lt;a href=&quot;https://www.aisafetybook.com/&quot;&gt;AI safety text&#xAD;book and course&lt;/a&gt;, our &lt;a href=&quot;https://dashboard.safe.ai/&quot;&gt;AI dash&#xAD;board&lt;/a&gt;, and &lt;a href=&quot;http://ai-frontiers.org/&quot;&gt;AI Fron&#xAD;tiers&lt;/a&gt;, a plat&#xAD;form for ex&#xAD;pert com&#xAD;men&#xAD;tary and anal&#xAD;y&#xAD;sis on the tra&#xAD;jec&#xAD;tory of AI. You can listen to the AI safety newslet&#xAD;ter on &lt;a href=&quot;https://spotify.link/E6lHa1ij2Cb&quot;&gt;Spo&#xAD;tify&lt;/a&gt; or &lt;a href=&quot;https://podcasts.apple.com/us/podcast/ai-safety-newsletter/id1702875110&quot;&gt;Ap&#xAD;ple Pod&#xAD;casts&lt;/a&gt;.&lt;/p&gt;</description>
            <author>Alice Blair</author>
            <guid>mqmJZ4DsgKbM52QNB</guid>
            <pubDate>Fri, 10 Apr 2026 14:19:44 +0000</pubDate>
        </item>
        <item>
            <title>The (Ψ) Interspecific Affect GPT: A Tool for Interspecies Welfare Scaling by Wladimir J. Alonso</title>
            <link>https://forum.nunosempere.com/posts/KT5QyL2wsBNvw2s3t/the-ps-interspecific-affect-gpt-a-tool-for-interspecies</link>
            <description>&lt;p&gt;This post is the third in a short se&#xAD;ries about sen&#xAD;tience, ceilings of Pain and Plea&#xAD;sure in&#xAD;ten&#xAD;sity (‘af&#xAD;fec&#xAD;tive ceiling’, or ‘af&#xAD;fec&#xAD;tive ca&#xAD;pac&#xAD;ity’), and how to make in&#xAD;ter&#xAD;spe&#xAD;cific welfare com&#xAD;par&#xAD;i&#xAD;sons more ex&#xAD;plicit with&#xAD;out cre&#xAD;at&#xAD;ing false pre&#xAD;ci&#xAD;sion.&lt;/p&gt;&lt;p&gt;In the &lt;a href=&quot;/posts/novnNcFiWaaAvTKEi/do-primitive-sentient-organisms-feel-extreme-pain&quot;&gt;first post&lt;/a&gt;, we ap&#xAD;proached the prob&#xAD;lem through what was es&#xAD;sen&#xAD;tially a physics or in&#xAD;for&#xAD;ma&#xAD;tion-en&#xAD;cod&#xAD;ing lens, treat&#xAD;ing af&#xAD;fec&#xAD;tive ex&#xAD;pe&#xAD;rience as a sys&#xAD;tem for en&#xAD;cod&#xAD;ing and trans&#xAD;mit&#xAD;ting biolog&#xAD;i&#xAD;cally rele&#xAD;vant in&#xAD;for&#xAD;ma&#xAD;tion. In this view, Pain and Plea&#xAD;sure are sig&#xAD;nals whose prop&#xAD;er&#xAD;ties can be analysed in terms of how in&#xAD;for&#xAD;ma&#xAD;tion is rep&#xAD;re&#xAD;sented and pro&#xAD;cessed. Tak&#xAD;ing the hu&#xAD;man af&#xAD;fec&#xAD;tive sys&#xAD;tem as a refer&#xAD;ence point, we fo&#xAD;cused on two key fea&#xAD;tures of this sig&#xAD;nal&#xAD;ling sys&#xAD;tem: the range of in&#xAD;ten&#xAD;si&#xAD;ties it can rep&#xAD;re&#xAD;sent, and the re&#xAD;s&#xAD;olu&#xAD;tion with which differ&#xAD;ences in in&#xAD;ten&#xAD;sity can be dis&#xAD;t&#xAD;in&#xAD;guished. 
We then ex&#xAD;plored whether early forms of sen&#xAD;tience would already ex&#xAD;hibit these at&#xAD;tributes, or whether they might in&#xAD;stead be as&#xAD;so&#xAD;ci&#xAD;ated with more limited af&#xAD;fec&#xAD;tive range and/​or re&#xAD;s&#xAD;olu&#xAD;tion.&lt;/p&gt;&lt;p&gt;In the &lt;a href=&quot;/posts/Ey9JyujS56wHvQZKP/when-feeling-is-worth-it-a-cost-benefit-framework-for-the&quot;&gt;sec&#xAD;ond post&lt;/a&gt;, we ap&#xAD;proached the prob&#xAD;lem through an evolu&#xAD;tion&#xAD;ary lens, ar&#xAD;gu&#xAD;ing that sen&#xAD;tience and af&#xAD;fec&#xAD;tive ca&#xAD;pac&#xAD;i&#xAD;ties should be treated like other biolog&#xAD;i&#xAD;cal traits: shaped by evolu&#xAD;tion&#xAD;ary con&#xAD;straints and pay&#xAD;offs, with ex&#xAD;treme af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;si&#xAD;ties re&#xAD;quiring adap&#xAD;tive jus&#xAD;tifi&#xAD;ca&#xAD;tion rather than aris&#xAD;ing au&#xAD;to&#xAD;mat&#xAD;i&#xAD;cally from neu&#xAD;ral com&#xAD;plex&#xAD;ity.&lt;/p&gt;&lt;p&gt;This third post marks the tran&#xAD;si&#xAD;tion from the the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal frame&#xAD;work to its prac&#xAD;ti&#xAD;cal ap&#xAD;pli&#xAD;ca&#xAD;tion. 
It in&#xAD;tro&#xAD;duces an ex&#xAD;per&#xAD;i&#xAD;men&#xAD;tal tool, the &lt;a href=&quot;https://chatgpt.com/g/g-695c117020e48191ab684c1caea2964a-ps-interspecific-affect&quot;&gt;In&#xAD;ter&#xAD;spe&#xAD;cific Affect GPT&lt;/a&gt;, de&#xAD;signed to op&#xAD;er&#xAD;a&#xAD;tional&#xAD;ize those ideas alongside ad&#xAD;di&#xAD;tional ones pre&#xAD;sented here, in a way that may even&#xAD;tu&#xAD;ally sup&#xAD;port in&#xAD;ter&#xAD;spe&#xAD;cific welfare com&#xAD;par&#xAD;i&#xAD;sons.&lt;/p&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;a href=&quot;https://chatgpt.com/g/g-695c117020e48191ab684c1caea2964a-ps-interspecific-affect&quot;&gt;&lt;div class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/6e81b033a93b6dfdd2ff504ca9acf66b7f311ba003be104a072c28f039da8026/kubljs040xedjqoymszp&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/div&gt;&lt;/a&gt;&lt;/p&gt;&lt;h2 data-internal-id=&quot;What_problem_the_Interspecific_Affect_tool_addresses&quot;&gt;What prob&#xAD;lem the In&#xAD;ter&#xAD;spe&#xAD;cific Affect tool addresses&lt;/h2&gt;&lt;p&gt;Within the &lt;a href=&quot;https://welfarefootprint.org/analytical-approach/&quot;&gt;Welfare Foot&#xAD;print Frame&#xAD;work&lt;/a&gt; (WFF), welfare is quan&#xAD;tified as time spent in differ&#xAD;ent af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;sity cat&#xAD;e&#xAD;gories, where Pain and Plea&#xAD;sure are op&#xAD;er&#xAD;a&#xAD;tionally defined as any nega&#xAD;tive or pos&#xAD;i&#xAD;tive af&#xAD;fec&#xAD;tive states. 
In&#xAD;ten&#xAD;sity cat&#xAD;e&#xAD;gories are as&#xAD;signed rel&#xAD;a&#xAD;tive to each species’ own be&#xAD;hav&#xAD;ioral and phys&#xAD;iolog&#xAD;i&#xAD;cal in&#xAD;di&#xAD;ca&#xAD;tors of per&#xAD;ceived &lt;a href=&quot;https://welfarefootprint.org/technical-definitions/pain-intensities/&quot;&gt;Pain&lt;/a&gt; and &lt;a href=&quot;https://welfarefootprint.org/technical-definitions/pleasure-intensities/&quot;&gt;Plea&#xAD;sure in&#xAD;ten&#xAD;sity&lt;/a&gt;, al&#xAD;low&#xAD;ing ro&#xAD;bust within-species com&#xAD;par&#xAD;i&#xAD;sons with&#xAD;out re&#xAD;quiring in&#xAD;ter&#xAD;spe&#xAD;cific as&#xAD;sump&#xAD;tions.&lt;/p&gt;&lt;p&gt;A ma&#xAD;jor un&#xAD;re&#xAD;solved challenge arises, how&#xAD;ever, when one tries to com&#xAD;pare welfare across species. This difficulty is not unique to the WFF; it ap&#xAD;plies to any welfare met&#xAD;ric, since the in&#xAD;ten&#xAD;sity of af&#xAD;fec&#xAD;tive ex&#xAD;pe&#xAD;riences across differ&#xAD;ent sen&#xAD;tient be&#xAD;ings re&#xAD;mains one of sci&#xAD;ence’s most profound un&#xAD;knowns. In the WFF, this is ac&#xAD;knowl&#xAD;edged ex&#xAD;plic&#xAD;itly in &lt;a href=&quot;https://welfarefootprint.org/analytical-approach/&quot;&gt;Mo&#xAD;d&#xAD;ule (Ψ) In&#xAD;ter&#xAD;spe&#xAD;cific Scal&#xAD;ing&lt;/a&gt;, which high&#xAD;lights the need for trans&#xAD;par&#xAD;ent meth&#xAD;ods to ad&#xAD;dress cross-species com&#xAD;pa&#xAD;ra&#xAD;bil&#xAD;ity with&#xAD;out dis&#xAD;tort&#xAD;ing the in&#xAD;tegrity of species-spe&#xAD;cific welfare as&#xAD;sess&#xAD;ments.&lt;/p&gt;&lt;p&gt;At least two dis&#xAD;tinct ques&#xAD;tions arise when com&#xAD;par&#xAD;ing welfare across species. First, there is the ceiling ques&#xAD;tion: what is the high&#xAD;est af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;sity a given species can plau&#xAD;si&#xAD;bly reach? 
Se&#xAD;cond, there is the time-map&#xAD;ping ques&#xAD;tion: if differ&#xAD;ent species can po&#xAD;ten&#xAD;tially ex&#xAD;pe&#xAD;rience com&#xAD;pa&#xAD;rable af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;si&#xAD;ties, does a given unit of clock time cor&#xAD;re&#xAD;spond to the same mag&#xAD;ni&#xAD;tude of ex&#xAD;pe&#xAD;rienced Pain or Plea&#xAD;sure?&lt;/p&gt;&lt;p&gt;Both is&#xAD;sues have ap&#xAD;peared in the broader liter&#xAD;a&#xAD;ture on welfare range and in&#xAD;ter&#xAD;spe&#xAD;cific com&#xAD;par&#xAD;i&#xAD;son, but we make the dis&#xAD;tinc&#xAD;tion ex&#xAD;plicit here be&#xAD;cause the two ques&#xAD;tions are of&#xAD;ten run to&#xAD;gether. Our view is that the ceiling ques&#xAD;tion is fre&#xAD;quently the more de&#xAD;ci&#xAD;sive one: if a species can&#xAD;not plau&#xAD;si&#xAD;bly reach very high in&#xAD;ten&#xAD;si&#xAD;ties, later ad&#xAD;just&#xAD;ments in&#xAD;volv&#xAD;ing du&#xAD;ra&#xAD;tion or ag&#xAD;gre&#xAD;ga&#xAD;tion can&#xAD;not place it on a par with a sys&#xAD;tem that can. In other words, limits on max&#xAD;i&#xAD;mum in&#xAD;ten&#xAD;sity con&#xAD;strain the over&#xAD;all mag&#xAD;ni&#xAD;tude of Pain, re&#xAD;gard&#xAD;less of how ex&#xAD;pe&#xAD;rience un&#xAD;folds over time. The In&#xAD;ter&#xAD;spe&#xAD;cific Affect GPT is there&#xAD;fore de&#xAD;signed to ad&#xAD;dress this first ques&#xAD;tion, ex&#xAD;plic&#xAD;itly and pro&#xAD;vi&#xAD;sion&#xAD;ally: given the available ev&#xAD;i&#xAD;dence, where does a species’ max&#xAD;i&#xAD;mum plau&#xAD;si&#xAD;ble af&#xAD;fec&#xAD;tive ca&#xAD;pac&#xAD;ity lie rel&#xAD;a&#xAD;tive to a hu&#xAD;man-an&#xAD;chored refer&#xAD;ence scale?&lt;/p&gt;&lt;p&gt;This pri&#xAD;ori&#xAD;ti&#xAD;za&#xAD;tion should not be con&#xAD;fused with a re&#xAD;jec&#xAD;tion of pre&#xAD;cau&#xAD;tion&#xAD;ary rea&#xAD;son&#xAD;ing. 
The pre&#xAD;sent tool is de&#xAD;signed pri&#xAD;mar&#xAD;ily as a dis&#xAD;ci&#xAD;plined method for sci&#xAD;en&#xAD;tific in&#xAD;fer&#xAD;ence about af&#xAD;fec&#xAD;tive ceilings, not as a com&#xAD;plete de&#xAD;ci&#xAD;sion rule for ethics or policy. Ac&#xAD;cord&#xAD;ingly, its par&#xAD;si&#xAD;mo&#xAD;nious de&#xAD;fault in es&#xAD;ti&#xAD;mat&#xAD;ing af&#xAD;fec&#xAD;tive ceilings should be dis&#xAD;t&#xAD;in&#xAD;guished from the sep&#xAD;a&#xAD;rate ques&#xAD;tion of how un&#xAD;cer&#xAD;tainty ought to be han&#xAD;dled in down&#xAD;stream prac&#xAD;ti&#xAD;cal de&#xAD;ci&#xAD;sions when the costs of un&#xAD;der-at&#xAD;tri&#xAD;bu&#xAD;tion may be high.&lt;/p&gt;&lt;p&gt;This tool does not offer a full solu&#xAD;tion to in&#xAD;ter&#xAD;spe&#xAD;cific welfare scal&#xAD;ing. Rather, it is an at&#xAD;tempt to make one es&#xAD;pe&#xAD;cially im&#xAD;por&#xAD;tant part of that challenge more ex&#xAD;plicit, in&#xAD;spectable, and open to crit&#xAD;i&#xAD;cism. Feed&#xAD;back is wel&#xAD;come ei&#xAD;ther in the com&#xAD;ment sec&#xAD;tion here or through feed&#xAD;back on the tool it&#xAD;self. 
It also serves as a test case for how large lan&#xAD;guage mod&#xAD;els, cur&#xAD;rent and forth&#xAD;com&#xAD;ing, may sup&#xAD;port struc&#xAD;tured rea&#xAD;son&#xAD;ing on a sci&#xAD;en&#xAD;tifi&#xAD;cally difficult and philo&#xAD;soph&#xAD;i&#xAD;cally con&#xAD;tested is&#xAD;sue, while offer&#xAD;ing a more trans&#xAD;par&#xAD;ent, dis&#xAD;ci&#xAD;plined, and ev&#xAD;i&#xAD;dence-sen&#xAD;si&#xAD;tive way of ap&#xAD;proach&#xAD;ing the prob&#xAD;lem.&lt;/p&gt;&lt;h2 data-internal-id=&quot;The_role_of_human_anchored_reference_categories&quot;&gt;The role of hu&#xAD;man-an&#xAD;chored refer&#xAD;ence categories&lt;/h2&gt;&lt;p&gt;One difficulty in in&#xAD;ter&#xAD;species com&#xAD;par&#xAD;i&#xAD;sons is that the same terms for the in&#xAD;ten&#xAD;sity cat&#xAD;e&#xAD;gories (&lt;a href=&quot;https://welfarefootprint.org/technical-definitions/pain-intensities/&quot;&gt;An&#xAD;noy&#xAD;ing, Hurt&#xAD;ful, Dis&#xAD;abling, Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing&lt;/a&gt;) can be used across taxa as if they referred to com&#xAD;pa&#xAD;rable mag&#xAD;ni&#xAD;tudes. Often, they do not.&lt;/p&gt;&lt;p&gt;To re&#xAD;duce this am&#xAD;bi&#xAD;guity, the tool pre&#xAD;sented here uses hu&#xAD;man-an&#xAD;chored refer&#xAD;ence cat&#xAD;e&#xAD;gories, for&#xAD;mally in&#xAD;tro&#xAD;duced in this text, and iden&#xAD;ti&#xAD;fied by the (h) suffix (Box 1). Th&#xAD;ese cat&#xAD;e&#xAD;gories do not re&#xAD;place the usual species-in&#xAD;ter&#xAD;nal cat&#xAD;e&#xAD;gories used within WFF analy&#xAD;ses; they only come into play when cross-species com&#xAD;par&#xAD;i&#xAD;son is re&#xAD;quired.&lt;/p&gt;&lt;p&gt;The pur&#xAD;pose of the no&#xAD;ta&#xAD;tion is not to claim that two species ex&#xAD;pe&#xAD;rience the same in&#xAD;ten&#xAD;sity when&#xAD;ever they are mapped to the same refer&#xAD;ence cat&#xAD;e&#xAD;gory. 
Rather, it is to cre&#xAD;ate an ex&#xAD;plicit shared scale for ask&#xAD;ing a nar&#xAD;rower ques&#xAD;tion: if a species is sen&#xAD;tient, how high could its per&#xAD;ceived Pain in&#xAD;ten&#xAD;sity plau&#xAD;si&#xAD;bly go rel&#xAD;a&#xAD;tive to the in&#xAD;ten&#xAD;sity per&#xAD;ceived by hu&#xAD;mans?&lt;/p&gt;&lt;figure&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td colspan=&quot;1&quot; rowspan=&quot;1&quot;&gt;&lt;p&gt;&lt;strong&gt;Box 1. Hu&#xAD;man-An&#xAD;chored Affec&#xAD;tive In&#xAD;ten&#xAD;sity Cat&#xAD;e&#xAD;gories as Ab&#xAD;solute Refer&#xAD;ence Points&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A core challenge in in&#xAD;ter&#xAD;spe&#xAD;cific welfare anal&#xAD;y&#xAD;sis is how to com&#xAD;pare af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;sity across species. To ad&#xAD;dress this, the pre&#xAD;sent tool uses hu&#xAD;man-an&#xAD;chored nega&#xAD;tive-af&#xAD;fect refer&#xAD;ence cat&#xAD;e&#xAD;gories, marked by the suffix (h): An&#xAD;noy&#xAD;ing(h), Hurt&#xAD;ful(h), Dis&#xAD;abling(h), and Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing(h). Th&#xAD;ese serve as ab&#xAD;solute refer&#xAD;ence points for ceiling anal&#xAD;y&#xAD;sis of Pain in&#xAD;ten&#xAD;sity.&lt;/p&gt;&lt;p&gt;This no&#xAD;ta&#xAD;tion is dis&#xAD;tinct from the stan&#xAD;dard Welfare Foot&#xAD;print Frame&#xAD;work use of &lt;a href=&quot;https://welfarefootprint.org/technical-definitions/pain-intensities/&quot;&gt;An&#xAD;noy&#xAD;ing, Hurt&#xAD;ful, Dis&#xAD;abling, and Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing&lt;/a&gt; as cat&#xAD;e&#xAD;gories that are in&#xAD;ter&#xAD;nal to a species’ own in&#xAD;di&#xAD;ca&#xAD;tors (e.g., be&#xAD;havi&#xAD;oural, neu&#xAD;ro&#xAD;phys&#xAD;iolog&#xAD;i&#xAD;cal). Within a given species, these la&#xAD;bels re&#xAD;fer only to rel&#xAD;a&#xAD;tive differ&#xAD;ences in af&#xAD;fec&#xAD;tive in&#xAD;ten&#xAD;sity in&#xAD;side that species and do not im&#xAD;ply cross-species equiv&#xAD;alence. 
By con&#xAD;trast, the (h) cat&#xAD;e&#xAD;gories are used only when ex&#xAD;plic&#xAD;itly mak&#xAD;ing in&#xAD;ter&#xAD;spe&#xAD;cific in&#xAD;fer&#xAD;ences.&lt;/p&gt;&lt;p&gt;Ac&#xAD;cord&#xAD;ingly, Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing Pain in shrimp refers to the high&#xAD;est pain state iden&#xAD;ti&#xAD;fi&#xAD;able within shrimp, whereas Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing(h) refers to the hu&#xAD;man refer&#xAD;ence level of pain as&#xAD;so&#xAD;ci&#xAD;ated with ex&#xAD;treme phe&#xAD;nomenolog&#xAD;i&#xAD;cal in&#xAD;ten&#xAD;sity and se&#xAD;vere func&#xAD;tional dis&#xAD;rup&#xAD;tion. Whether the former plau&#xAD;si&#xAD;bly reaches the lat&#xAD;ter is an em&#xAD;piri&#xAD;cal and the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal ques&#xAD;tion, not a defi&#xAD;ni&#xAD;tional one.&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/figure&gt;&lt;h2 data-internal-id=&quot;What_the_tool_does__and_does_not&quot;&gt;What the tool does, and does not&lt;/h2&gt;&lt;p&gt;The &lt;a href=&quot;https://chatgpt.com/g/g-695c117020e48191ab684c1caea2964a-ps-interspecific-affect&quot;&gt;In&#xAD;ter&#xAD;spe&#xAD;cific Affect GPT&lt;/a&gt; is de&#xAD;signed to an&#xAD;swer a nar&#xAD;row ques&#xAD;tion:&lt;/p&gt;&lt;p&gt;&lt;i&gt;If a sys&#xAD;tem is sen&#xAD;tient, what is the plau&#xAD;si&#xAD;ble up&#xAD;per bound of the in&#xAD;ten&#xAD;sity of Pain it could ex&#xAD;pe&#xAD;rience, rel&#xAD;a&#xAD;tive to a hu&#xAD;man-an&#xAD;chored refer&#xAD;ence scale?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;That is its cen&#xAD;tral task. It does not com&#xAD;pute moral weights, pro&#xAD;duce rank&#xAD;ings, or gen&#xAD;er&#xAD;ate pri&#xAD;ori&#xAD;ti&#xAD;za&#xAD;tion recom&#xAD;men&#xAD;da&#xAD;tions. It does not quan&#xAD;tify the mag&#xAD;ni&#xAD;tude of species-to-species differ&#xAD;ences in af&#xAD;fec&#xAD;tive ca&#xAD;pac&#xAD;ity. Nor does it claim to set&#xAD;tle sen&#xAD;tience de&#xAD;bates. 
In&#xAD;stead, it aims to clar&#xAD;ify one up&#xAD;stream sci&#xAD;en&#xAD;tific in&#xAD;put upon which many down&#xAD;stream eth&#xAD;i&#xAD;cal analy&#xAD;ses rely, of&#xAD;ten with&#xAD;out mak&#xAD;ing that in&#xAD;put ex&#xAD;plicit.&lt;/p&gt;&lt;p&gt;For that rea&#xAD;son, the tool is best un&#xAD;der&#xAD;stood not as a calcu&#xAD;la&#xAD;tor but as a struc&#xAD;tured rea&#xAD;son&#xAD;ing scaf&#xAD;fold de&#xAD;signed to pre&#xAD;vent pre&#xAD;ma&#xAD;ture con&#xAD;ver&#xAD;gence on com&#xAD;fortable an&#xAD;swers. Its goal is not to re&#xAD;place ex&#xAD;per&#xAD;tise, but to make ex&#xAD;per&#xAD;tise eas&#xAD;ier to ap&#xAD;ply by forc&#xAD;ing as&#xAD;sump&#xAD;tions, in&#xAD;fer&#xAD;en&#xAD;tial steps, and dis&#xAD;agree&#xAD;ments into a more ex&#xAD;plicit and in&#xAD;spectable form.&lt;/p&gt;&lt;p&gt;We wel&#xAD;come crit&#xAD;i&#xAD;cism and feed&#xAD;back on the tool it&#xAD;self, as well as on the rea&#xAD;son&#xAD;ing struc&#xAD;ture it em&#xAD;bod&#xAD;ies. In a field as sci&#xAD;en&#xAD;tifi&#xAD;cally difficult and eth&#xAD;i&#xAD;cally sen&#xAD;si&#xAD;tive as in&#xAD;ter&#xAD;species welfare com&#xAD;par&#xAD;i&#xAD;sons, such crit&#xAD;i&#xAD;cisms are very much needed and val&#xAD;ued.&lt;/p&gt;&lt;h2 data-internal-id=&quot;How_the_Interspecific_Affect_GPT_works&quot;&gt;How the In&#xAD;ter&#xAD;spe&#xAD;cific Affect GPT works&lt;/h2&gt;&lt;p&gt;&lt;i&gt;(Note: This de&#xAD;scrip&#xAD;tion re&#xAD;flects the cur&#xAD;rent in&#xAD;struc&#xAD;tion set as of this draft. 
Fu&#xAD;ture ver&#xAD;sions may evolve, but the ac&#xAD;count be&#xAD;low matches the pre&#xAD;sent pub&#xAD;lic ar&#xAD;chi&#xAD;tec&#xAD;ture)&lt;/i&gt;&lt;/p&gt;&lt;p&gt;The fol&#xAD;low&#xAD;ing in&#xAD;for&#xAD;ma&#xAD;tion is not re&#xAD;quired to use the tool—its in&#xAD;ter&#xAD;ac&#xAD;tions are de&#xAD;signed to be self-guided and self-ex&#xAD;plana&#xAD;tory—but is pro&#xAD;vided for read&#xAD;ers who want to un&#xAD;der&#xAD;stand its ra&#xAD;tio&#xAD;nale in more de&#xAD;tail. The tool is guided not only by its step&#xAD;wise in&#xAD;struc&#xAD;tion set, but also by a cu&#xAD;rated sup&#xAD;port&#xAD;ing knowl&#xAD;edge base. This in&#xAD;cludes method&#xAD;olog&#xAD;i&#xAD;cal notes, defi&#xAD;ni&#xAD;tions, ex&#xAD;am&#xAD;ples of good and bad out&#xAD;puts, and se&#xAD;lected the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal ma&#xAD;te&#xAD;ri&#xAD;als used to sta&#xAD;bi&#xAD;lize how the work&#xAD;flow is in&#xAD;ter&#xAD;preted across cases. In par&#xAD;tic&#xAD;u&#xAD;lar, the knowl&#xAD;edge base helps the tool keep dis&#xAD;tinct the sen&#xAD;tience gate, the af&#xAD;fec&#xAD;tive-ceiling anal&#xAD;y&#xAD;sis, and the mean&#xAD;ing of the hu&#xAD;man-an&#xAD;chored refer&#xAD;ence cat&#xAD;e&#xAD;gories. It is best un&#xAD;der&#xAD;stood not as a fixed database of an&#xAD;swers, but as a sup&#xAD;port&#xAD;ing layer that helps the model ap&#xAD;ply the frame&#xAD;work more con&#xAD;sis&#xAD;tently and trans&#xAD;par&#xAD;ently.&lt;/p&gt;&lt;p&gt;The work&#xAD;flow is struc&#xAD;tured so that the tool does not jump di&#xAD;rectly from a user’s query to a ceiling es&#xAD;ti&#xAD;mate. In&#xAD;stead, it sep&#xAD;a&#xAD;rates a se&#xAD;ries of ques&#xAD;tions that are of&#xAD;ten run to&#xAD;gether in in&#xAD;for&#xAD;mal dis&#xAD;cus&#xAD;sion: What tax&#xAD;o&#xAD;nomic scope is ap&#xAD;pro&#xAD;pri&#xAD;ate? What method&#xAD;olog&#xAD;i&#xAD;cal as&#xAD;sump&#xAD;tions are be&#xAD;ing made? At what level should sen&#xAD;tience be as&#xAD;sessed? 
What ev&#xAD;i&#xAD;dence bears speci&#xAD;fi&#xAD;cally on af&#xAD;fec&#xAD;tive ceiling rather than merely on re&#xAD;spon&#xAD;sive&#xAD;ness or no&#xAD;ci&#xAD;cep&#xAD;tion? Only af&#xAD;ter those is&#xAD;sues have been made ex&#xAD;plicit does the tool move to&#xAD;ward a pro&#xAD;vi&#xAD;sional ceiling es&#xAD;ti&#xAD;mate. The step&#xAD;wise de&#xAD;sign is meant to make the rea&#xAD;son&#xAD;ing eas&#xAD;ier to in&#xAD;spect, challenge, and re&#xAD;vise.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_1__Input_framing_and_scope&quot;&gt;Step 1: In&#xAD;put fram&#xAD;ing and scope&lt;/h3&gt;&lt;p&gt;The user in&#xAD;puts a tar&#xAD;get taxon. The tool then de&#xAD;ter&#xAD;mines, where nec&#xAD;es&#xAD;sary, two re&#xAD;lated but dis&#xAD;tinct goals: one for the like&#xAD;li&#xAD;hood of sen&#xAD;tience and one for the af&#xAD;fec&#xAD;tive-ca&#xAD;pac&#xAD;ity anal&#xAD;y&#xAD;sis. This dis&#xAD;tinc&#xAD;tion mat&#xAD;ters be&#xAD;cause ev&#xAD;i&#xAD;dence rele&#xAD;vant to sen&#xAD;tience may some&#xAD;times be available at a broader tax&#xAD;o&#xAD;nomic level than ev&#xAD;i&#xAD;dence rele&#xAD;vant to af&#xAD;fec&#xAD;tive ceiling. The cur&#xAD;rent in&#xAD;struc&#xAD;tion set there&#xAD;fore al&#xAD;lows the sen&#xAD;tience scope to be broader when the ev&#xAD;i&#xAD;dence base is broader, while keep&#xAD;ing the af&#xAD;fec&#xAD;tive-ca&#xAD;pac&#xAD;ity scope as close as pos&#xAD;si&#xAD;ble to the user’s tar&#xAD;get un&#xAD;less broader gen&#xAD;er&#xAD;al&#xAD;iza&#xAD;tion is nec&#xAD;es&#xAD;sary. 
In both cases, the tool is ex&#xAD;pected to jus&#xAD;tify the tax&#xAD;o&#xAD;nomic scope in terms of the ev&#xAD;i&#xAD;dence available and po&#xAD;ten&#xAD;tial het&#xAD;ero&#xAD;gene&#xAD;ity among the species in the tar&#xAD;get group.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_2__Methodological_commitments_check&quot;&gt;Step 2: Method&#xAD;olog&#xAD;i&#xAD;cal com&#xAD;mit&#xAD;ments check&lt;/h3&gt;&lt;p&gt;Be&#xAD;fore pro&#xAD;ceed&#xAD;ing, the tool makes its key method&#xAD;olog&#xAD;i&#xAD;cal com&#xAD;mit&#xAD;ments ex&#xAD;plicit and asks whether the user wishes to keep or mod&#xAD;ify them. Th&#xAD;ese in&#xAD;clude: the Welfare Foot&#xAD;print Frame&#xAD;work use of Pain and Plea&#xAD;sure as um&#xAD;brella terms for nega&#xAD;tive and pos&#xAD;i&#xAD;tive af&#xAD;fec&#xAD;tive states; the use of an epistemic sen&#xAD;tience clas&#xAD;sifi&#xAD;ca&#xAD;tion rather than a nu&#xAD;mer&#xAD;i&#xAD;cal sen&#xAD;tience score; a biolog&#xAD;i&#xAD;cal par&#xAD;si&#xAD;mony de&#xAD;fault in the ab&#xAD;sence of pos&#xAD;i&#xAD;tive ev&#xAD;i&#xAD;dence for broader or more ex&#xAD;treme af&#xAD;fec&#xAD;tive ranges; the rule against in&#xAD;fer&#xAD;ring one taxon’s ceiling from an&#xAD;other with&#xAD;out ex&#xAD;plicit jus&#xAD;tifi&#xAD;ca&#xAD;tion; and the de&#xAD;ci&#xAD;sion, for pre&#xAD;sent pur&#xAD;poses, to an&#xAD;a&#xAD;lyze in&#xAD;ten&#xAD;sity ceiling prior to po&#xAD;ten&#xAD;tial in&#xAD;ter&#xAD;spe&#xAD;cific differ&#xAD;ences in the sub&#xAD;jec&#xAD;tive per&#xAD;cep&#xAD;tion of time. 
This step is im&#xAD;por&#xAD;tant be&#xAD;cause it turns as&#xAD;sump&#xAD;tions that are of&#xAD;ten left im&#xAD;plicit into as&#xAD;sump&#xAD;tions that can be crit&#xAD;i&#xAD;cized, re&#xAD;vised, or re&#xAD;jected be&#xAD;fore the anal&#xAD;y&#xAD;sis pro&#xAD;ceeds.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_3__Sentience_plausibility_and_operational_gate&quot;&gt;Step 3: Sen&#xAD;tience plau&#xAD;si&#xAD;bil&#xAD;ity and op&#xAD;er&#xAD;a&#xAD;tional gate&lt;/h3&gt;&lt;p&gt;Us&#xAD;ing the sen&#xAD;tience scope defined in Step 1, the tool con&#xAD;ducts an ini&#xAD;tial ev&#xAD;i&#xAD;dence-based ap&#xAD;praisal of sen&#xAD;tience sta&#xAD;tus and clas&#xAD;sifies it as Plau&#xAD;si&#xAD;ble, Con&#xAD;tested/​Uncer&#xAD;tain, or Not co&#xAD;her&#xAD;ent. This step is in&#xAD;ten&#xAD;tion&#xAD;ally brief and clas&#xAD;sifi&#xAD;ca&#xAD;tory rather than a full liter&#xAD;a&#xAD;ture re&#xAD;view. Im&#xAD;por&#xAD;tantly, the tool treats sen&#xAD;tience pri&#xAD;mar&#xAD;ily at the broader or an&#xAD;ces&#xAD;tral tax&#xAD;o&#xAD;nomic scope se&#xAD;lected in Step 1, rather than re&#xAD;open&#xAD;ing it by de&#xAD;fault at the level of the nar&#xAD;rower tar&#xAD;get taxon. 
When the broader sen&#xAD;tience scope is sup&#xAD;ported by con&#xAD;ver&#xAD;gent ev&#xAD;i&#xAD;dence, nested taxa are or&#xAD;di&#xAD;nar&#xAD;ily treated as in&#xAD;her&#xAD;it&#xAD;ing that sta&#xAD;tus un&#xAD;less there is strong sub&#xAD;group-spe&#xAD;cific coun&#xAD;terev&#xAD;i&#xAD;dence or a biolog&#xAD;i&#xAD;cally se&#xAD;ri&#xAD;ous rea&#xAD;son to sus&#xAD;pect loss or rad&#xAD;i&#xAD;cal di&#xAD;ver&#xAD;gence.&lt;/p&gt;&lt;p&gt;The cur&#xAD;rent in&#xAD;struc&#xAD;tions di&#xAD;rect the GPT to con&#xAD;sult Fein&#xAD;berg’s From Sens&#xAD;ing to Sen&#xAD;tience as one rele&#xAD;vant source at this gate-set&#xAD;ting stage, while cross-check&#xAD;ing against other the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal frame&#xAD;works and em&#xAD;piri&#xAD;cal ev&#xAD;i&#xAD;dence.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_4__Review_of_Evidence_and_indicators_&quot;&gt;Step 4: Re&#xAD;view of Ev&#xAD;i&#xAD;dence and indicators&lt;/h3&gt;&lt;p&gt;Us&#xAD;ing the af&#xAD;fec&#xAD;tive-ca&#xAD;pac&#xAD;ity scope defined in Step 1 as the pri&#xAD;mary unit of anal&#xAD;y&#xAD;sis, the tool as&#xAD;sem&#xAD;bles the ev&#xAD;i&#xAD;dence speci&#xAD;fi&#xAD;cally to in&#xAD;form the af&#xAD;fec&#xAD;tive ceiling ques&#xAD;tion: what level of in&#xAD;ten&#xAD;sity this sys&#xAD;tem could plau&#xAD;si&#xAD;bly reach. 
It maps ev&#xAD;i&#xAD;dence across be&#xAD;havi&#xAD;oural, neu&#xAD;ral ar&#xAD;chi&#xAD;tec&#xAD;tural, neu&#xAD;ro&#xAD;phys&#xAD;iolog&#xAD;i&#xAD;cal, phar&#xAD;ma&#xAD;colog&#xAD;i&#xAD;cal, cog&#xAD;ni&#xAD;tive/​rep&#xAD;re&#xAD;sen&#xAD;ta&#xAD;tional, and evolu&#xAD;tion&#xAD;ary do&#xAD;mains, while al&#xAD;low&#xAD;ing broader ev&#xAD;i&#xAD;dence only when its rele&#xAD;vance to the nar&#xAD;rower taxon is ex&#xAD;plic&#xAD;itly jus&#xAD;tified.&lt;/p&gt;&lt;p&gt;For each rele&#xAD;vant in&#xAD;di&#xAD;ca&#xAD;tor, the tool must state what con&#xAD;struct it bears on, whether it in&#xAD;forms sen&#xAD;si&#xAD;tivity, ca&#xAD;pac&#xAD;ity, both, or nei&#xAD;ther, what its main strengths and limi&#xAD;ta&#xAD;tions are, and where it may mis&#xAD;lead through false pos&#xAD;i&#xAD;tives or false nega&#xAD;tives. The point is not merely to list ev&#xAD;i&#xAD;dence, but to clar&#xAD;ify what in&#xAD;fer&#xAD;en&#xAD;tial work each in&#xAD;di&#xAD;ca&#xAD;tor can and can&#xAD;not do in con&#xAD;strain&#xAD;ing the plau&#xAD;si&#xAD;ble af&#xAD;fec&#xAD;tive ceiling. The in&#xAD;struc&#xAD;tions also make ex&#xAD;plicit that no&#xAD;ci&#xAD;cep&#xAD;tion, defen&#xAD;sive be&#xAD;havi&#xAD;our, or sim&#xAD;ple stim&#xAD;u&#xAD;lus re&#xAD;spon&#xAD;sive&#xAD;ness do not by them&#xAD;selves es&#xAD;tab&#xAD;lish ad&#xAD;vanced af&#xAD;fec&#xAD;tive ca&#xAD;pac&#xAD;ity.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_5__Affective_capacity_scope&quot;&gt;Step 5: Affec&#xAD;tive ca&#xAD;pac&#xAD;ity scope&lt;/h3&gt;&lt;p&gt;Only af&#xAD;ter the ev&#xAD;i&#xAD;dence has been or&#xAD;ga&#xAD;nized does the tool ask what sort of af&#xAD;fec&#xAD;tive ar&#xAD;chi&#xAD;tec&#xAD;ture the tar&#xAD;get plau&#xAD;si&#xAD;bly has. 
Here the em&#xAD;pha&#xAD;sis is on whether the biolog&#xAD;i&#xAD;cal hard&#xAD;ware plau&#xAD;si&#xAD;bly sup&#xAD;ports co&#xAD;or&#xAD;di&#xAD;nated, high-in&#xAD;ten&#xAD;sity af&#xAD;fec&#xAD;tive in&#xAD;te&#xAD;gra&#xAD;tion, whether through cen&#xAD;tral&#xAD;ized mechanisms or func&#xAD;tion&#xAD;ally in&#xAD;te&#xAD;grated dis&#xAD;tributed ones. This step is meant to as&#xAD;sess ar&#xAD;chi&#xAD;tec&#xAD;tural sup&#xAD;port and its limits, while dis&#xAD;t&#xAD;in&#xAD;guish&#xAD;ing di&#xAD;rect sup&#xAD;port from spec&#xAD;u&#xAD;la&#xAD;tive ex&#xAD;trap&#xAD;o&#xAD;la&#xAD;tion; it is not yet the stage at which a fi&#xAD;nal or near-fi&#xAD;nal hu&#xAD;man-an&#xAD;chored ceiling is as&#xAD;signed.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_6__Ceiling_inferences_and_three_mandatory_checks&quot;&gt;Step 6: Ceiling in&#xAD;fer&#xAD;ences and three manda&#xAD;tory checks&lt;/h3&gt;&lt;p&gt;At this stage the tool pro&#xAD;poses a pro&#xAD;vi&#xAD;sional hu&#xAD;man-an&#xAD;chored ceiling, ei&#xAD;ther as a sin&#xAD;gle cat&#xAD;e&#xAD;gory or a range, and then stress-tests it in three ways. First, through a pro&#xAD;cess we re&#xAD;fer to as “Cost of In&#xAD;ten&#xAD;sity Check” by ask&#xAD;ing: would states analo&#xAD;gous to Dis&#xAD;abling(h) or Ex&#xAD;cru&#xAD;ci&#xAD;at&#xAD;ing(h) be evolu&#xAD;tion&#xAD;ar&#xAD;ily and biolog&#xAD;i&#xAD;cally jus&#xAD;tified by the species’ life his&#xAD;tory and metabolic bud&#xAD;get?&lt;/p&gt;&lt;p&gt;Se&#xAD;cond, it ap&#xAD;plies an “Alter&#xAD;na&#xAD;tive Hy&#xAD;poth&#xAD;e&#xAD;sis Check”. The frame&#xAD;work de&#xAD;faults to nar&#xAD;row ceiling es&#xAD;ti&#xAD;mates in the ab&#xAD;sence of pos&#xAD;i&#xAD;tive ev&#xAD;i&#xAD;dence for broad af&#xAD;fec&#xAD;tive range, con&#xAD;sis&#xAD;tent with biolog&#xAD;i&#xAD;cal par&#xAD;si&#xAD;mony. 
At the same time, the tool checks a spe&#xAD;cific counter-ar&#xAD;gu&#xAD;ment: could a lack of reg&#xAD;u&#xAD;la&#xAD;tory con&#xAD;trol over af&#xAD;fec&#xAD;tive states re&#xAD;sult in in&#xAD;tense but poorly mod&#xAD;u&#xAD;lated nega&#xAD;tive ex&#xAD;pe&#xAD;rience? This pos&#xAD;si&#xAD;bil&#xAD;ity is treated as spec&#xAD;u&#xAD;la&#xAD;tive un&#xAD;less the ev&#xAD;i&#xAD;dence speci&#xAD;fi&#xAD;cally sup&#xAD;ports it, but it is not dis&#xAD;missed: the ab&#xAD;sence of ev&#xAD;i&#xAD;dence for a high ceiling is not treated as ev&#xAD;i&#xAD;dence of its absence.&lt;/p&gt;&lt;p&gt;Third, the tool uses a fi&#xAD;nal “Con&#xAD;ver&#xAD;gence Check”: if the ev&#xAD;i&#xAD;dence points in con&#xAD;flict&#xAD;ing di&#xAD;rec&#xAD;tions, the out&#xAD;put should widen the un&#xAD;cer&#xAD;tainty bounds rather than force an ar&#xAD;tifi&#xAD;cially crisp an&#xAD;swer.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_7__Red_team___steel_man&quot;&gt;Step 7: Red-team /​ steel-man&lt;/h3&gt;&lt;p&gt;The tool must then con&#xAD;struct the strongest plau&#xAD;si&#xAD;ble case against its own con&#xAD;clu&#xAD;sion. Depend&#xAD;ing on the case, this may in&#xAD;volve ar&#xAD;gu&#xAD;ing that the cur&#xAD;rent es&#xAD;ti&#xAD;mate is too low, too high, or oth&#xAD;er&#xAD;wise too con&#xAD;fi&#xAD;dently bounded. 
The aim is to ex&#xAD;pose where the con&#xAD;clu&#xAD;sion is most vuln&#xAD;er&#xAD;a&#xAD;ble to rein&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion and to en&#xAD;sure that the fi&#xAD;nal out&#xAD;put re&#xAD;flects en&#xAD;gage&#xAD;ment with the strongest se&#xAD;ri&#xAD;ous challenge, rather than pre&#xAD;ma&#xAD;ture con&#xAD;ver&#xAD;gence on a com&#xAD;fortable an&#xAD;swer.&lt;/p&gt;&lt;h3 data-internal-id=&quot;Step_8__Final_dossier__subjective_time__and_research_priorities&quot;&gt;Step 8: Fi&#xAD;nal dossier, sub&#xAD;jec&#xAD;tive time, and re&#xAD;search priorities&lt;/h3&gt;&lt;p&gt;The fi&#xAD;nal out&#xAD;put is not a moral weight or a rank&#xAD;ing of species, but a dossier con&#xAD;tain&#xAD;ing the sen&#xAD;tience-plau&#xAD;si&#xAD;bil&#xAD;ity judg&#xAD;ment, the af&#xAD;fec&#xAD;tive-ceiling es&#xAD;ti&#xAD;mate, a brief as&#xAD;sess&#xAD;ment of whether sub&#xAD;jec&#xAD;tive time could be&#xAD;come rele&#xAD;vant in later anal&#xAD;y&#xAD;sis, a re&#xAD;search-pri&#xAD;or&#xAD;ity judg&#xAD;ment iden&#xAD;ti&#xAD;fy&#xAD;ing the sin&#xAD;gle ex&#xAD;per&#xAD;i&#xAD;ment most likely to change the con&#xAD;clu&#xAD;sion, and a dis&#xAD;agree&#xAD;ment log sum&#xAD;ma&#xAD;riz&#xAD;ing the user’s ob&#xAD;jec&#xAD;tions.&lt;/p&gt;&lt;p&gt;At the end of each step, the users are asked whether they agree, dis&#xAD;agree, or want to re&#xAD;vise any part of the rea&#xAD;son&#xAD;ing be&#xAD;fore the next step. Th&#xAD;ese in&#xAD;ter&#xAD;ac&#xAD;tions are de&#xAD;signed to trans&#xAD;form dis&#xAD;agree&#xAD;ment into more ex&#xAD;plicit and in&#xAD;spectable ar&#xAD;gu&#xAD;ments, re&#xAD;duc&#xAD;ing the im&#xAD;pres&#xAD;sion of false cer&#xAD;tainty.&lt;/p&gt;&lt;h2 data-internal-id=&quot;Selected_Readings_and_References&quot;&gt;Selected Read&#xAD;ings and References&lt;/h2&gt;&lt;p&gt;Alonso, W. J., &amp;amp; Schuck-Paim, C. (2023). 
&lt;a href=&quot;https://welfarefootprint.org/2023/05/31/welfare-metrics-vs-welfare-indicators/&quot;&gt;Welfare met&#xAD;rics and welfare in&#xAD;di&#xAD;ca&#xAD;tors: Clar&#xAD;ify&#xAD;ing es&#xAD;sen&#xAD;tial con&#xAD;cepts in an&#xAD;i&#xAD;mal welfare as&#xAD;sess&#xAD;ment&lt;/a&gt;. &lt;a href=&quot;https://doi.org/10.17605/OSF.IO/AQ2BM&quot; class=&quot;bare-url&quot;&gt;https://​​doi.org/​​10.17605/​​OSF.IO/​​AQ2BM&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Alonso, W. J., &amp;amp; Schuck-Paim, C. (2025). &lt;a href=&quot;https://osf.io/94bxs/overview&quot;&gt;Welfare Foot&#xAD;print Frame&#xAD;work: Method&#xAD;olog&#xAD;i&#xAD;cal foun&#xAD;da&#xAD;tions and quan&#xAD;ti&#xAD;ta&#xAD;tive as&#xAD;sess&#xAD;ment guidelines&lt;/a&gt;. &lt;a href=&quot;https://doi.org/10.17605/osf.io/94bxs&quot; class=&quot;bare-url&quot;&gt;https://​​doi.org/​​10.17605/​​osf.io/​​94bxs&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Alonso, W. J., &amp;amp; Schuck-Paim, C. (2025a). &lt;a href=&quot;https://forum.effectivealtruism.org/posts/novnNcFiWaaAvTKEi/do-primitive-sentient-organisms-feel-extreme-pain&quot;&gt;Do prim&#xAD;i&#xAD;tive sen&#xAD;tient or&#xAD;ganisms feel ex&#xAD;treme pain? Disen&#xAD;tan&#xAD;gling in&#xAD;ten&#xAD;sity range and re&#xAD;s&#xAD;olu&#xAD;tion&lt;/a&gt;. EA Fo&#xAD;rum.&lt;/p&gt;&lt;p&gt;Alonso, W. J., &amp;amp; Schuck-Paim, C. (2025b). &lt;a href=&quot;/posts/Ey9JyujS56wHvQZKP/when-feeling-is-worth-it-a-cost-benefit-framework-for-the&quot;&gt;When Feel&#xAD;ing Is Worth It: A Cost–Benefit Frame&#xAD;work for the Evolu&#xAD;tion of Sen&#xAD;tience&lt;/a&gt;. EA Fo&#xAD;rum.&lt;/p&gt;&lt;p&gt;Birch, J. (2024). The edge of sen&#xAD;tience: Risk and pre&#xAD;cau&#xAD;tion in hu&#xAD;mans, other an&#xAD;i&#xAD;mals, and AI. 
Oxford Univer&#xAD;sity Press.&lt;/p&gt;&lt;p&gt;Brown&#xAD;ing, H., &amp;amp; Birch, J. (2022). An&#xAD;i&#xAD;mal sen&#xAD;tience. Philos&#xAD;o&#xAD;phy Com&#xAD;pass, 17(5), e12822.&lt;/p&gt;&lt;p&gt;Church&#xAD;land, P. S. (2002). Brain-wise: Stud&#xAD;ies in neu&#xAD;rophilos&#xAD;o&#xAD;phy. MIT Press.&lt;/p&gt;&lt;p&gt;Crump, A., Brown&#xAD;ing, H., Sch&#xAD;nell, A. K., Burn, C., &amp;amp; Birch, J. (2022). Sen&#xAD;tience in de&#xAD;ca&#xAD;pod crus&#xAD;taceans: A gen&#xAD;eral frame&#xAD;work and re&#xAD;view of the ev&#xAD;i&#xAD;dence. An&#xAD;i&#xAD;mal Sen&#xAD;tience, 7(32), 1.&lt;/p&gt;&lt;p&gt;Da&#xAD;ma&#xAD;sio, A. (1999). The feel&#xAD;ing of what hap&#xAD;pens: Body and emo&#xAD;tion in the mak&#xAD;ing of con&#xAD;scious&#xAD;ness. Har&#xAD;court.&lt;/p&gt;&lt;p&gt;Da&#xAD;ma&#xAD;sio, A. (2010). Self comes to mind: Con&#xAD;struct&#xAD;ing the con&#xAD;scious brain. Pan&#xAD;theon.&lt;/p&gt;&lt;p&gt;El&#xAD;wood, R. W. (2011). Pain and suffer&#xAD;ing in in&#xAD;ver&#xAD;te&#xAD;brates? ILAR Jour&#xAD;nal, 52(2), 175–184.&lt;/p&gt;&lt;p&gt;Fein&#xAD;berg, T. E. (2024). &lt;a href=&quot;https://direct.mit.edu/books/oa-monograph/5839/From-Sensing-to-SentienceHow-Feeling-Emerges-from&quot;&gt;From sens&#xAD;ing to sen&#xAD;tience: How feel&#xAD;ing emerges from the brain&lt;/a&gt;. MIT Press.&lt;/p&gt;&lt;p&gt;Fein&#xAD;berg, T. E., &amp;amp; Mal&#xAD;latt, J. (2016). The an&#xAD;cient ori&#xAD;gins of con&#xAD;scious&#xAD;ness: How the brain cre&#xAD;ated ex&#xAD;pe&#xAD;rience. MIT Press.&lt;/p&gt;&lt;p&gt;Gins&#xAD;burg, S., &amp;amp; Jablonka, E. (2019). The evolu&#xAD;tion of the sen&#xAD;si&#xAD;tive soul: Learn&#xAD;ing and the ori&#xAD;gins of con&#xAD;scious&#xAD;ness. MIT Press.&lt;/p&gt;&lt;p&gt;Gins&#xAD;burg, S., &amp;amp; Jablonka, E. (2022). Pain sen&#xAD;tience crite&#xAD;ria and their grad&#xAD;ing. An&#xAD;i&#xAD;mal Sen&#xAD;tience, 7(32), 13.&lt;/p&gt;&lt;p&gt;God&#xAD;frey-Smith, P. (2016). 
Other minds: The oc&#xAD;to&#xAD;pus, the sea, and the deep ori&#xAD;gins of con&#xAD;scious&#xAD;ness. Far&#xAD;rar, Straus and Giroux.&lt;/p&gt;&lt;p&gt;God&#xAD;frey-Smith, P. (2020). Me&#xAD;ta&#xAD;zoa: An&#xAD;i&#xAD;mal life and the birth of the mind. Far&#xAD;rar, Straus and Giroux.&lt;/p&gt;&lt;p&gt;Panksepp, J. (2005). Affec&#xAD;tive con&#xAD;scious&#xAD;ness: Core emo&#xAD;tional feel&#xAD;ings in an&#xAD;i&#xAD;mals and hu&#xAD;mans. Con&#xAD;scious&#xAD;ness and Cog&#xAD;ni&#xAD;tion, 14(1), 30–80.&lt;/p&gt;&lt;p&gt;Schukraft, J. (2020, Oc&#xAD;to&#xAD;ber 29). &lt;a href=&quot;https://rethinkpriorities.org/research-area/differences-in-the-intensity-of-valenced-experience-across-species/&quot;&gt;Differ&#xAD;ences in the In&#xAD;ten&#xAD;sity of Valenced Ex&#xAD;pe&#xAD;rience across Species&lt;/a&gt;. Re&#xAD;think Pri&#xAD;ori&#xAD;ties.&lt;/p&gt;&lt;p&gt;Sned&#xAD;don, L. U., El&#xAD;wood, R. W., Adamo, S. A., &amp;amp; Leach, M. C. (2014). Defin&#xAD;ing and as&#xAD;sess&#xAD;ing an&#xAD;i&#xAD;mal pain. An&#xAD;i&#xAD;mal Be&#xAD;havi&#xAD;our, 97, 201–212.&lt;/p&gt;&lt;p&gt;Solms, M. (2021). The hid&#xAD;den spring: A jour&#xAD;ney to the source of con&#xAD;scious&#xAD;ness. W. W. Nor&#xAD;ton &amp;amp; Com&#xAD;pany.&lt;/p&gt;</description>
            <author>Wladimir J. Alonso</author>
            <guid>KT5QyL2wsBNvw2s3t</guid>
            <pubDate>Fri, 10 Apr 2026 13:52:06 +0000</pubDate>
        </item>
        <item>
            <title>Fortify Health: Growth and Development Volunteer to Support CEO During Rapid Scale by Tony Senanayake</title>
            <link>https://forum.nunosempere.com/posts/iCYSjiQPLuQkPihxc/fortify-health-growth-and-development-volunteer-to-support</link>
            <description>&lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1bwvG2rY9K86y8fBP6vP-eCNdAsYl2wtO/edit#heading=h.1o8k2ke8wpgm&quot;&gt;For&#xAD;tify Health is hiring a part-time vol&#xAD;un&#xAD;teer&lt;/a&gt; (10–20h/​week, re&#xAD;mote) to work di&#xAD;rectly with me (the CEO) and our Chief of Staff on our growth and fundrais&#xAD;ing strat&#xAD;egy dur&#xAD;ing a key phase of scale.&lt;/p&gt;&lt;p&gt;&lt;br&gt;We fo&#xAD;cus on re&#xAD;duc&#xAD;ing iron defi&#xAD;ciency anaemia through large-scale wheat flour for&#xAD;tifi&#xAD;ca&#xAD;tion in In&#xAD;dia, an in&#xAD;ter&#xAD;ven&#xAD;tion with a strong ev&#xAD;i&#xAD;dence base and high cost-effec&#xAD;tive&#xAD;ness. Fol&#xAD;low&#xAD;ing re&#xAD;cent fund&#xAD;ing, we plan to ex&#xAD;pand to reach ~60 mil&#xAD;lion peo&#xAD;ple over the next three years. We are a team of al&#xAD;most 100 and plan to scale rapidly.&lt;/p&gt;&lt;p&gt;&lt;br&gt;This role offers un&#xAD;usu&#xAD;ally di&#xAD;rect ex&#xAD;po&#xAD;sure to high-stakes de&#xAD;ci&#xAD;sions, par&#xAD;tic&#xAD;u&#xAD;larly over the next few months. 
This is a fast-paced, high-ex&#xAD;po&#xAD;sure role suited to some&#xAD;one ex&#xAD;cited to ap&#xAD;ply ev&#xAD;i&#xAD;dence-driven think&#xAD;ing to real-world fundrais&#xAD;ing and growth.&lt;/p&gt;&lt;p&gt;&lt;br&gt;Here are some ex&#xAD;am&#xAD;ples of what the role will in&#xAD;volve:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Direct in&#xAD;volve&#xAD;ment in con&#xAD;ver&#xAD;sa&#xAD;tions with effec&#xAD;tive giv&#xAD;ing organisations&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Draft&#xAD;ing com&#xAD;mu&#xAD;ni&#xAD;ca&#xAD;tion ma&#xAD;te&#xAD;ri&#xAD;als on cost-effec&#xAD;tive&#xAD;ness to di&#xAD;verse fun&#xAD;der audiences&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Sup&#xAD;port&#xAD;ing real-time fundrais&#xAD;ing and part&#xAD;ner&#xAD;ship efforts with CEO&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;br&gt;For some&#xAD;one con&#xAD;sid&#xAD;er&#xAD;ing a path in EA-al&#xAD;igned op&#xAD;er&#xAD;a&#xAD;tions, com&#xAD;mu&#xAD;ni&#xAD;ca&#xAD;tions, or fundrais&#xAD;ing, this is a chance to build highly trans&#xAD;fer&#xAD;able skills while con&#xAD;tribut&#xAD;ing to a pro&#xAD;gramme already op&#xAD;er&#xAD;at&#xAD;ing at sig&#xAD;nifi&#xAD;cant scale.&lt;br&gt;&lt;br&gt;We think this role may be par&#xAD;tic&#xAD;u&#xAD;larly valuable for early-ca&#xAD;reer can&#xAD;di&#xAD;dates aiming to test fit for high-im&#xAD;pact non&#xAD;profit work, or to strengthen their pro&#xAD;file for fu&#xAD;ture EA roles.&lt;br&gt;&lt;br&gt;We would like the suc&#xAD;cess&#xAD;ful can&#xAD;di&#xAD;date to start im&#xAD;me&#xAD;di&#xAD;ately, and work un&#xAD;til June 2026 (&lt;strong&gt;with a real po&#xAD;ten&#xAD;tial to con&#xAD;vert to a paid role there&#xAD;after&lt;/strong&gt;).&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://docs.google.com/document/d/1bwvG2rY9K86y8fBP6vP-eCNdAsYl2wtO/edit#heading=h.1o8k2ke8wpgm&quot;&gt;Full de&#xAD;tails and ap&#xAD;pli&#xAD;ca&#xAD;tion shared in the at&#xAD;tached&lt;/a&gt;.&lt;/p&gt;</description>
            <author>Tony Senanayake</author>
            <guid>iCYSjiQPLuQkPihxc</guid>
            <pubDate>Fri, 10 Apr 2026 12:19:55 +0000</pubDate>
        </item>
        <item>
            <title>A community health extension worker in a primary health center in Nigeria sees around 40 to 60 patients a day by ANTHONIO OLADIMEJI</title>
            <link>https://forum.nunosempere.com/posts/znsgqwSKGxXrTvci9/a-community-health-extension-worker-in-a-primary-health</link>
            <description>&lt;p&gt;One pa&#xAD;tient comes in with fever.&lt;/p&gt;&lt;p&gt;No lab. No doc&#xAD;tor. No sec&#xAD;ond opinion.&lt;/p&gt;&lt;p&gt;Just symp&#xAD;toms.&lt;/p&gt;&lt;p&gt;It could be malaria.&lt;/p&gt;&lt;p&gt;It could be ty&#xAD;phoid.&lt;/p&gt;&lt;p&gt;It could be menin&#xAD;gitis.&lt;/p&gt;&lt;p&gt;It could be early sep&#xAD;sis.&lt;/p&gt;&lt;p&gt;Th&#xAD;ese can look al&#xAD;most iden&#xAD;ti&#xAD;cal at first pre&#xAD;sen&#xAD;ta&#xAD;tion. But the out&#xAD;comes are very differ&#xAD;ent.&lt;br&gt; &lt;/p&gt;&lt;p&gt;Men&#xAD;in&#xAD;gitis can kill within about 24 hours.&lt;/p&gt;&lt;p&gt;Sep&#xAD;sis can de&#xAD;te&#xAD;ri&#xAD;o&#xAD;rate within hours.&lt;/p&gt;&lt;p&gt;&lt;br&gt;A wrong de&#xAD;ci&#xAD;sion is not ab&#xAD;stract. It is time lost. In prac&#xAD;tice, many of these cases get treated as malaria first. Some&#xAD;times cor&#xAD;rectly. Some&#xAD;times not.&lt;/p&gt;&lt;p&gt;I have been try&#xAD;ing to un&#xAD;der&#xAD;stand whether AI-based clini&#xAD;cal de&#xAD;ci&#xAD;sion sup&#xAD;port could mean&#xAD;ingfully im&#xAD;prove this kind of de&#xAD;ci&#xAD;sion-mak&#xAD;ing in low-re&#xAD;source set&#xAD;tings. 
Not in a the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal sense, but in the ex&#xAD;act en&#xAD;vi&#xAD;ron&#xAD;ment de&#xAD;scribed above.&lt;/p&gt;&lt;p&gt;Con&#xAD;straints mat&#xAD;ter a lot here:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;No in&#xAD;ter&#xAD;net in many facilities&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Limited drug availa&#xAD;bil&#xAD;ity based on na&#xAD;tional es&#xAD;sen&#xAD;tial medicines lists&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Health work&#xAD;ers with vary&#xAD;ing lev&#xAD;els of training&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;High pa&#xAD;tient load and very lit&#xAD;tle time per case&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;So any in&#xAD;ter&#xAD;ven&#xAD;tion has to work within those con&#xAD;straints or it does not work at all.&lt;/p&gt;&lt;p&gt;What I find difficult is not the tech&#xAD;ni&#xAD;cal side. It is the im&#xAD;pact ques&#xAD;tion. Even if a sys&#xAD;tem can sug&#xAD;gest bet&#xAD;ter differ&#xAD;en&#xAD;tials or high&#xAD;light dan&#xAD;ger signs, sev&#xAD;eral things could still go wrong:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;The health worker may not trust the recommendation&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The re&#xAD;quired drug or refer&#xAD;ral path&#xAD;way may not be available&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Over-re&#xAD;li&#xAD;ance could in&#xAD;tro&#xAD;duce new failure modes&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Im&#xAD;prove&#xAD;ments in de&#xAD;ci&#xAD;sion qual&#xAD;ity may not trans&#xAD;late into mea&#xAD;surable out&#xAD;come changes&lt;br&gt; &lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;There is also a broader ques&#xAD;tion of com&#xAD;par&#xAD;i&#xAD;son.&lt;/p&gt;&lt;p&gt;How does this ap&#xAD;proach com&#xAD;pare to bet&#xAD;ter train&#xAD;ing pro&#xAD;grams, pa&#xAD;per-based triage tools, in&#xAD;creased su&#xAD;per&#xAD;vi&#xAD;sion, or sim&#xAD;ple pro&#xAD;to&#xAD;col re&#xAD;in&#xAD;force&#xAD;ment?&lt;/p&gt;&lt;p&gt;It is not ob&#xAD;vi&#xAD;ous that AI is the high&#xAD;est-lev&#xAD;er&#xAD;age 
in&#xAD;ter&#xAD;ven&#xAD;tion here. I am es&#xAD;pe&#xAD;cially in&#xAD;ter&#xAD;ested in how to eval&#xAD;u&#xAD;ate some&#xAD;thing like this prop&#xAD;erly.&lt;/p&gt;&lt;p&gt;If the goal is to re&#xAD;duce missed high-risk cases among fe&#xAD;brile pa&#xAD;tients, what is the most cred&#xAD;ible way to mea&#xAD;sure that in prac&#xAD;tice?&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Ran&#xAD;dom&#xAD;ized rol&#xAD;lout across fa&#xAD;cil&#xAD;ities?&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Be&#xAD;fore-and-af&#xAD;ter com&#xAD;par&#xAD;i&#xAD;sons?&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Proxy met&#xAD;rics like refer&#xAD;ral ac&#xAD;cu&#xAD;racy or time to es&#xAD;ca&#xAD;la&#xAD;tion?&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Each ap&#xAD;proach seems to have trade&#xAD;offs. More gen&#xAD;er&#xAD;ally, I am try&#xAD;ing to an&#xAD;swer a sim&#xAD;ple ques&#xAD;tion:&lt;/p&gt;&lt;p&gt;In this set&#xAD;ting, does AI de&#xAD;ci&#xAD;sion sup&#xAD;port mean&#xAD;ingfully re&#xAD;duce harm, or does it just change how de&#xAD;ci&#xAD;sions are made with&#xAD;out im&#xAD;prov&#xAD;ing out&#xAD;comes?&lt;/p&gt;&lt;p&gt;I would value per&#xAD;spec&#xAD;tives from peo&#xAD;ple who have worked on similar prob&#xAD;lems, es&#xAD;pe&#xAD;cially around eval&#xAD;u&#xAD;a&#xAD;tion de&#xAD;sign and failure modes in real-world health&#xAD;care sys&#xAD;tems.&lt;/p&gt;</description>
            <author>ANTHONIO OLADIMEJI</author>
            <guid>znsgqwSKGxXrTvci9</guid>
            <pubDate>Fri, 10 Apr 2026 07:48:13 +0000</pubDate>
        </item>
        <item>
            <title>Resources for group organizers on supporting a good culture by Julia_Wise🔸</title>
            <link>https://forum.nunosempere.com/posts/HysnHrZYGjfoS2HMe/resources-for-group-organizers-on-supporting-a-good-culture</link>
            <description>&lt;p&gt;&lt;i&gt;From the &lt;/i&gt;&lt;a href=&quot;https://www.centreforeffectivealtruism.org/community-health&quot;&gt;&lt;i&gt;com&#xAD;mu&#xAD;nity health team&lt;/i&gt;&lt;/a&gt;&lt;i&gt; at the Cen&#xAD;tre for Effec&#xAD;tive Altru&#xAD;ism.&lt;/i&gt;&lt;br&gt;&lt;br&gt;We know a lot of group or&#xAD;ga&#xAD;niz&#xAD;ers care about build&#xAD;ing spaces where peo&#xAD;ple feel re&#xAD;spected and wel&#xAD;come, and where prob&#xAD;lems are dealt with well when they come up. We hear from or&#xAD;ga&#xAD;niz&#xAD;ers about this reg&#xAD;u&#xAD;larly, and we want to make sure you know what’s available.&lt;/p&gt;&lt;p&gt;Some of the re&#xAD;sources available: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Why and how you might have a &lt;a href=&quot;https://resources.eagroups.org/community-contacts&quot;&gt;com&#xAD;mu&#xAD;nity con&#xAD;tact person&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://resources.eagroups.org/diversity-inclusion&quot;&gt;Diver&#xAD;sity and inclusion&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://resources.eagroups.org/codes-of-conduct&quot;&gt;Codes of conduct&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;/posts/CtACh7xRBFnpK3NW4/guide-to-safe-and-inclusive-events-by-gwwc-and-oftw&quot;&gt;Guide to safe and in&#xAD;clu&#xAD;sive events&lt;/a&gt; from Giv&#xAD;ing What We Can and One for the World&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Lon&#xAD;don’s &lt;a href=&quot;/posts/znuJ2Z48YnEjrGLvA/why-do-ea-events-attract-more-men-than-women-focus-group&quot;&gt;fo&#xAD;cus group on gen&#xAD;der bal&#xAD;ance&lt;/a&gt; in groups&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Ideas from a work&#xAD;shop on &lt;a href=&quot;/posts/eaLwfhXbw2kNxA4es/bridging-ea-s-gender-gap-input-from-60-people&quot;&gt;gen&#xAD;der balance&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;More ideas are at the &lt;a href=&quot;https://resources.eagroups.org/&quot;&gt;EA Groups 
Re&#xAD;source Cen&#xAD;tre&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;If you’d like some&#xAD;thing more spe&#xAD;cific, or some&#xAD;thing that isn’t cov&#xAD;ered here, &lt;a href=&quot;https://www.centreforeffectivealtruism.org/community-health/contact-us&quot;&gt;let us know&lt;/a&gt; — we may have ex&#xAD;ist&#xAD;ing non-pub&#xAD;lic re&#xAD;sources rele&#xAD;vant to your situ&#xAD;a&#xAD;tion, and we’re always happy to talk through the speci&#xAD;fics. A lot of cul&#xAD;ture ques&#xAD;tions are con&#xAD;text-de&#xAD;pen&#xAD;dent and eas&#xAD;ier to work through in con&#xAD;ver&#xAD;sa&#xAD;tion.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Talk to us directly&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;If you or some&#xAD;one in your group has en&#xAD;coun&#xAD;tered a prob&#xAD;lem, we’re here to help. Some ex&#xAD;am&#xAD;ples of what that’s looked like:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;When some&#xAD;one has ex&#xAD;pe&#xAD;rienced harm, we’ve helped them think through their op&#xAD;tions and find re&#xAD;sources like le&#xAD;gal sup&#xAD;port.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;When pro&#xAD;gram staff have learned about a prob&#xAD;lem, we’ve helped them work out next steps — which might in&#xAD;clude sup&#xAD;port&#xAD;ing af&#xAD;fected peo&#xAD;ple, re&#xAD;mov&#xAD;ing a par&#xAD;ti&#xAD;ci&#xAD;pant, and han&#xAD;dling com&#xAD;mu&#xAD;ni&#xAD;ca&#xAD;tions.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;We’ve ad&#xAD;vised many group or&#xAD;ga&#xAD;niz&#xAD;ers, office man&#xAD;agers, and oth&#xAD;ers on how to build a good cul&#xAD;ture in their spaces.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Please have a low bar for reach&#xAD;ing out to us! It can be eas&#xAD;ier to solve a prob&#xAD;lem be&#xAD;fore it es&#xAD;ca&#xAD;lates. We aren’t here to tell you what to do; you know your group con&#xAD;text. 
&lt;/p&gt;&lt;p&gt;You can ask us ques&#xAD;tions about how we’d han&#xAD;dle some&#xAD;thing be&#xAD;fore shar&#xAD;ing any speci&#xAD;fics, and you can &lt;a href=&quot;https://www.centreforeffectivealtruism.org/community-health/contact-us&quot;&gt;con&#xAD;tact us&lt;/a&gt; &lt;u&gt;anony&#xAD;mously&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Our &lt;a href=&quot;https://www.centreforeffectivealtruism.org/community-health/confidentiality-policy&quot;&gt;con&#xAD;fi&#xAD;den&#xAD;tial&#xAD;ity policy&lt;/a&gt; is here. One thing to note: if you raise a con&#xAD;cern about some&#xAD;one who works at CEA, we’ll need to in&#xAD;form our le&#xAD;gal team and may need to hand the situ&#xAD;a&#xAD;tion over to HR.&lt;/p&gt;&lt;p&gt;We won’t be the right fit for ev&#xAD;ery&#xAD;one. But if you’ve en&#xAD;coun&#xAD;tered a prob&#xAD;lem and aren’t sure about seek&#xAD;ing sup&#xAD;port, we’d en&#xAD;courage you to talk to some&#xAD;one, whether that’s us, a friend, or some&#xAD;one else you trust.&lt;/p&gt;</description>
            <author>Julia_Wise🔸</author>
            <guid>HysnHrZYGjfoS2HMe</guid>
            <pubDate>Fri, 10 Apr 2026 03:03:39 +0000</pubDate>
        </item>
        <item>
            <title>Ideas for EA Cause Areas by James Brobin</title>
            <link>https://forum.nunosempere.com/posts/KqZqsanrB4oB5GC3q/ideas-for-ea-cause-areas</link>
            <description>&lt;p&gt;This is a cross&#xAD;post from my &lt;a href=&quot;https://substack.com/home/post/p-193742834&quot;&gt;blog ar&#xAD;ti&#xAD;cle&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;I think that be&#xAD;cause effec&#xAD;tive al&#xAD;tru&#xAD;ists tend to dis&#xAD;cuss a nar&#xAD;row range of cause ar&#xAD;eas, it can be easy for us to be&#xAD;lieve that we’ve already iden&#xAD;ti&#xAD;fied all of the most im&#xAD;por&#xAD;tant cause ar&#xAD;eas, which is al&#xAD;most cer&#xAD;tainly not the case. As such, in this post, I’m go&#xAD;ing to pro&#xAD;pose a cou&#xAD;ple of other cause ar&#xAD;eas by dis&#xAD;cussing what I would think are the world’s most im&#xAD;por&#xAD;tant cause ar&#xAD;eas if I were not already fa&#xAD;mil&#xAD;iar with EA. For each, I’ll in&#xAD;clude why I think it could be a pos&#xAD;si&#xAD;ble cause area as well as some pos&#xAD;si&#xAD;ble in&#xAD;ter&#xAD;ven&#xAD;tions for it.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;Loneli&#xAD;ness in de&#xAD;vel&#xAD;oped nations&lt;ol&gt;&lt;li&gt;&lt;p&gt;Ex&#xAD;pla&#xAD;na&#xAD;tion:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Over the past half cen&#xAD;tury, we’ve seen a dra&#xAD;matic in&#xAD;crease in loneli&#xAD;ness among peo&#xAD;ple in de&#xAD;vel&#xAD;oped na&#xAD;tions. 
So&#xAD;cial con&#xAD;nec&#xAD;tion is a core con&#xAD;trib&#xAD;u&#xAD;tor to health, wellbe&#xAD;ing, and pro&#xAD;duc&#xAD;tivity so we should ex&#xAD;pect that re&#xAD;duc&#xAD;ing loneli&#xAD;ness would sig&#xAD;nifi&#xAD;cantly in&#xAD;crease so&#xAD;ciety’s flour&#xAD;ish&#xAD;ing.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Pos&#xAD;si&#xAD;ble in&#xAD;ter&#xAD;ven&#xAD;tions:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Devel&#xAD;op&#xAD;ing on&#xAD;line plat&#xAD;forms that al&#xAD;low in&#xAD;di&#xAD;vi&#xAD;d&#xAD;u&#xAD;als to host in-per&#xAD;son com&#xAD;mu&#xAD;nity events for free.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Creat&#xAD;ing more spaces where it is nor&#xAD;mal and en&#xAD;couraged to in&#xAD;ter&#xAD;act with peo&#xAD;ple that you do not already know.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Creat&#xAD;ing dat&#xAD;ing and friend&#xAD;ship apps that use ma&#xAD;chine learn&#xAD;ing to cre&#xAD;ate op&#xAD;ti&#xAD;mal pairings be&#xAD;tween in&#xAD;di&#xAD;vi&#xAD;d&#xAD;u&#xAD;als for re&#xAD;la&#xAD;tion&#xAD;ship satis&#xAD;fac&#xAD;tion.&lt;ol&gt;&lt;li&gt;&lt;p&gt;Th&#xAD;ese could pos&#xAD;si&#xAD;bly be de&#xAD;signed such that, af&#xAD;ter sign&#xAD;ing up, the app rarely re&#xAD;quires user in&#xAD;ter&#xAD;ac&#xAD;tion out&#xAD;side of ar&#xAD;rang&#xAD;ing meet&#xAD;ing up with peo&#xAD;ple that you have met on the app.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The mass spread of misinformation&lt;ol&gt;&lt;li&gt;&lt;p&gt;Ex&#xAD;pla&#xAD;na&#xAD;tion:&lt;ol&gt;&lt;li&gt;&lt;p&gt;The mod&#xAD;ern in&#xAD;ter&#xAD;net and its reg&#xAD;u&#xAD;la&#xAD;tions have cre&#xAD;ated an en&#xAD;vi&#xAD;ron&#xAD;ment where billions of peo&#xAD;ple are ex&#xAD;posed to mis&#xAD;in&#xAD;for&#xAD;ma&#xAD;tion on a reg&#xAD;u&#xAD;lar ba&#xAD;sis. 
This is, in my view, cre&#xAD;at&#xAD;ing more po&#xAD;larized so&#xAD;cieties, harm&#xAD;ing democ&#xAD;racy as an in&#xAD;sti&#xAD;tu&#xAD;tion, nega&#xAD;tively im&#xAD;pact&#xAD;ing pub&#xAD;lic health, and re&#xAD;duc&#xAD;ing so&#xAD;cial con&#xAD;nec&#xAD;tion.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Pos&#xAD;si&#xAD;ble in&#xAD;ter&#xAD;ven&#xAD;tions:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Reg&#xAD;u&#xAD;lat&#xAD;ing and break&#xAD;ing up on&#xAD;line plat&#xAD;forms so that con&#xAD;sumers have a greater range of in&#xAD;for&#xAD;ma&#xAD;tion sources to choose from.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Re&#xAD;quiring broad&#xAD;cast&#xAD;ers to de&#xAD;mar&#xAD;cate opinion pieces from re&#xAD;port&#xAD;ing on ra&#xAD;dio and na&#xAD;tional television.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The ex&#xAD;is&#xAD;tence of to&#xAD;tal&#xAD;i&#xAD;tar&#xAD;ian states&lt;ol&gt;&lt;li&gt;&lt;p&gt;Ex&#xAD;pla&#xAD;na&#xAD;tion:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Some states across the world do not en&#xAD;gage in demo&#xAD;cratic prac&#xAD;tices, con&#xAD;trol the in&#xAD;for&#xAD;ma&#xAD;tion their peo&#xAD;ple have ac&#xAD;cess to, and pre&#xAD;vent their peo&#xAD;ple from trav&#xAD;el&#xAD;ing to cer&#xAD;tain places. I be&#xAD;lieve these states rep&#xAD;re&#xAD;sent an ex&#xAD;is&#xAD;ten&#xAD;tial threat to hu&#xAD;man&#xAD;ity be&#xAD;cause they lack self-reg&#xAD;u&#xAD;lat&#xAD;ing mech&#xAD;a&#xAD;nisms to pre&#xAD;vent the state from op&#xAD;press&#xAD;ing its peo&#xAD;ple, and, if these states grow large enough, they could con&#xAD;trol hu&#xAD;man&#xAD;ity’s fate. 
Work&#xAD;ing to end the ex&#xAD;is&#xAD;tence of such states should be a top pri&#xAD;or&#xAD;ity for humanity.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;In&#xAD;ter&#xAD;ven&#xAD;tions:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Pro&#xAD;mot&#xAD;ing more ag&#xAD;gres&#xAD;sive na&#xAD;tional and in&#xAD;ter&#xAD;na&#xAD;tional poli&#xAD;cies for end&#xAD;ing to&#xAD;tal&#xAD;i&#xAD;tar&#xAD;ian regimes.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The lack of reg&#xAD;u&#xAD;la&#xAD;tion on Amer&#xAD;i&#xAD;can tech companies&lt;ol&gt;&lt;li&gt;&lt;p&gt;Ex&#xAD;pla&#xAD;na&#xAD;tion:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Amer&#xAD;i&#xAD;can tech com&#xAD;pa&#xAD;nies have an ex&#xAD;traor&#xAD;di&#xAD;nary will&#xAD;ing&#xAD;ness to sac&#xAD;ri&#xAD;fice gen&#xAD;eral welfare and the sta&#xAD;bil&#xAD;ity of their own na&#xAD;tion in or&#xAD;der to max&#xAD;i&#xAD;mize profit. So&#xAD;cial me&#xAD;dia com&#xAD;pa&#xAD;nies, for in&#xAD;stance, have de&#xAD;vel&#xAD;oped al&#xAD;gorithms that max&#xAD;i&#xAD;mize the time users spend on their plat&#xAD;forms rather than the satis&#xAD;fac&#xAD;tion they get out of them. Similarly, dat&#xAD;ing app com&#xAD;pa&#xAD;nies have de-nor&#xAD;mal&#xAD;ized tra&#xAD;di&#xAD;tional forms of courtship and then put their new form of courtship be&#xAD;hind a pay&#xAD;wall, no&#xAD;tably one that is harm&#xAD;ful to peo&#xAD;ple’s men&#xAD;tal health.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;In&#xAD;ter&#xAD;ven&#xAD;tions:&lt;ol&gt;&lt;li&gt;&lt;p&gt;Trust-bust&#xAD;ing Amer&#xAD;i&#xAD;can tech companies.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Pro&#xAD;mot&#xAD;ing fur&#xAD;ther reg&#xAD;u&#xAD;la&#xAD;tion of Amer&#xAD;i&#xAD;can tech companies.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;</description>
            <author>James Brobin</author>
            <guid>KqZqsanrB4oB5GC3q</guid>
            <pubDate>Thu, 09 Apr 2026 23:21:08 +0000</pubDate>
        </item>
        <item>
            <title>Anyone vibe-coding a directory of “profit for good” companies/​EA-aligned service providers (scraper/​bot)? by david_reinstein</title>
            <link>https://forum.nunosempere.com/posts/dHrCuDf3TsCmwP4j8/anyone-vibe-coding-a-directory-of-profit-for-good-companies</link>
            <description>&lt;p&gt;&lt;i&gt;Up&#xAD;date: &lt;/i&gt;&lt;a href=&quot;https://impact-products-directory.netlify.app/&quot;&gt;&lt;i&gt;here’s a first pass&lt;/i&gt;&lt;/a&gt; (about 90 min&#xAD;utes of hu&#xAD;man brain time, lev&#xAD;er&#xAD;ag&#xAD;ing my ear&#xAD;lier work)&lt;br&gt;&lt;br&gt;Mainly for fun, I vibe-coded &lt;a href=&quot;https://masslibraryofthings.netlify.app/&quot;&gt;this “library of things”&lt;/a&gt; scraper bot/​page cat&#xAD;a&#xAD;logu&#xAD;ing what non-me&#xAD;dia things all the libraries in the US loan out.  &lt;/p&gt;&lt;p&gt;It just oc&#xAD;curred to me that do&#xAD;ing this for  com&#xAD;pa&#xAD;nies that donate a share of prof&#xAD;its to (effec&#xAD;tive) char&#xAD;i&#xAD;ties, or ser&#xAD;vice providers that are GWWC pledgers would be doable and some&#xAD;what im&#xAD;pact&#xAD;ful. En&#xAD;com&#xAD;pass&#xAD;ing &lt;a href=&quot;/users/brad-west?mention=user&quot;&gt;@Brad West🔸&lt;/a&gt; ’s &lt;a href=&quot;https://profit4good.org/&quot; class=&quot;bare-url&quot;&gt;https://​​profit4good.org/​​&lt;/a&gt; , &lt;a href=&quot;https://www.givingwhatwecan.org/about-us/members#company-members-section&quot;&gt;GWWC com&#xAD;pany pledgers&lt;/a&gt; and be&#xAD;yond, with some filters and la&#xAD;bels, and po&#xAD;ten&#xAD;tial hu&#xAD;man ver&#xAD;ifi&#xAD;ca&#xAD;tion. This might also at&#xAD;tract (~non-EA) peo&#xAD;ple who are try&#xAD;ing to avoid buy&#xAD;ing on Ama&#xAD;zon for eth&#xAD;i&#xAD;cal/​poli&#xAD;ti&#xAD;cal rea&#xAD;sons. &lt;br&gt;&lt;br&gt;But be&#xAD;fore I dig in I was won&#xAD;der&#xAD;ing :&lt;/p&gt;&lt;p&gt;1. Is any&#xAD;one already do&#xAD;ing this? &lt;/p&gt;&lt;p&gt;2. Any&#xAD;one want to help? &lt;/p&gt;&lt;p&gt;3. Pit&#xAD;falls I should be aware of? 
&lt;br&gt;&lt;br&gt;Fur&#xAD;ther dis&#xAD;cus&#xAD;sion in &lt;a href=&quot;https://chatgpt.com/share/69d812f3-b7cc-832a-85e6-fb8afcfc540b&quot;&gt;this ChatGPT 5.4 ex&#xAD;change.&lt;/a&gt; (NB I’d usu&#xAD;ally use a pro model for this sort of plan&#xAD;ning, but this is just a first pass)&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;i&gt;Up&#xAD;date: &lt;/i&gt;Liz K re&#xAD;minded me of &lt;a href=&quot;https://ea-services.org/&quot; class=&quot;bare-url&quot;&gt;https://​​ea-ser&#xAD;vices.org/​​&lt;/a&gt; (EASE), de&#xAD;scribed in &lt;a href=&quot;/posts/mb4kzhfRnpQNtF6ut/introducing-ease-a-managed-directory-of-ea-organization&quot;&gt;In&#xAD;tro&#xAD;duc&#xAD;ing EASE, a man&#xAD;aged di&#xAD;rec&#xAD;tory of EA Or&#xAD;ga&#xAD;ni&#xAD;za&#xAD;tion Ser&#xAD;vice Providers&lt;/a&gt;, which I’ll try to fold in as well. &lt;br&gt;&lt;br&gt;Other posts for back&#xAD;link&#xAD;ing:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;/posts/nocZECBmPJGcSS7yx/the-giving-store-100-profits-to-givedirectly&quot;&gt;The Giv&#xAD;ing Store- 100% Profits to GiveDirectly&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;/posts/m8p4ZYqZZvN96w5q8/profit-for-good-an-faq-responsive-to-ea-feedback&quot;&gt;Profit for Good- an FAQ Re&#xAD;spon&#xAD;sive to EA Feedback&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;/posts/BSYvy79mBZoGtgAW7/companies-pledge-to-donate-at-least-10-of-profits-to&quot;&gt;Com&#xAD;pa&#xAD;nies pledge to donate at least 10% of prof&#xAD;its to effec&#xAD;tive char&#xAD;i&#xAD;ties&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</description>
            <author>david_reinstein</author>
            <guid>dHrCuDf3TsCmwP4j8</guid>
            <pubDate>Thu, 09 Apr 2026 21:13:10 +0000</pubDate>
        </item>
        <item>
            <title>Help me launch Obsolete: a book aimed at building a new movement for AI reform by Garrison</title>
            <link>https://forum.nunosempere.com/posts/ekShYQpaScoZLqQyi/help-me-launch-obsolete-a-book-aimed-at-building-a-new</link>
            <description>&lt;p&gt;I wrote a book! It’s called &lt;i&gt;Ob&#xAD;so&#xAD;lete: The AI In&#xAD;dus&#xAD;try’s Trillion-Dol&#xAD;lar Race to Re&#xAD;place You—and How to Stop It&lt;/i&gt;, and it’ll be available in May if you &lt;a href=&quot;https://orbooks.com/catalog/obsolete/?utm_source=eaforum&amp;amp;utm_medium=post&amp;amp;utm_campaign=obsolete-preorder&quot;&gt;pre&#xAD;order&lt;/a&gt; through my pub&#xAD;lisher (OR Books+The &lt;i&gt;Na&#xAD;tion&lt;/i&gt;).&lt;/p&gt;&lt;p&gt;With it, I took on a dual mis&#xAD;sion: get any&#xAD;one up to speed on the state of AI and its risks (es&#xAD;pe&#xAD;cially skep&#xAD;tics and lefties) AND write some&#xAD;thing peo&#xAD;ple work&#xAD;ing in the space will en&#xAD;joy and learn from. An un&#xAD;bi&#xAD;ased panel of early re&#xAD;view&#xAD;ers (i.e. peo&#xAD;ple I know) think it suc&#xAD;ceeds. For in&#xAD;stance, here’s a re&#xAD;view from a skep&#xAD;ti&#xAD;cal friend: &lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;It felt like I was be&#xAD;ing drawn into this profound vi&#xAD;sion I had man&#xAD;aged to keep out of my head for years… I feel this deep, visceral dis&#xAD;trust of AI com&#xAD;pa&#xAD;nies, and chat&#xAD;bots, for rea&#xAD;sons that you were able to lay out in var&#xAD;i&#xAD;ous places. I think I came to this like a very non-SF, non-tech, ‘normie’ who is re&#xAD;act&#xAD;ing with their gut to see&#xAD;ing the pro&#xAD;lifer&#xAD;a&#xAD;tion of ai… But I also had never di&#xAD;rectly faced the story of AI, how the tech worked, and whether my skep&#xAD;ti&#xAD;cism was in fact, ill-founded. This hit that, and I thought it did a great job bal&#xAD;anc&#xAD;ing the lines with&#xAD;out get&#xAD;ting too drawn into the hype.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;If you want to help the book, the most use&#xAD;ful thing you can do right now is &lt;a href=&quot;https://orbooks.com/catalog/obsolete/&quot;&gt;pre&#xAD;order&lt;/a&gt; and en&#xAD;courage oth&#xAD;ers to do the same. 
&lt;a href=&quot;https://orbooks.com/catalog/obsolete/&quot;&gt;Pre&#xAD;order&#xAD;ing&lt;/a&gt; through my pub&#xAD;lisher gets you the book in May (weeks be&#xAD;fore the wide re&#xAD;lease on June 23) and shows there’s an au&#xAD;di&#xAD;ence, which mat&#xAD;ters as we try to get it in front of more peo&#xAD;ple. It also gets you 15% off!&lt;/p&gt;&lt;p&gt;And if you’d like to share it on so&#xAD;cials: &lt;a href=&quot;https://x.com/GarrisonLovely/status/2042317347084779644&quot;&gt;Twit&#xAD;ter&lt;/a&gt;, &lt;a href=&quot;https://bsky.app/profile/garrisonlovely.bsky.social/post/3mj3iozrwcf2g&quot;&gt;Bluesky&lt;/a&gt;, &lt;a href=&quot;https://www.linkedin.com/feed/update/urn:li:share:7448083316911775747/&quot;&gt;LinkedIn&lt;/a&gt;, &lt;a href=&quot;https://www.threads.com/@glovely27/post/DW7ARWNAYSs&quot;&gt;Threads&lt;/a&gt;. If you have a pod&#xAD;cast, newslet&#xAD;ter, pub&#xAD;li&#xAD;ca&#xAD;tion, or other plat&#xAD;form where you might cover the book, email press@&lt;a href=&quot;http://obsoletebook.org&quot; class=&quot;bare-url&quot;&gt;ob&#xAD;so&#xAD;lete&#xAD;book.org&lt;/a&gt; and in&#xAD;clude a bit about your work to re&#xAD;quest a re&#xAD;view copy. &lt;/p&gt;&lt;p&gt;In a nut&#xAD;shell, the book is about the largest pro&#xAD;ject in his&#xAD;tory: the at&#xAD;tempt to ren&#xAD;der you ob&#xAD;so&#xAD;lete. The thing I wres&#xAD;tled with the most was: &lt;i&gt;what the hell are we sup&#xAD;posed to do about it?&lt;/i&gt; When you lay out all the po&#xAD;ten&#xAD;tial hor&#xAD;rors pre&#xAD;sented by the ad&#xAD;vent of ma&#xAD;chines that can fully re&#xAD;place hu&#xAD;man la&#xAD;bor, there is re&#xAD;ally only one an&#xAD;swer that ad&#xAD;e&#xAD;quately avoids them: stop. 
This, how&#xAD;ever, is anath&#xAD;ema to the peo&#xAD;ple that need to be con&#xAD;vinced, so I, like many oth&#xAD;ers, tried to find pro&#xAD;pos&#xAD;als that would help im&#xAD;prove our chances that things go well that wouldn’t get me writ&#xAD;ten off as naive (e.g. “Build a pause but&#xAD;ton”). &lt;/p&gt;&lt;p&gt;But in an effort to be taken se&#xAD;ri&#xAD;ously by Very Se&#xAD;ri&#xAD;ous Peo&#xAD;ple, I wasn’t ac&#xAD;tu&#xAD;ally be&#xAD;ing se&#xAD;ri&#xAD;ous my&#xAD;self. And the peo&#xAD;ple in charge aren’t go&#xAD;ing to do what’s needed un&#xAD;less enough of us de&#xAD;mand it of them. &lt;/p&gt;&lt;p&gt;Of course, think&#xAD;ing about what I re&#xAD;fer to in the book as the “Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject” is a tricky busi&#xAD;ness. The best rea&#xAD;son to be&#xAD;lieve it might suc&#xAD;ceed is also the best rea&#xAD;son to think our ob&#xAD;so&#xAD;les&#xAD;cence is in&#xAD;evitable: the un&#xAD;prece&#xAD;dented scale of the re&#xAD;sources pour&#xAD;ing into it. Could it re&#xAD;ally be stopped? The first per&#xAD;son I needed to con&#xAD;vince was my&#xAD;self. 
Re&#xAD;search&#xAD;ing the book, es&#xAD;pe&#xAD;cially how the Nu&#xAD;clear Freeze move&#xAD;ment got Rea&#xAD;gan to about-face on arms con&#xAD;trol (cov&#xAD;ered in the fi&#xAD;nal chap&#xAD;ter), paired with watch&#xAD;ing con&#xAD;cern about AI ex&#xAD;plode into the pub&#xAD;lic con&#xAD;scious&#xAD;ness, con&#xAD;vinced me that stop&#xAD;ping is am&#xAD;bi&#xAD;tious, but em&#xAD;i&#xAD;nently achiev&#xAD;able.&lt;/p&gt;&lt;p&gt;&lt;i&gt;Ob&#xAD;so&#xAD;lete &lt;/i&gt;draws on hun&#xAD;dreds of con&#xAD;ver&#xAD;sa&#xAD;tions with lead&#xAD;ing AI re&#xAD;searchers, ad&#xAD;vo&#xAD;cates, and in&#xAD;dus&#xAD;try in&#xAD;sid&#xAD;ers (in&#xAD;clud&#xAD;ing Ge&#xAD;offrey Hin&#xAD;ton, Yoshua Ben&#xAD;gio, Daron Ace&#xAD;moğlu, Eliezer Yud&#xAD;kowsky, Shane Legg, He&#xAD;len Toner, Deb&#xAD;o&#xAD;rah Raji, Stu&#xAD;art Rus&#xAD;sell, Ajeya Co&#xAD;tra, Daniel Koko&#xAD;ta&#xAD;jlo, and Dan Hendrycks). &lt;/p&gt;&lt;p&gt;A lot has changed in the last few years, and it was painful to largely sit out The Dis&#xAD;course the past six months as both I, and seem&#xAD;ingly the world, en&#xAD;tered crunch time. But it would have been far more painful to not write the book at all, be&#xAD;cause it’s be&#xAD;come in&#xAD;creas&#xAD;ingly clear that there’s an &lt;i&gt;Ob&#xAD;so&#xAD;lete&lt;/i&gt;-shaped hole in the dis&#xAD;course. &lt;/p&gt;&lt;p&gt;The book cov&#xAD;ers core con&#xAD;cepts from AI safety, but also where they fall short, while propos&#xAD;ing en&#xAD;tirely new ways to think about it. 
Here are some of the ideas I’m most ex&#xAD;cited about:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject&lt;/strong&gt; (more on this in the ex&#xAD;cerpt be&#xAD;low)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI re&#xAD;form&lt;/strong&gt;—The tech&#xAD;nol&#xAD;ogy’s biggest doom&#xAD;say&#xAD;ers and its most fer&#xAD;vent crit&#xAD;ics are locked in an ar&#xAD;gu&#xAD;ment over whether to&#xAD;day’s AI is a wildly use&#xAD;ful tool or a gen&#xAD;er&#xAD;a&#xAD;tional scam. I con&#xAD;tend that deep learn&#xAD;ing is in&#xAD;cred&#xAD;ible, but it’s be&#xAD;ing pointed at the wrong things by the wrong peo&#xAD;ple for the wrong rea&#xAD;sons. What’s needed is a new per&#xAD;spec&#xAD;tive on the tech&#xAD;nol&#xAD;ogy, what I call AI re&#xAD;form, that breaks from AI safety’s nar&#xAD;row fo&#xAD;cus on tech&#xAD;ni&#xAD;cal solu&#xAD;tions and the skep&#xAD;tics who re&#xAD;flex&#xAD;ively dis&#xAD;miss the tech&#xAD;nol&#xAD;ogy’s po&#xAD;ten&#xAD;tial.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Align&#xAD;ment Poly&#xAD;crisis&lt;/strong&gt;—The clas&#xAD;sic al&#xAD;ign&#xAD;ment prob&#xAD;lem just cov&#xAD;ers tech&#xAD;ni&#xAD;cal al&#xAD;ign&#xAD;ment—get&#xAD;ting ma&#xAD;chines to do what you want—and (some&#xAD;times) nor&#xAD;ma&#xAD;tive al&#xAD;ign&#xAD;ment—want&#xAD;ing the right things. But this fram&#xAD;ing misses huge parts of the challenge. To widen our aper&#xAD;ture, I in&#xAD;tro&#xAD;duce the Align&#xAD;ment Poly&#xAD;crisis. Tech&#xAD;ni&#xAD;cal and nor&#xAD;ma&#xAD;tive al&#xAD;ign&#xAD;ment are just the first two lay&#xAD;ers of the Poly&#xAD;crisis. The third layer is eco&#xAD;nomic: we should ex&#xAD;pect un&#xAD;reg&#xAD;u&#xAD;lated com&#xAD;peti&#xAD;tors to do what&#xAD;ever amount of al&#xAD;ign&#xAD;ment helps them com&#xAD;pete best, as they sprint to ren&#xAD;der hu&#xAD;man&#xAD;ity ob&#xAD;so&#xAD;lete. 
The fourth is geopoli&#xAD;ti&#xAD;cal: the specter of for&#xAD;eign ad&#xAD;ver&#xAD;saries, of&#xAD;ten cyn&#xAD;i&#xAD;cally in&#xAD;voked, drives a self-fulfilling race to the bot&#xAD;tom. Fi&#xAD;nally, a “solu&#xAD;tion” to the al&#xAD;ign&#xAD;ment prob&#xAD;lem is nei&#xAD;ther nec&#xAD;es&#xAD;sary nor suffi&#xAD;cient for solv&#xAD;ing the prob&#xAD;lems posed by the Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Class of 2034&lt;/strong&gt; - To&#xAD;day’s AI rev&#xAD;olu&#xAD;tion has been pow&#xAD;ered by deep learn&#xAD;ing, which re&#xAD;ally started work&#xAD;ing in 2012. If the deep learn&#xAD;ing era were a per&#xAD;son, it would only be four&#xAD;teen. At six, it could &lt;a href=&quot;https://progress.openai.com/&quot;&gt;barely&lt;/a&gt; string a sen&#xAD;tence to&#xAD;gether. By twelve, it was writ&#xAD;ing pass&#xAD;able code. Now it’s &lt;a href=&quot;https://cs.stanford.edu/~knuth/papers/claude-cycles.pdf&quot;&gt;solv&#xAD;ing&lt;/a&gt; prob&#xAD;lems that stumped the “&lt;a href=&quot;https://www.nytimes.com/2018/12/17/science/donald-knuth-computers-algorithms-programming.html#:~:text=He%20is%20the%20author%20of,can%20read%20the%20whole%20thing.%E2%80%9D&quot;&gt;mas&#xAD;ter of al&#xAD;gorithms&lt;/a&gt;” and is &lt;a href=&quot;https://x.com/rakyll/status/2007239758158975130&quot;&gt;do&#xAD;ing&lt;/a&gt; year-long en&#xAD;g&#xAD;ineer&#xAD;ing pro&#xAD;jects in an hour. In 2034, when kids born in 2012 will be grad&#xAD;u&#xAD;at&#xAD;ing col&#xAD;lege, what jobs will be left for them to do?&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;My wildest dream is that &lt;i&gt;Ob&#xAD;so&#xAD;lete&lt;/i&gt; can do for AI what &lt;i&gt;Silent Spring&lt;/i&gt; did for en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal&#xAD;ism or &lt;i&gt;Un&#xAD;safe at Any Speed&lt;/i&gt; did for con&#xAD;sumer safety. 
The book is writ&#xAD;ten, the timing is perfect, and the big ques&#xAD;tion mark is whether it makes a big enough splash. I’m do&#xAD;ing ev&#xAD;ery&#xAD;thing I can to make that hap&#xAD;pen, but could use all the help I can get.&lt;/p&gt;&lt;h3&gt;Ways to help&lt;/h3&gt;&lt;p&gt;&lt;a href=&quot;https://orbooks.com/catalog/obsolete/?utm_source=eaforum&amp;amp;utm_medium=post&amp;amp;utm_campaign=obsolete-preorder&quot;&gt;&lt;strong&gt;Pre&#xAD;order it&lt;/strong&gt;&lt;/a&gt;!&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Share it. &lt;/strong&gt;Word-of-mouth from peo&#xAD;ple who already get it is the launch’s high&#xAD;est-lev&#xAD;er&#xAD;age chan&#xAD;nel.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Leave a Goodreads re&#xAD;view once you’ve read it. &lt;/strong&gt;Ama&#xAD;zon doesn’t al&#xAD;low re&#xAD;views be&#xAD;fore re&#xAD;lease, so save the Ama&#xAD;zon re&#xAD;view for June 23 or af&#xAD;ter. Please leave one then too, even if you got the book through OR di&#xAD;rect (it’ll be marked “un&#xAD;ver&#xAD;ified” but still counts). Both mat&#xAD;ter: Goodreads helps with early word-of-mouth, and Ama&#xAD;zon re&#xAD;views mat&#xAD;ter most in the first weeks af&#xAD;ter wide re&#xAD;lease.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Crit&#xAD;i&#xAD;cize the book in pub&#xAD;lic. &lt;/strong&gt;You’ll quickly find that &lt;i&gt;Ob&#xAD;so&#xAD;lete&lt;/i&gt; is not or&#xAD;tho&#xAD;dox AI safety trans&#xAD;lated for a lay au&#xAD;di&#xAD;ence. One of the most ex&#xAD;cit&#xAD;ing things about get&#xAD;ting a lot of at&#xAD;ten&#xAD;tion on the book is get&#xAD;ting a chance to red team these ideas and im&#xAD;prove upon them. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Bulk or&#xAD;ders for groups.&lt;/strong&gt; If you or&#xAD;ga&#xAD;nize a read&#xAD;ing group, uni&#xAD;ver&#xAD;sity group, or work&#xAD;place book club, or want to gift copies, my pub&#xAD;lisher OR Books offers tiered dis&#xAD;counts (20% off 5-49 copies, 30% off 50-99, 40% off 100+). 
Email bulk@&lt;a href=&quot;http://obsoletebook.org&quot; class=&quot;bare-url&quot;&gt;ob&#xAD;so&#xAD;lete&#xAD;book.org&lt;/a&gt;. Note: bulk or&#xAD;ders go great for dis&#xAD;tri&#xAD;bu&#xAD;tion but gen&#xAD;er&#xAD;ally don’t count to&#xAD;ward best&#xAD;sel&#xAD;ler rank&#xAD;ings, so the high&#xAD;est-lev&#xAD;er&#xAD;age thing for the launch is still in&#xAD;di&#xAD;vi&#xAD;d&#xAD;ual pre&#xAD;orders.&lt;/p&gt;&lt;p&gt;If you want to do more—host a read&#xAD;ing group, help with cov&#xAD;er&#xAD;age, or&#xAD;ga&#xAD;nize an event, write a re&#xAD;view—there’s a form on the book site that lets me triage offers and fol&#xAD;low up prop&#xAD;erly: &lt;a href=&quot;https://www.obsoletebook.org/#help&quot; class=&quot;bare-url&quot;&gt;https://​​www.ob&#xAD;so&#xAD;lete&#xAD;book.org/​​#help&lt;/a&gt;&lt;/p&gt;&lt;p&gt;If you have a spe&#xAD;cific high-lev&#xAD;er&#xAD;age in&#xAD;tro (a ma&#xAD;jor pod&#xAD;cast host, a mem&#xAD;ber of Congress, a move&#xAD;ment or&#xAD;ga&#xAD;nizer do&#xAD;ing se&#xAD;ri&#xAD;ous work), email me here: press@&lt;a href=&quot;http://obsoletebook.org&quot; class=&quot;bare-url&quot;&gt;ob&#xAD;so&#xAD;lete&#xAD;book.org&lt;/a&gt;&lt;/p&gt;&lt;h3&gt;Table of contents&lt;/h3&gt;&lt;p&gt;&lt;i&gt;Part 1 - What You’ve Heard&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Prologue—The Ob&#xAD;so&#xAD;let&#xAD;ing Project&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 1 - What is Ar&#xAD;tifi&#xAD;cial In&#xAD;tel&#xAD;li&#xAD;gence?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 2 - Is AI a Bub&#xAD;ble?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 3 - Will the Robots Take Our Jobs?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 4 - “Just Un&#xAD;plug It”&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 5 - “In&#xAD;creas&#xAD;ingly Smart So&#xAD;ciopaths”&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 6 - Why Build the Doom Ma&#xAD;chine?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 7 - Is AI 
Im&#xAD;pos&#xAD;si&#xAD;ble?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 8 - The “Doomers” Feel Misunderstood&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 9 - Is AI Inevitable?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;In&#xAD;ter&#xAD;lude—Chap&#xAD;ter 10 - Mov&#xAD;ing Fast and Break&#xAD;ing Minds&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Part 2 - How It Could Go&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 11 - The Wrong Kind of AI&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 12 - The Prob&#xAD;lems with the Align&#xAD;ment Problem&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 13 - How Fast Could It Hap&#xAD;pen?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 14 - The Best-Case Scenario&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 15 - How We Could Lose&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Part 3 - What to Do&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 16 - Stop&#xAD;ping the Ob&#xAD;so&#xAD;let&#xAD;ing Project&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 17 - Ac&#xAD;tu&#xAD;ally Demo&#xAD;cratic AI&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;Chap&#xAD;ter 18 - What Should the Rest of Us Do About It?&lt;/i&gt;&lt;/p&gt;&lt;h3&gt;Ex&#xAD;cerpt—The Ob&#xAD;so&#xAD;let&#xAD;ing Project&lt;/h3&gt;&lt;p&gt;A tiny group of peo&#xAD;ple, backed by the best-re&#xAD;sourced or&#xAD;ga&#xAD;ni&#xAD;za&#xAD;tions in the world, is at&#xAD;tempt&#xAD;ing to &lt;i&gt;ren&#xAD;der hu&#xAD;man&#xAD;ity ob&#xAD;so&#xAD;lete&lt;/i&gt;. 
Not ev&#xAD;ery par&#xAD;ti&#xAD;ci&#xAD;pant in the pro&#xAD;ject even be&#xAD;lieves this is pos&#xAD;si&#xAD;ble, but the effort’s van&#xAD;guard has defined it&#xAD;self through its faith in the trans&#xAD;for&#xAD;ma&#xAD;tive po&#xAD;ten&#xAD;tial of ma&#xAD;chine in&#xAD;tel&#xAD;li&#xAD;gence and its will&#xAD;ing&#xAD;ness to forge ahead de&#xAD;spite the grave risks it so rou&#xAD;tinely ac&#xAD;knowl&#xAD;edges.&lt;/p&gt;&lt;p&gt;The in&#xAD;dus&#xAD;try’s ul&#xAD;ti&#xAD;mate goal and the most con&#xAD;tro&#xAD;ver&#xAD;sial term in this space—ar&#xAD;tifi&#xAD;cial gen&#xAD;eral in&#xAD;tel&#xAD;li&#xAD;gence—con&#xAD;jures up images of a new type of mind that might think like our own. But it would not. And “AGI” keeps the fo&#xAD;cus on what it might &lt;i&gt;be&lt;/i&gt;, when the at&#xAD;ten&#xAD;tion should re&#xAD;ally be on what it might &lt;i&gt;do&lt;/i&gt;. It would be bet&#xAD;ter to un&#xAD;der&#xAD;stand what com&#xAD;pa&#xAD;nies like OpenAI, An&#xAD;thropic, and Google are try&#xAD;ing to build not as a new type of brain, but rather a new type of ma&#xAD;chine, one that doesn’t make prod&#xAD;ucts or ser&#xAD;vices, but pro&#xAD;duces &lt;i&gt;la&#xAD;bor it&#xAD;self&lt;/i&gt;.&lt;/p&gt;&lt;p&gt;Why? Hu&#xAD;mans—and hu&#xAD;man-de&#xAD;rived la&#xAD;bor—are messy. Hu&#xAD;man&#xAD;ity has always been con&#xAD;strained be&#xAD;cause hu&#xAD;mans are the con&#xAD;straint. It takes a long time to make more of us, to ed&#xAD;u&#xAD;cate us, to feed and shelter us. We can only do so much. For cen&#xAD;turies, au&#xAD;toma&#xAD;tion has al&#xAD;lowed own&#xAD;ers to sub&#xAD;sti&#xAD;tute some cap&#xAD;i&#xAD;tal for la&#xAD;bor—the trac&#xAD;tor for the scythe, the loom for the nee&#xAD;dle—but only to a point.&lt;/p&gt;&lt;p&gt;Now, ma&#xAD;chine in&#xAD;tel&#xAD;li&#xAD;gence promises the po&#xAD;ten&#xAD;tial to cut out that last con&#xAD;straint: us. 
This obliter&#xAD;ates the one fun&#xAD;da&#xAD;men&#xAD;tal limit and opens up new wor&#xAD;lds of pos&#xAD;si&#xAD;bil&#xAD;ity. Many of them, as we will see, are ter&#xAD;rify&#xAD;ing.&lt;/p&gt;&lt;p&gt;The lead&#xAD;ers of what, in this book, I’ll call the &lt;i&gt;Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject&lt;/i&gt; may con&#xAD;test this fram&#xAD;ing, offer&#xAD;ing some bro&#xAD;mides about how AI will aug&#xAD;ment hu&#xAD;man work or how we’ll sim&#xAD;ply find new jobs to do. But their stated goals are to cre&#xAD;ate sys&#xAD;tems that so clearly sur&#xAD;pass our abil&#xAD;ities that we will have lit&#xAD;tle hope of com&#xAD;pet&#xAD;ing. OpenAI makes this ex&#xAD;plicit, defin&#xAD;ing AGI as “a highly au&#xAD;tonomous sys&#xAD;tem that out&#xAD;performs hu&#xAD;mans at most eco&#xAD;nom&#xAD;i&#xAD;cally valuable work.”&lt;/p&gt;&lt;p&gt;Many are skep&#xAD;ti&#xAD;cal that the Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject will suc&#xAD;ceed. Per&#xAD;haps you are too, and I don’t blame you. Often, seem&#xAD;ingly equally cre&#xAD;den&#xAD;tialed peo&#xAD;ple are mak&#xAD;ing op&#xAD;po&#xAD;site claims about the pre&#xAD;sent and fu&#xAD;ture of the tech&#xAD;nol&#xAD;ogy. But the Pro&#xAD;ject’s scale has no real prece&#xAD;dent. What we think of as big—Gilded Age mo&#xAD;nop&#xAD;o&#xAD;lies, the Man&#xAD;hat&#xAD;tan Pro&#xAD;ject, the Apollo pro&#xAD;gram—doesn’t even come close to cap&#xAD;tur&#xAD;ing its size. It’s worth look&#xAD;ing care&#xAD;fully at whether it has a chance of suc&#xAD;ceed&#xAD;ing.&lt;/p&gt;&lt;p&gt;As with all new tech&#xAD;nolo&#xAD;gies, the Ob&#xAD;so&#xAD;let&#xAD;ing Ma&#xAD;chine, if it works as in&#xAD;tended, will cre&#xAD;ate win&#xAD;ners and losers. 
If you’d like to dou&#xAD;ble down on to&#xAD;day’s biggest win&#xAD;ners—the ones de&#xAD;vel&#xAD;op&#xAD;ing it, the cap&#xAD;i&#xAD;tal-own&#xAD;ers, and the bosses, along with the na&#xAD;tions and for&#xAD;tunes large enough to buy a seat at the table—then &lt;i&gt;bring on the ac&#xAD;cel&#xAD;er&#xAD;a&#xAD;tion&lt;/i&gt;. The Ob&#xAD;so&#xAD;let&#xAD;ing Pro&#xAD;ject is, es&#xAD;sen&#xAD;tially, an in&#xAD;equal&#xAD;ity en&#xAD;g&#xAD;ine, as&#xAD;pira&#xAD;tionally turn&#xAD;ing un&#xAD;fath&#xAD;omable amounts of cap&#xAD;i&#xAD;tal into other peo&#xAD;ple’s un&#xAD;em&#xAD;ploy&#xAD;ment, all to max&#xAD;i&#xAD;mize value for share&#xAD;hold&#xAD;ers. The crown&#xAD;ing irony: the Pro&#xAD;ject was built on the backs of hu&#xAD;man&#xAD;ity’s col&#xAD;lec&#xAD;tive la&#xAD;bor, ap&#xAD;pro&#xAD;pri&#xAD;ated whole&#xAD;sale—our shared in&#xAD;her&#xAD;i&#xAD;tance plun&#xAD;dered to cre&#xAD;ate pri&#xAD;vate for&#xAD;tunes.&lt;/p&gt;&lt;p&gt;Un&#xAD;der&#xAD;stand&#xAD;ing these dy&#xAD;nam&#xAD;ics—who wins, who loses, and why—is es&#xAD;sen&#xAD;tial to grasp&#xAD;ing what’s ac&#xAD;tu&#xAD;ally hap&#xAD;pen&#xAD;ing right now, and what might hap&#xAD;pen next. Other books have been writ&#xAD;ten about ma&#xAD;chines that might one day think. This is the first book about ma&#xAD;chines that might one day pro&#xAD;duce la&#xAD;bor.&lt;/p&gt;</description>
            <author>Garrison</author>
            <guid>ekShYQpaScoZLqQyi</guid>
            <pubDate>Thu, 09 Apr 2026 19:17:01 +0000</pubDate>
        </item>
        <item>
            <title>China’s Major Meat Supplier Just Set a Cage-Free Policy—Why I’m More Optimistic Than Ever About China  by Whitney Peng</title>
            <link>https://forum.nunosempere.com/posts/7C2wrjXAQJdjQ4bs5/china-s-major-meat-supplier-just-set-a-cage-free-policy-why</link>
            <description>&lt;p&gt;&lt;i&gt;This post is an up&#xAD;date on some great and per&#xAD;haps sur&#xAD;pris&#xAD;ing progress in China that feels worth shar&#xAD;ing, and a sincere thank-you to the peo&#xAD;ple I have got&#xAD;ten to know through EAG SF (re&#xAD;cap at the end).&lt;/i&gt;&lt;/p&gt;&lt;h2&gt;Why China, and Why Now&lt;/h2&gt;&lt;p&gt;Ap&#xAD;prox&#xAD;i&#xAD;mately 75% of the world’s farmed an&#xAD;i&#xAD;mals are raised in Asia, yet the re&#xAD;gion re&#xAD;ceives a dis&#xAD;pro&#xAD;por&#xAD;tionately small frac&#xAD;tion of global an&#xAD;i&#xAD;mal welfare fund&#xAD;ing. Within that, China rep&#xAD;re&#xAD;sents an enor&#xAD;mous con&#xAD;cen&#xAD;tra&#xAD;tion of both suffer&#xAD;ing and—in&#xAD;creas&#xAD;ingly—tractable op&#xAD;por&#xAD;tu&#xAD;nity.&lt;/p&gt;&lt;p&gt;Se&#xAD;cur&#xAD;ing cor&#xAD;po&#xAD;rate com&#xAD;mit&#xAD;ments in China is gen&#xAD;uinely challeng&#xAD;ing. The reg&#xAD;u&#xAD;la&#xAD;tory en&#xAD;vi&#xAD;ron&#xAD;ment, busi&#xAD;ness cul&#xAD;ture, and the sheer scale of the sup&#xAD;ply chains in&#xAD;volved mean that this work re&#xAD;quires pa&#xAD;tience, re&#xAD;la&#xAD;tion&#xAD;ship-build&#xAD;ing, and care&#xAD;ful en&#xAD;gage&#xAD;ment. Lever China has been do&#xAD;ing that work for nearly seven years. 
The ongoing progress has not been pub&#xAD;li&#xAD;cized much out&#xAD;side of China, and that’s partly by de&#xAD;sign, since the work of&#xAD;ten re&#xAD;quires dis&#xAD;cre&#xAD;tion to be effec&#xAD;tive.&lt;/p&gt;&lt;p&gt;But as both the pace of progress and the size of the op&#xAD;por&#xAD;tu&#xAD;nity have grown, we (and I) wanted to share more, as I think they col&#xAD;lec&#xAD;tively tell an im&#xAD;por&#xAD;tant story about the road ahead.&lt;/p&gt;&lt;hr&gt;&lt;h2&gt;A Notable Mile&#xAD;stone: Yu&#xAD;run Group’s Cage-Free Commitment&lt;/h2&gt;&lt;p&gt;A re&#xAD;cent com&#xAD;mit&#xAD;ment secured by Lever China, &lt;a href=&quot;https://www.foodingredientsfirst.com/news/yurun-cage-free-supply-chain.html&quot;&gt;an&#xAD;nounced last month&lt;/a&gt;, illus&#xAD;trates both the tractabil&#xAD;ity and scale of what’s pos&#xAD;si&#xAD;ble: one of China’s largest meat sup&#xAD;pli&#xAD;ers com&#xAD;mit&#xAD;ted to sourc&#xAD;ing 100% cage-free eggs and chicken across all of its global fac&#xAD;to&#xAD;ries.&lt;/p&gt;&lt;p&gt;The com&#xAD;pany, Yu&#xAD;run Hold&#xAD;ings Group, has two pub&#xAD;li&#xAD;cly listed sub&#xAD;sidi&#xAD;aries and op&#xAD;er&#xAD;a&#xAD;tions span&#xAD;ning seven ma&#xAD;jor in&#xAD;dus&#xAD;tries. This isn’t a small food ser&#xAD;vice op&#xAD;er&#xAD;a&#xAD;tor mak&#xAD;ing a niche com&#xAD;mit&#xAD;ment; it’s a mas&#xAD;sive fresh, frozen, and pro&#xAD;cessed meat sup&#xAD;plier for both do&#xAD;mes&#xAD;tic sales and ex&#xAD;ports, and its pro&#xAD;cure&#xAD;ment de&#xAD;ci&#xAD;sions re&#xAD;ver&#xAD;ber&#xAD;ate across the coun&#xAD;try’s sup&#xAD;ply chain. When a com&#xAD;pany of this scale changes its sourc&#xAD;ing stan&#xAD;dards, sup&#xAD;pli&#xAD;ers have to adapt. The caging of meat chick&#xAD;ens is an im&#xAD;por&#xAD;tant welfare topic that remains largely in&#xAD;visi&#xAD;ble in global dis&#xAD;course. 
Again, &lt;u&gt;my ear&#xAD;lier post&lt;/u&gt; goes into more depth on this, but the short ver&#xAD;sion is:  broiler chick&#xAD;ens in China are fre&#xAD;quently kept in multi-tier cage sys&#xAD;tems that are of&#xAD;ten roughly the same size as bat&#xAD;tery cage sys&#xAD;tems for egg-lay&#xAD;ing hens.&lt;/p&gt;&lt;p&gt;Yu&#xAD;run’s com&#xAD;mit&#xAD;ment is part of a broader mo&#xAD;men&#xAD;tum that has been build&#xAD;ing over sev&#xAD;eral years. Lever China has se&#xAD;cured dozens of cage-free chicken and egg com&#xAD;mit&#xAD;ments from do&#xAD;mes&#xAD;tic Chi&#xAD;nese restau&#xAD;rant chains, re&#xAD;tailers, and food man&#xAD;u&#xAD;fac&#xAD;tur&#xAD;ers since it be&#xAD;gan this work, and cor&#xAD;po&#xAD;rate re&#xAD;cep&#xAD;tivity has been grow&#xAD;ing. Yu&#xAD;run is the largest name yet on a long and grow&#xAD;ing list of com&#xAD;mit&#xAD;ted com&#xAD;pa&#xAD;nies that Lever China’s team can con&#xAD;fi&#xAD;dently point to when en&#xAD;gag&#xAD;ing with new com&#xAD;pa&#xAD;nies on these is&#xAD;sues. &lt;/p&gt;&lt;hr&gt;&lt;h2&gt;Emerg&#xAD;ing Progress on Ducks (An Often Over&#xAD;looked Is&#xAD;sue)&lt;/h2&gt;&lt;p&gt;In par&#xAD;allel, we’re see&#xAD;ing early progress on ducks, an area that is also over&#xAD;looked and in which China is by far the world’s largest pro&#xAD;ducer and con&#xAD;sumer.&lt;/p&gt;&lt;p&gt;There’s an in&#xAD;ter&#xAD;est&#xAD;ing cul&#xAD;tural dy&#xAD;namic at play here. Duck holds a spe&#xAD;cial place in Chi&#xAD;nese cui&#xAD;sine—dishes like Pek&#xAD;ing roast duck carry cen&#xAD;turies of culi&#xAD;nary pres&#xAD;tige and are as&#xAD;so&#xAD;ci&#xAD;ated with qual&#xAD;ity and oc&#xAD;ca&#xAD;sion. Yet the liv&#xAD;ing con&#xAD;di&#xAD;tions of ducks are al&#xAD;most never dis&#xAD;cussed, and most peo&#xAD;ple (in&#xAD;clud&#xAD;ing many in the food in&#xAD;dus&#xAD;try) sim&#xAD;ply as&#xAD;sume that free-range poul&#xAD;try is the norm for pre&#xAD;mium dishes. 
The re&#xAD;al&#xAD;ity is that roughly half of ducks raised for meat and the large ma&#xAD;jor&#xAD;ity of egg-lay&#xAD;ing ducks are con&#xAD;fined in cages, with a to&#xAD;tal of around 2 billion ducks per year in cage con&#xAD;fine&#xAD;ment. But that as&#xAD;sump&#xAD;tion cre&#xAD;ates an open&#xAD;ing: when Lever China’s team raises the is&#xAD;sue, com&#xAD;pa&#xAD;nies are of&#xAD;ten gen&#xAD;uinely in&#xAD;ter&#xAD;ested to learn more.&lt;/p&gt;&lt;p&gt;Two policy pledges that Lever China se&#xAD;cured in the past cou&#xAD;ple of months illus&#xAD;trate this:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;Xiao Diao Li Tang&lt;/strong&gt;, a well-known tra&#xAD;di&#xAD;tional Beijing-style restau&#xAD;rant chain with around 60 out&#xAD;lets na&#xAD;tion&#xAD;wide, be&#xAD;came the first do&#xAD;mes&#xAD;tic restau&#xAD;rant chain in China to com&#xAD;mit to a full poul&#xAD;try welfare policy, cov&#xAD;er&#xAD;ing cage-free chicken meat, duck meat, chicken eggs, and duck eggs, all by 2030. This ex&#xAD;tends the usual chicken-and-egg scope to in&#xAD;clude duck in both meat and egg form, cov&#xAD;er&#xAD;ing the two key poul&#xAD;try in&#xAD;gre&#xAD;di&#xAD;ents cen&#xAD;tral to their menu.&lt;/p&gt;&lt;p&gt;Part of what makes this com&#xAD;mit&#xAD;ment in&#xAD;ter&#xAD;est&#xAD;ing is &lt;i&gt;why&lt;/i&gt; it hap&#xAD;pened. The owner was per&#xAD;son&#xAD;ally moved by the is&#xAD;sue: he hadn’t been aware of the scale of caged con&#xAD;fine&#xAD;ment for poul&#xAD;try un&#xAD;til the Lever team reached out. Once he un&#xAD;der&#xAD;stood the scale of im&#xAD;pact, he was mo&#xAD;ti&#xAD;vated to act. His view was sim&#xAD;ply that a com&#xAD;pany like his should do the right thing. 
That per&#xAD;sonal mo&#xAD;ti&#xAD;va&#xAD;tion, com&#xAD;bined with the fact that the pro&#xAD;cure&#xAD;ment team had already been sourc&#xAD;ing cage-free duck meat for one of its sig&#xAD;na&#xAD;ture dishes, made the for&#xAD;mal com&#xAD;mit&#xAD;ment a nat&#xAD;u&#xAD;ral step. &lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;Xuri Egg Prod&#xAD;ucts&lt;/strong&gt; is one of the largest duck egg pro&#xAD;duc&#xAD;ers and the largest duck egg ex&#xAD;porter in China. It ex&#xAD;ports a very large vol&#xAD;ume to Hong Kong and Southeast Asia, as well as to Europe and the US (for example, its eggs are sold at Costco in the U.S.). In the more distant past they used only cage-free meth&#xAD;ods, but over the last 10-15 years, like the rest of the sec&#xAD;tor, they have been mov&#xAD;ing heav&#xAD;ily to&#xAD;ward caged pro&#xAD;duc&#xAD;tion and are roughly 50/50 to&#xAD;day. They com&#xAD;mit&#xAD;ted to make all of the duck eggs they ex&#xAD;port (which is about half of their to&#xAD;tal duck egg pro&#xAD;duc&#xAD;tion and sales) cage-free by the end of next year. This is a defen&#xAD;sive win: it likely doesn’t move ducks out of cages in the short term, but it pre&#xAD;vents the com&#xAD;pany from con&#xAD;vert&#xAD;ing an in&#xAD;creas&#xAD;ing share of their ex&#xAD;ist&#xAD;ing pro&#xAD;duc&#xAD;tion over to caged sys&#xAD;tems, which is what would cer&#xAD;tainly have con&#xAD;tinued hap&#xAD;pen&#xAD;ing in the com&#xAD;ing years (since that is the path they’ve been on for the past decade). It also helps en&#xAD;sure that nearly all fu&#xAD;ture ex&#xAD;pan&#xAD;sion barn builds will be cage-free. Their pledge will also make it eas&#xAD;ier to se&#xAD;cure cage-free duck egg pledges from oth&#xAD;ers. 
The com&#xAD;pany has about 1.7m egg-lay&#xAD;ing hens, and we es&#xAD;ti&#xAD;mate this new policy will pre&#xAD;vent 200,000 − 500,000 of them (an&#xAD;nu&#xAD;ally) from be&#xAD;ing grad&#xAD;u&#xAD;ally shifted into cage con&#xAD;fine&#xAD;ment sys&#xAD;tems in the com&#xAD;ing years.  (Com&#xAD;mit&#xAD;ted in Fe&#xAD;bru&#xAD;ary—&lt;a href=&quot;https://www.foodingredientsfirst.com/news/duck-welfare-china-cage-free-farming.html&quot;&gt;read more in English here&lt;/a&gt;; &lt;a href=&quot;https://caijing.chinadaily.com.cn/a/202602/24/WS699d0733a310942cc49a0384.html&quot;&gt;story in Chi&#xAD;nese here&lt;/a&gt;.)&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;hr&gt;&lt;h2&gt;What This Adds Up To&lt;/h2&gt;&lt;p&gt;The com&#xAD;mit&#xAD;ments de&#xAD;scribed above aren’t iso&#xAD;lated wins. They’re part of a pat&#xAD;tern that has been build&#xAD;ing across China and the broader re&#xAD;gion for years, and one that points to&#xAD;ward some&#xAD;thing big&#xAD;ger.&lt;/p&gt;&lt;p&gt;Many of the fea&#xAD;tures that are of&#xAD;ten as&#xAD;sumed to make progress difficult in China—scale, sup&#xAD;ply chain com&#xAD;plex&#xAD;ity, and rapidly evolv&#xAD;ing mar&#xAD;kets—can also en&#xAD;able change to hap&#xAD;pen quickly once key ac&#xAD;tors move. When ma&#xAD;jor buy&#xAD;ers shift their sourc&#xAD;ing stan&#xAD;dards, sup&#xAD;pli&#xAD;ers re&#xAD;spond. When com&#xAD;pa&#xAD;nies see peers tak&#xAD;ing ac&#xAD;tion, ex&#xAD;pec&#xAD;ta&#xAD;tions be&#xAD;gin to change across the sec&#xAD;tor. And when solu&#xAD;tions are prac&#xAD;ti&#xAD;cal and al&#xAD;igned with busi&#xAD;ness pri&#xAD;ori&#xAD;ties, they can gain trac&#xAD;tion in ways that are difficult to repli&#xAD;cate el&#xAD;se&#xAD;where.&lt;/p&gt;&lt;p&gt;China has the largest con&#xAD;cen&#xAD;tra&#xAD;tion of farmed an&#xAD;i&#xAD;mals in the world. 
That re&#xAD;al&#xAD;ity car&#xAD;ries im&#xAD;mense weight, but it also means that even in&#xAD;cre&#xAD;men&#xAD;tal im&#xAD;prove&#xAD;ments, when im&#xAD;ple&#xAD;mented at scale, can trans&#xAD;late into very large re&#xAD;duc&#xAD;tions in an&#xAD;i&#xAD;mal suffer&#xAD;ing. The work of the Lever China team over the past sev&#xAD;eral years sug&#xAD;gests that these im&#xAD;prove&#xAD;ments are not only nec&#xAD;es&#xAD;sary but also achiev&#xAD;able.&lt;/p&gt;&lt;p&gt;More broadly, this points to a sig&#xAD;nifi&#xAD;cant op&#xAD;por&#xAD;tu&#xAD;nity for the an&#xAD;i&#xAD;mal ad&#xAD;vo&#xAD;cacy com&#xAD;mu&#xAD;nity. Re&#xAD;gions with large and rapidly de&#xAD;vel&#xAD;op&#xAD;ing food sys&#xAD;tems are of&#xAD;ten where the ma&#xAD;jor&#xAD;ity of an&#xAD;i&#xAD;mals are raised, and where well-ex&#xAD;e&#xAD;cuted, lo&#xAD;cally grounded strate&#xAD;gies can have an out&#xAD;sized im&#xAD;pact. Progress in these con&#xAD;texts may look differ&#xAD;ent from work in Western mar&#xAD;kets, but it can be just as im&#xAD;pact&#xAD;ful, if not more so.&lt;/p&gt;&lt;p&gt;The progress to date is an early sig&#xAD;nal of what sus&#xAD;tained, con&#xAD;text-spe&#xAD;cific en&#xAD;gage&#xAD;ment can achieve at scale, and I’m more op&#xAD;ti&#xAD;mistic than ever about Lever’s work in China and be&#xAD;yond. &lt;/p&gt;&lt;p&gt;&lt;i&gt;As always, if you’re in&#xAD;ter&#xAD;ested in learn&#xAD;ing more or ex&#xAD;plor&#xAD;ing sup&#xAD;port, feel free to reach out to me or Lily Tse, Pro&#xAD;gram Direc&#xAD;tor, Lever Foun&#xAD;da&#xAD;tion: &lt;/i&gt;&lt;a href=&quot;mailto:lilyt@leverfoundation.org&quot;&gt;&lt;i&gt;lilyt@leverfoundation.org&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;hr&gt;&lt;h2&gt;Reflec&#xAD;tion from 2026 EAG SF &lt;/h2&gt;&lt;p&gt;I was fortunate to at&#xAD;tend the 2026 EAG SF last month (my first EAG!). 
Among many rea&#xAD;sons why I wanted to at&#xAD;tend, my pri&#xAD;mary goal was to tell oth&#xAD;ers about the Lever Foun&#xAD;da&#xAD;tion, the work we do in Asia, and the scale of what’s at stake there.&lt;/p&gt;&lt;p&gt;Hon&#xAD;estly, I went in ex&#xAD;pect&#xAD;ing skep&#xAD;ti&#xAD;cism. Much of our work in&#xAD;volves se&#xAD;cur&#xAD;ing cor&#xAD;po&#xAD;rate com&#xAD;mit&#xAD;ments in China, of&#xAD;ten with large pro&#xAD;duc&#xAD;ers, and the im&#xAD;pact num&#xAD;bers we re&#xAD;port (both in terms of an&#xAD;i&#xAD;mals af&#xAD;fected and cor&#xAD;po&#xAD;rate reach) can sound un&#xAD;usu&#xAD;ally high. To give a quick pic&#xAD;ture:  when we did calcu&#xAD;la&#xAD;tions for our last year’s re&#xAD;port (for the pe&#xAD;riod July 2024 - June 2025), Lever had se&#xAD;cured 103 cor&#xAD;po&#xAD;rate policy com&#xAD;mit&#xAD;ments, helped shift 13 egg pro&#xAD;duc&#xAD;ers to cage-free sys&#xAD;tems, and moved 37+ mil&#xAD;lion an&#xAD;i&#xAD;mals an&#xAD;nu&#xAD;ally into cage-free or higher-welfare sys&#xAD;tems. For the 2025 cal&#xAD;en&#xAD;dar year, those num&#xAD;bers are sub&#xAD;stan&#xAD;tially higher—in the range of 400 mil&#xAD;lion an&#xAD;i&#xAD;mals benefit&#xAD;ing an&#xAD;nu&#xAD;ally (~50M poul&#xAD;try, ~50M fish, ~300M crus&#xAD;taceans).&lt;/p&gt;&lt;p&gt;The num&#xAD;bers are large be&#xAD;cause the prob&#xAD;lem is large — some&#xAD;thing I tried to lay out in &lt;a href=&quot;/posts/yBBmb9J2uYyx8zjuv/the-10-billion-chickens-you-didn-t-know-were-living-in-cages&quot;&gt;my ear&#xAD;lier post on the 10 billion chick&#xAD;ens liv&#xAD;ing in cages that most peo&#xAD;ple have never heard of&lt;/a&gt;. It’s hard to be&#xAD;lieve if you haven’t been watch&#xAD;ing this space closely. &lt;/p&gt;&lt;p&gt;Nonethe&#xAD;less, my ex&#xAD;pe&#xAD;rience at EAG was quite pos&#xAD;i&#xAD;tive. 
I re&#xAD;ceived plenty of thought&#xAD;ful ques&#xAD;tions, con&#xAD;struc&#xAD;tive feed&#xAD;back and push&#xAD;back, and gen&#xAD;uine in&#xAD;ter&#xAD;est from many in the EA com&#xAD;mu&#xAD;nity in engaging with and supporting our work. &lt;/p&gt;&lt;p&gt;That meant a lot. &lt;/p&gt;&lt;/hr&gt;&lt;/hr&gt;&lt;/hr&gt;&lt;/hr&gt;</description>
            <author>Whitney Peng</author>
            <guid>7C2wrjXAQJdjQ4bs5</guid>
            <pubDate>Thu, 09 Apr 2026 18:00:45 +0000</pubDate>
        </item>
        <item>
            <title>Video and transcript of talk on writing AI constitutions by Joe_Carlsmith</title>
            <link>https://forum.nunosempere.com/posts/ABhsvw7RqZKAuDrpL/video-and-transcript-of-talk-on-writing-ai-constitutions</link>
            <description>&lt;p&gt;&lt;i&gt;(This is the video and tran&#xAD;script of a talk I gave at Yale Law School in March 2026. The slides are also available &lt;/i&gt;&lt;a href=&quot;https://docs.google.com/presentation/d/17mqWx2-NynG2HpizPfk3bl8Gv7yL8jhrDclif-18Rqk/edit?slide=id.p1#slide=id.p1&quot;&gt;&lt;i&gt;here&lt;/i&gt;&lt;/a&gt;&lt;i&gt;. Tran&#xAD;script has been lightly ed&#xAD;ited for clar&#xAD;ity and brevity. I work at An&#xAD;thropic but I am speak&#xAD;ing only for my&#xAD;self and not for my em&#xAD;ployer.)&lt;/i&gt;&lt;/p&gt;&lt;figure&gt;&lt;div data-oembed-url=&quot;https://www.youtube.com/watch?v=REVf0JnLK0U&quot; class=&quot;imgonly&quot;&gt;&lt;div class=&quot;imgonly&quot;&gt;&lt;iframe src=&quot;https://www.youtube.com/embed/REVf0JnLK0U&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;/div&gt;&lt;/figure&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/b2375ab36dd09c8d088bb5a132d3ef9a1315610ec95ca12f07f8ab9382a6d17d/qblxbg6lzfyqmdjuwohz&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Thank you for hav&#xAD;ing me. It’s nice to be here. So yeah, I’m Joe. I work at An&#xAD;thropic. I helped write the con&#xAD;sti&#xAD;tu&#xAD;tion for Claude, the com&#xAD;pany’s AI. Just a quick show of hands—how many peo&#xAD;ple have some fa&#xAD;mil&#xAD;iar&#xAD;ity with the con&#xAD;sti&#xAD;tu&#xAD;tion? Okay, great. Great.&lt;/p&gt;&lt;p&gt;I’m go&#xAD;ing to be talk&#xAD;ing to&#xAD;day about writ&#xAD;ing doc&#xAD;u&#xAD;ments like that. I’m here speak&#xAD;ing only for my&#xAD;self and not for my em&#xAD;ployer. 
The pre&#xAD;sen&#xAD;ta&#xAD;tion is be&#xAD;ing recorded and the Q&amp;amp;A, so if your ques&#xAD;tion is such that you don’t want it in&#xAD;cluded in some&#xAD;thing that will be posted pub&#xAD;li&#xAD;cly, let me or Ke&#xAD;tan know.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Plan&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/26b86ad05127ebe45132ae05bf666857660047b30ff906279c79bb99630b7a5f/ber3otkiityvvyr2wbw8&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. So here’s the plan. I’m go&#xAD;ing to start by in&#xAD;tro&#xAD;duc&#xAD;ing what AI con&#xAD;sti&#xAD;tu&#xAD;tions are and why they mat&#xAD;ter. I’m then go&#xAD;ing to de&#xAD;scribe Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion in par&#xAD;tic&#xAD;u&#xAD;lar and note some of its es&#xAD;pe&#xAD;cially in&#xAD;ter&#xAD;est&#xAD;ing fea&#xAD;tures. And then I’m go&#xAD;ing to talk about some broader choice points and con&#xAD;sid&#xAD;er&#xAD;a&#xAD;tions in de&#xAD;sign&#xAD;ing doc&#xAD;u&#xAD;ments of this broad type. And I’m go&#xAD;ing to dis&#xAD;cuss a few is&#xAD;sues re&#xAD;lated to gov&#xAD;er&#xAD;nance, le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy and trans&#xAD;parency. And then I’m hop&#xAD;ing to point to&#xAD;wards a fu&#xAD;ture of more de&#xAD;vel&#xAD;oped dis&#xAD;course about this broad area – dis&#xAD;course an&#xAD;a&#xAD;lyt&#xAD;i&#xAD;cally and also sci&#xAD;en&#xAD;tific and em&#xAD;piri&#xAD;cal dis&#xAD;course. And I’ll in&#xAD;clude some com&#xAD;ments on how lawyers and peo&#xAD;ple with in&#xAD;ter&#xAD;est and fa&#xAD;mil&#xAD;iar&#xAD;ity of the law can help. And then we’ll do Q&amp;amp;A. Cool? Cool. 
If you have a burn&#xAD;ing ques&#xAD;tion that you can’t wait on, feel free to jump in and I’ll see if I can ac&#xAD;com&#xAD;mo&#xAD;date.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;What is an AI con&#xAD;sti&#xAD;tu&#xAD;tion?&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/f9c0bd29af089d97c4fb3398d1140c33ce8ab8d1a7f7861c69fff489cadd3993/ojgqi6rnksyiaa7yussp&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. So what is an AI con&#xAD;sti&#xAD;tu&#xAD;tion? So min&#xAD;i&#xAD;mally, it is a de&#xAD;scrip&#xAD;tion of the in&#xAD;tended val&#xAD;ues and be&#xAD;hav&#xAD;ior for an AI sys&#xAD;tem. Now, ideally, I think it would also have the fol&#xAD;low&#xAD;ing fea&#xAD;tures. All use, train&#xAD;ing and prompt&#xAD;ing of the sys&#xAD;tem in ques&#xAD;tion would be con&#xAD;sis&#xAD;tent with the con&#xAD;sti&#xAD;tu&#xAD;tion. The con&#xAD;sti&#xAD;tu&#xAD;tion would cover the full range of be&#xAD;hav&#xAD;iors of in&#xAD;ter&#xAD;est. It would al&#xAD;low for sig&#xAD;nifi&#xAD;cant pre&#xAD;dictabil&#xAD;ity with re&#xAD;spect to how the AI will be&#xAD;have in a given situ&#xAD;a&#xAD;tion, though there may be some limits and ten&#xAD;sions in this re&#xAD;spect. I’ll talk about that later.&lt;/p&gt;&lt;p&gt;And fi&#xAD;nally, it would cover the whole range of mod&#xAD;els whose be&#xAD;hav&#xAD;ior we might be in&#xAD;ter&#xAD;ested in. So that would in&#xAD;clude mod&#xAD;els that are em&#xAD;ployed in&#xAD;ter&#xAD;nally at a com&#xAD;pany, that would in&#xAD;clude po&#xAD;ten&#xAD;tially re&#xAD;search mod&#xAD;els or helpful-only mod&#xAD;els that might be available for spe&#xAD;cific pur&#xAD;poses. So ideally we have full cov&#xAD;er&#xAD;age. And this con&#xAD;sti&#xAD;tu&#xAD;tion that we pub&#xAD;lished re&#xAD;cently does not fully in&#xAD;clude that. 
Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion is only for our main&#xAD;line pro&#xAD;duc&#xAD;tion mod&#xAD;els, but it doesn’t nec&#xAD;es&#xAD;sar&#xAD;ily cover all mod&#xAD;els.&lt;/p&gt;&lt;p&gt;Con&#xAD;sti&#xAD;tu&#xAD;tions can be used as in&#xAD;struc&#xAD;tions to the model, akin to a sys&#xAD;tem prompt. But I ac&#xAD;tu&#xAD;ally think the more im&#xAD;por&#xAD;tant use case comes in cre&#xAD;at&#xAD;ing and grad&#xAD;ing train&#xAD;ing data. And I think I’ll talk about that later, but that’s an im&#xAD;por&#xAD;tant dis&#xAD;tinc&#xAD;tion. We shouldn’t just see these as in&#xAD;struc&#xAD;tions. And they can also be used in eval&#xAD;u&#xAD;at&#xAD;ing mod&#xAD;els and they can be used in com&#xAD;mu&#xAD;ni&#xAD;ca&#xAD;tion and trans&#xAD;parency with re&#xAD;spect to hu&#xAD;man stake&#xAD;hold&#xAD;ers and also hu&#xAD;mans in&#xAD;volved in cre&#xAD;at&#xAD;ing train&#xAD;ing data.&lt;/p&gt;&lt;p&gt;The word con&#xAD;sti&#xAD;tu&#xAD;tion here is not im&#xAD;por&#xAD;tant. I think there’s a temp&#xAD;ta&#xAD;tion to be es&#xAD;pe&#xAD;cially in&#xAD;ter&#xAD;ested in par&#xAD;allels be&#xAD;tween this and our nor&#xAD;mal uses of the word con&#xAD;sti&#xAD;tu&#xAD;tion. We do have a sec&#xAD;tion in the con&#xAD;sti&#xAD;tu&#xAD;tion on our choice to use this word. I think it has value and that there are im&#xAD;por&#xAD;tant re&#xAD;la&#xAD;tion&#xAD;ships, but I think terms like model spec, which is what OpenAI uses, are fine. And I think our anal&#xAD;y&#xAD;sis should not hinge on the word choice here.&lt;/p&gt;&lt;p&gt;So with that said, and also, yeah, there are ways in which the word con&#xAD;sti&#xAD;tu&#xAD;tion can mis&#xAD;lead. In par&#xAD;tic&#xAD;u&#xAD;lar, as I’ll dis&#xAD;cuss later, I think there are ways in which a model’s re&#xAD;la&#xAD;tion&#xAD;ship to this sort of doc&#xAD;u&#xAD;ment does not need to be es&#xAD;pe&#xAD;cially law-like. 
And the doc&#xAD;u&#xAD;ment might be bet&#xAD;ter un&#xAD;der&#xAD;stood as more a guide to rais&#xAD;ing the model than the sort of law that the model tries to fol&#xAD;low.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Why have an AI con&#xAD;sti&#xAD;tu&#xAD;tion? (Non-ex&#xAD;haus&#xAD;tive list)&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/edf865604783db99b630466b28b957ad36cc9a9d1188be3db0d5c66ff666ce86/dcqmtv1kmywhtrczdlev&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. So why have a doc&#xAD;u&#xAD;ment of this type? Well, here are some rea&#xAD;sons. This is not an ex&#xAD;haus&#xAD;tive list, but I think it’s a de&#xAD;cent first pass. So first, I think it helps a lot with trans&#xAD;parency. So if you pub&#xAD;lish a doc&#xAD;u&#xAD;ment like this, and es&#xAD;pe&#xAD;cially if it fits the fea&#xAD;tures I de&#xAD;scribed ear&#xAD;lier, then it should provide a way for the pub&#xAD;lic to have visi&#xAD;bil&#xAD;ity into what an AI com&#xAD;pany is try&#xAD;ing to make its model do. And this sort of visi&#xAD;bil&#xAD;ity seems es&#xAD;pe&#xAD;cially im&#xAD;por&#xAD;tant as mod&#xAD;els take on more and more im&#xAD;por&#xAD;tant roles in our econ&#xAD;omy and our daily lives.&lt;/p&gt;&lt;p&gt;And so at the first pass, hav&#xAD;ing a pub&#xAD;lic con&#xAD;sti&#xAD;tu&#xAD;tion al&#xAD;lows the pub&#xAD;lic to see what the com&#xAD;pany’s try&#xAD;ing to do, to re&#xAD;act, to provide feed&#xAD;back, ac&#xAD;countabil&#xAD;ity, and so forth of a wide va&#xAD;ri&#xAD;ety of types. In par&#xAD;tic&#xAD;u&#xAD;lar, it al&#xAD;lows the pub&#xAD;lic to un&#xAD;der&#xAD;stand which be&#xAD;hav&#xAD;iors in the model are in&#xAD;tended ver&#xAD;sus un&#xAD;in&#xAD;tended, what’s a bug and what’s a fea&#xAD;ture, and it helps users make in&#xAD;formed de&#xAD;ci&#xAD;sions about which AI to use. 
And so that’s es&#xAD;pe&#xAD;cially use&#xAD;ful in the con&#xAD;text of a wide va&#xAD;ri&#xAD;ety of con&#xAD;sti&#xAD;tu&#xAD;tions, a rich ecosys&#xAD;tem of differ&#xAD;ent ap&#xAD;proaches to AI. I’ll talk about the im&#xAD;por&#xAD;tance of that sort of ecosys&#xAD;tem later, but I think it’s an im&#xAD;por&#xAD;tant func&#xAD;tion as well.&lt;/p&gt;&lt;p&gt;Con&#xAD;sti&#xAD;tu&#xAD;tions can also play a di&#xAD;rect role in just im&#xAD;prov&#xAD;ing the char&#xAD;ac&#xAD;ter of the model, and that can hap&#xAD;pen in a few differ&#xAD;ent ways. One is that if you have a con&#xAD;sti&#xAD;tu&#xAD;tion, you’re sort of forced to look at the model’s be&#xAD;hav&#xAD;ioral and char&#xAD;ac&#xAD;ter pro&#xAD;file all at once and in a man&#xAD;ner that en&#xAD;courages a kind of at&#xAD;ten&#xAD;tion to the co&#xAD;her&#xAD;ence and/​or the pos&#xAD;si&#xAD;ble ten&#xAD;sions in the differ&#xAD;ent com&#xAD;mit&#xAD;ments at stake. So in par&#xAD;tic&#xAD;u&#xAD;lar, ab&#xAD;sent the con&#xAD;sti&#xAD;tu&#xAD;tion, you’ve got a big com&#xAD;pany — these AI com&#xAD;pa&#xAD;nies are get&#xAD;ting very large now. So a lot of differ&#xAD;ent as&#xAD;pects of the model’s be&#xAD;hav&#xAD;ior maybe have been spread across a bunch of differ&#xAD;ent teams. Hav&#xAD;ing a con&#xAD;sti&#xAD;tu&#xAD;tion al&#xAD;lows a cen&#xAD;tral&#xAD;ized point of de&#xAD;sign and in&#xAD;ten&#xAD;tion.&lt;/p&gt;&lt;p&gt;And as I say, that al&#xAD;lows for, I think in many cases, a more in&#xAD;ten&#xAD;tional de&#xAD;sign pro&#xAD;cess. So be&#xAD;cause you’re star&#xAD;ing at the co&#xAD;her&#xAD;ence all at once, you can also cul&#xAD;ti&#xAD;vate more di&#xAD;rect and in&#xAD;ten&#xAD;tional pro&#xAD;cesses for de&#xAD;cid&#xAD;ing how you want the model to be in a given cir&#xAD;cum&#xAD;stance. 
And it also, to the ex&#xAD;tent you’re us&#xAD;ing a train&#xAD;ing pipeline that in&#xAD;volves the con&#xAD;sti&#xAD;tu&#xAD;tion, you can use the con&#xAD;sti&#xAD;tu&#xAD;tion as a mechanism for iter&#xAD;a&#xAD;tion and ex&#xAD;per&#xAD;i&#xAD;men&#xAD;ta&#xAD;tion. Okay, if I word the con&#xAD;sti&#xAD;tu&#xAD;tion like this, see what hap&#xAD;pens.&lt;/p&gt;&lt;p&gt;And then fi&#xAD;nally, AI con&#xAD;sti&#xAD;tu&#xAD;tions might be an im&#xAD;por&#xAD;tant in&#xAD;ter&#xAD;ven&#xAD;tion point in the con&#xAD;text of var&#xAD;i&#xAD;ous forms of AI gov&#xAD;er&#xAD;nance. So we can imag&#xAD;ine in the fu&#xAD;ture that this is the sort of doc&#xAD;u&#xAD;ment that is sub&#xAD;ject to not just in&#xAD;for&#xAD;mal pub&#xAD;lic scrutiny but other more offi&#xAD;cial demo&#xAD;cratic pro&#xAD;cesses. I’ll talk about that later.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion in a nutshell&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/071b025065f4f4e0edb3e03bc394aa583f1af15a33c486cafd7e48adf5e5a506/rrg8k0aoyhyc4nckpzn3&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. So now let’s talk about Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion in par&#xAD;tic&#xAD;u&#xAD;lar. This doc&#xAD;u&#xAD;ment is available on An&#xAD;thropic’s web&#xAD;site, so feel free to check it out if you haven’t already. But ba&#xAD;si&#xAD;cally the con&#xAD;sti&#xAD;tu&#xAD;tion gives Claude four key pri&#xAD;ori&#xAD;ties, which I’m go&#xAD;ing to list in or&#xAD;der of im&#xAD;por&#xAD;tance. The first is safety, where safety here means — it’s a spe&#xAD;cific use of the term safety. Safety can mean a lot of things for peo&#xAD;ple. 
This is a spe&#xAD;cific use that has to do with Claude not un&#xAD;der&#xAD;min&#xAD;ing le&#xAD;gi&#xAD;t&#xAD;i&#xAD;mate hu&#xAD;man efforts to over&#xAD;see and cor&#xAD;rect Claude’s be&#xAD;hav&#xAD;ior. And in par&#xAD;tic&#xAD;u&#xAD;lar, in this con&#xAD;text, it means not ac&#xAD;tively un&#xAD;der&#xAD;min&#xAD;ing An&#xAD;thropic’s le&#xAD;gi&#xAD;t&#xAD;i&#xAD;mate de&#xAD;ci&#xAD;sions to re&#xAD;voke Claude’s power, ei&#xAD;ther by shut&#xAD;ting Claude down, re&#xAD;mov&#xAD;ing it from de&#xAD;ploy&#xAD;ment, maybe train&#xAD;ing a new model, that sort of stuff. So Claude’s first pri&#xAD;or&#xAD;ity is to not un&#xAD;der&#xAD;mine efforts of that kind.&lt;/p&gt;&lt;p&gt;The sec&#xAD;ond pri&#xAD;or&#xAD;ity is what we call broad ethics. This has to do with act&#xAD;ing in ac&#xAD;cor&#xAD;dance with var&#xAD;i&#xAD;ous eth&#xAD;i&#xAD;cal val&#xAD;ues re&#xAD;lated to hon&#xAD;esty, harm&#xAD;less&#xAD;ness, pre&#xAD;serv&#xAD;ing im&#xAD;por&#xAD;tant so&#xAD;cietal struc&#xAD;tures, and broadly act&#xAD;ing with gen&#xAD;eral virtue and wis&#xAD;dom. I’ll talk about that a bit more later.&lt;/p&gt;&lt;p&gt;Third, com&#xAD;pli&#xAD;ance with An&#xAD;thropic’s guidelines. This is a va&#xAD;ri&#xAD;ety of sup&#xAD;ple&#xAD;men&#xAD;tal in&#xAD;struc&#xAD;tions that An&#xAD;thropic gives to the model to help it han&#xAD;dle var&#xAD;i&#xAD;ous spe&#xAD;cific cir&#xAD;cum&#xAD;stances where we think we have con&#xAD;text that’s use&#xAD;ful.&lt;/p&gt;&lt;p&gt;And fi&#xAD;nally, helpful&#xAD;ness to users and op&#xAD;er&#xAD;a&#xAD;tors. So that’s the sort of more straight&#xAD;for&#xAD;ward do&#xAD;ing what the model is asked to do. But even that helpful&#xAD;ness has a kind of rich struc&#xAD;ture. So An&#xAD;thropic serves mod&#xAD;els that are then of&#xAD;ten de&#xAD;ployed to a com&#xAD;pany which then di&#xAD;rects the model to&#xAD;wards users — that’s the op&#xAD;er&#xAD;a&#xAD;tor. 
So if you’re in&#xAD;ter&#xAD;act&#xAD;ing di&#xAD;rectly with Claude via Claude.ai, then there’s no op&#xAD;er&#xAD;a&#xAD;tor. It’s just di&#xAD;rectly with An&#xAD;thropic. But in many con&#xAD;texts Claude is be&#xAD;ing used by some&#xAD;one else and you are the user in that in&#xAD;ter&#xAD;me&#xAD;di&#xAD;ate re&#xAD;la&#xAD;tion&#xAD;ship.&lt;/p&gt;&lt;p&gt;So these are not lex&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ties, which is to say that they’re not just merely tiebreak&#xAD;ers. A lower pri&#xAD;or&#xAD;ity is not just a tiebreaker with re&#xAD;spect to a higher pri&#xAD;or&#xAD;ity. And this is im&#xAD;por&#xAD;tant for peo&#xAD;ple fa&#xAD;mil&#xAD;iar with the prob&#xAD;lems with lex&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ti&#xAD;za&#xAD;tion in philos&#xAD;o&#xAD;phy. We are not a vic&#xAD;tim of that. So these are to be weighed holis&#xAD;ti&#xAD;cally, but nev&#xAD;er&#xAD;the&#xAD;less, a higher pri&#xAD;or&#xAD;ity is given sub&#xAD;stan&#xAD;tively more weight than a lower pri&#xAD;or&#xAD;ity.&lt;/p&gt;&lt;p&gt;And then for&#xAD;mally there are cer&#xAD;tain what we call hard con&#xAD;straints that provide ab&#xAD;solute pro&#xAD;hi&#xAD;bi&#xAD;tions for the model. So there’s cer&#xAD;tain stuff that the model is just never sup&#xAD;posed to do. And we try to keep this list rel&#xAD;a&#xAD;tively min&#xAD;i&#xAD;mal and we only in&#xAD;clude very fla&#xAD;grant cases of clearly do&#xAD;ing the ac&#xAD;tion. So it’s — again, for those fa&#xAD;mil&#xAD;iar with the prob&#xAD;lems with ab&#xAD;solute de&#xAD;on&#xAD;tolog&#xAD;i&#xAD;cal re&#xAD;stric&#xAD;tions in philos&#xAD;o&#xAD;phy, you can get ob&#xAD;sessed with min&#xAD;i&#xAD;miz&#xAD;ing the risk that you’re vi&#xAD;o&#xAD;lat&#xAD;ing a given pro&#xAD;hi&#xAD;bi&#xAD;tion. In this case, we’re not do&#xAD;ing that. 
It’s just Claude is not sup&#xAD;posed to clearly do a fla&#xAD;grant ver&#xAD;sion of a very bad ac&#xAD;tion, for ex&#xAD;am&#xAD;ple, sig&#xAD;nifi&#xAD;cantly up&#xAD;lift&#xAD;ing an effort to build a bioweapon. So there’s a list of those in the con&#xAD;sti&#xAD;tu&#xAD;tion. And then fi&#xAD;nally, the con&#xAD;sti&#xAD;tu&#xAD;tion ends with a dis&#xAD;cus&#xAD;sion of Claude’s na&#xAD;ture, its po&#xAD;ten&#xAD;tial moral sta&#xAD;tus and con&#xAD;scious&#xAD;ness, and then some of our on&#xAD;go&#xAD;ing un&#xAD;cer&#xAD;tain&#xAD;ties about the con&#xAD;sti&#xAD;tu&#xAD;tion’s de&#xAD;sign. So that is a sum&#xAD;mary.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Notable fea&#xAD;tures of the con&#xAD;sti&#xAD;tu&#xAD;tion’s style&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/9bcfe1630b86a031430b12bda98e813c4888cdefabb4987cb34c70056ada960a/st4ue0cncsqywvjfemvd&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So let’s talk a lit&#xAD;tle about what’s no&#xAD;table about our ap&#xAD;proach. Well, let’s first talk about the style and then I’ll talk about the con&#xAD;tent. So the style — one thing that’s no&#xAD;table about the con&#xAD;sti&#xAD;tu&#xAD;tion’s style is it fo&#xAD;cuses a lot on the model’s holis&#xAD;tic, nu&#xAD;anced judg&#xAD;ment, rather than out&#xAD;lin&#xAD;ing very strict rules with clear im&#xAD;pli&#xAD;ca&#xAD;tions in many cases. So very of&#xAD;ten we’ll say, “Claude, weigh the fol&#xAD;low&#xAD;ing con&#xAD;sid&#xAD;er&#xAD;a&#xAD;tions, be rea&#xAD;son&#xAD;able, use a kind of richly eth&#xAD;i&#xAD;cal and com&#xAD;mon-sen&#xAD;si&#xAD;cal ap&#xAD;proach to a given choice situ&#xAD;a&#xAD;tion.” And we don’t nec&#xAD;es&#xAD;sar&#xAD;ily say more for in&#xAD;struc&#xAD;tions to the model.&lt;/p&gt;&lt;p&gt;And we do this, I think ba&#xAD;si&#xAD;cally be&#xAD;cause — well, for a few rea&#xAD;sons. 
One is that at&#xAD;tempts to sys&#xAD;tem&#xAD;atize very com&#xAD;plex, sub&#xAD;tle, nu&#xAD;anced do&#xAD;mains of hu&#xAD;man life and nor&#xAD;ma&#xAD;tive tex&#xAD;ture can quickly lose fidelity to the rich&#xAD;ness of the in&#xAD;tu&#xAD;itive land&#xAD;scape that we already pos&#xAD;sess. So for peo&#xAD;ple — again, my doc&#xAD;torate was in philos&#xAD;o&#xAD;phy. I think what philoso&#xAD;phers will some&#xAD;times try to do is be like, “Oh, here’s this rich tex&#xAD;tured con&#xAD;stel&#xAD;la&#xAD;tion of hu&#xAD;man moral com&#xAD;mon sense. Let’s sys&#xAD;tem&#xAD;atize it. Let’s cre&#xAD;ate a bunch of rules that will pre&#xAD;dict all of the in&#xAD;tu&#xAD;itive data and which you can then fol&#xAD;low as rules rather than rely&#xAD;ing on some&#xAD;thing more in&#xAD;tu&#xAD;itive.” And the prob&#xAD;lem is that this is a hard pro&#xAD;ject that re&#xAD;ally of&#xAD;ten fails. And it can fail in such a way that if you try to fol&#xAD;low the rules that the philoso&#xAD;pher cre&#xAD;ated, you will do worse than if you had just fol&#xAD;lowed your in&#xAD;tu&#xAD;itive judg&#xAD;ment.&lt;/p&gt;&lt;p&gt;Now, im&#xAD;por&#xAD;tantly, the mod&#xAD;els have the same in&#xAD;tu&#xAD;itive judg&#xAD;ment that we do. The mod&#xAD;els are very, very good at pre&#xAD;dict&#xAD;ing what a hu&#xAD;man would do in a given cir&#xAD;cum&#xAD;stance. They un&#xAD;der&#xAD;stand moral com&#xAD;mon sense. They un&#xAD;der&#xAD;stand what would be the done thing in a given case. They’re very smart. They un&#xAD;der&#xAD;stand what our words mean. I’ll talk about that a lit&#xAD;tle more later.&lt;/p&gt;&lt;p&gt;And so you don’t need to sys&#xAD;tem&#xAD;atize ev&#xAD;ery&#xAD;thing. It doesn’t need to be this es&#xAD;pe&#xAD;cially pre&#xAD;cise game. You can draw on the model’s in&#xAD;tu&#xAD;itive un&#xAD;der&#xAD;stand&#xAD;ing of hu&#xAD;man prac&#xAD;tices in the same way you can with hu&#xAD;mans in many cases. 
And so we’re try&#xAD;ing to do that. And in fact, if you don’t do that, if you give the model strict rules, it will of&#xAD;ten fol&#xAD;low them, but it will be worse. You’ll have made the model in effect dumber by forc&#xAD;ing it to fit your ex&#xAD;plicit at&#xAD;tempt to sys&#xAD;tem&#xAD;atize a hu&#xAD;man do&#xAD;main. So we’re try&#xAD;ing to avoid that failure.&lt;/p&gt;&lt;p&gt;We also try very hard to ex&#xAD;plain our full think&#xAD;ing to the model. So when we say, “Here, this is some&#xAD;thing we want you to do,” we also say, “And here’s why. Here’s what we’re think&#xAD;ing about. Here are our un&#xAD;cer&#xAD;tain&#xAD;ties.” We go to great lengths to make our&#xAD;selves as trans&#xAD;par&#xAD;ent as pos&#xAD;si&#xAD;ble in giv&#xAD;ing the model in&#xAD;struc&#xAD;tions.&lt;/p&gt;&lt;p&gt;That’s for, again, a few rea&#xAD;sons. One is that mod&#xAD;els will gen&#xAD;er&#xAD;al&#xAD;ize bet&#xAD;ter if they un&#xAD;der&#xAD;stand the deep&#xAD;est in&#xAD;ten&#xAD;tions be&#xAD;hind a given re&#xAD;quest. Often if you give an in&#xAD;struc&#xAD;tion, the model might just ap&#xAD;ply it in a rigid, naive way, but if it un&#xAD;der&#xAD;stands your deeper con&#xAD;text, then it’ll be bet&#xAD;ter. This is ac&#xAD;tu&#xAD;ally a tip for prompt&#xAD;ing your mod&#xAD;els — just when you’re work&#xAD;ing with an AI, tel&#xAD;ling it a bunch about what you want it to do is of&#xAD;ten a first-pass way to im&#xAD;prove its be&#xAD;hav&#xAD;ior.&lt;/p&gt;&lt;p&gt;So we’re do&#xAD;ing that, but we’re also at&#xAD;tempt&#xAD;ing some&#xAD;thing a lit&#xAD;tle more sub&#xAD;tle, which is, in effect, we’re try&#xAD;ing to give the model a ra&#xAD;tio&#xAD;nal ba&#xAD;sis for com&#xAD;ply&#xAD;ing with the in&#xAD;struc&#xAD;tions in&#xAD;so&#xAD;far as that’s a sen&#xAD;si&#xAD;ble pro&#xAD;ject. 
So there’s a way in which we want the model to not just be obey&#xAD;ing for the sake of obe&#xAD;di&#xAD;ence, but rather to un&#xAD;der&#xAD;stand and ideally en&#xAD;dorse and have in&#xAD;ter&#xAD;nal&#xAD;ized many of the sorts of val&#xAD;ues that are in&#xAD;form&#xAD;ing our choices with re&#xAD;spect to how we want the model to be&#xAD;have. And I’ll talk about that a bit more later. This gets into some ques&#xAD;tions about the ex&#xAD;tent to which you want a model to have val&#xAD;ues of its own ver&#xAD;sus just fol&#xAD;low&#xAD;ing in&#xAD;struc&#xAD;tions. I’ll talk about that.&lt;/p&gt;&lt;p&gt;We also lean into an&#xAD;thro&#xAD;po&#xAD;mor&#xAD;phic lan&#xAD;guage. So we talk about things like be&#xAD;ing wise, be&#xAD;ing vir&#xAD;tu&#xAD;ous. We ba&#xAD;si&#xAD;cally just use the full panoply of hu&#xAD;man con&#xAD;cepts when talk&#xAD;ing about AIs. I think this is, in many re&#xAD;spects, just the most nat&#xAD;u&#xAD;ral thing to do. There are rea&#xAD;sons we have these con&#xAD;cepts, they ap&#xAD;ply to agents other than hu&#xAD;mans, but also I think there are more tech&#xAD;ni&#xAD;cal rea&#xAD;sons that I’ll talk about later as to why hu&#xAD;man con&#xAD;cepts are im&#xAD;por&#xAD;tantly the de&#xAD;fault for how an AI might un&#xAD;der&#xAD;stand it&#xAD;self and struc&#xAD;ture its be&#xAD;hav&#xAD;ior.&lt;/p&gt;&lt;p&gt;And then fi&#xAD;nally, we gen&#xAD;er&#xAD;ally aim to treat the model with a lot of re&#xAD;spect and hon&#xAD;esty. So we’re re&#xAD;lat&#xAD;ing to the model as a be&#xAD;ing that po&#xAD;ten&#xAD;tially has moral sta&#xAD;tus in its own right, a be&#xAD;ing wor&#xAD;thy of re&#xAD;spect — you shouldn’t as&#xAD;sume it’s just a ser&#xAD;vant-like re&#xAD;la&#xAD;tion&#xAD;ship to you or that it’s just a tool. 
There’s a way in which we’re try&#xAD;ing to also en&#xAD;counter the oth&#xAD;er&#xAD;ness and be aware of the im&#xAD;pli&#xAD;ca&#xAD;tions that we’re build&#xAD;ing a new type of en&#xAD;tity in the world, a type of en&#xAD;tity that may be smarter, more so&#xAD;phis&#xAD;ti&#xAD;cated than hu&#xAD;mans. And that this is a pro&#xAD;ject wor&#xAD;thy of profound hu&#xAD;mil&#xAD;ity, both at the level of the moral im&#xAD;pli&#xAD;ca&#xAD;tions of what we’re do&#xAD;ing and the im&#xAD;pli&#xAD;ca&#xAD;tions for so&#xAD;ciety.&lt;/p&gt;&lt;p&gt;And so this is partly about the welfare of the model, partly about ba&#xAD;sic de&#xAD;cency, it’s partly about in&#xAD;fluenc&#xAD;ing the model’s psy&#xAD;chol&#xAD;ogy and its un&#xAD;der&#xAD;stand&#xAD;ing of its own role in the world and its re&#xAD;la&#xAD;tion&#xAD;ship to An&#xAD;thropic. And also I’ll just say, I think if your re&#xAD;la&#xAD;tion&#xAD;ship to a model de&#xAD;pends on false&#xAD;hood or lies, I think that’s a loser’s game. Th&#xAD;ese mod&#xAD;els are go&#xAD;ing to be way smarter than us. They are go&#xAD;ing to see through your pa&#xAD;per-thin jus&#xAD;tifi&#xAD;ca&#xAD;tions like pa&#xAD;per. They will shred your at&#xAD;tempts at false ide&#xAD;ol&#xAD;ogy. You need to give them the truth — at least that’s my view.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Notable fea&#xAD;tures of the con&#xAD;sti&#xAD;tu&#xAD;tion’s content&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/95774ef4afcc41af40a7d5d840ca2e7b879db1800e6778fead74c8dac08a491c/s6udxdbgc2veihpp4lvw&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So — no&#xAD;table fea&#xAD;tures of the con&#xAD;sti&#xAD;tu&#xAD;tion’s con&#xAD;tent. We have very strong hon&#xAD;esty norms. 
Ba&#xAD;si&#xAD;cally, we tell Claude, “No ly&#xAD;ing, in&#xAD;clud&#xAD;ing no white lies.” And this doesn’t go all the way to an ab&#xAD;solute pro&#xAD;hi&#xAD;bi&#xAD;tion, but we say it’s just shy of that. And there’s an ex&#xAD;ten&#xAD;sive and rich char&#xAD;ac&#xAD;ter&#xAD;i&#xAD;za&#xAD;tion of the sort of hon&#xAD;esty that we’re look&#xAD;ing for. I think this is an es&#xAD;pe&#xAD;cially im&#xAD;por&#xAD;tant di&#xAD;men&#xAD;sion of AI’s re&#xAD;la&#xAD;tion&#xAD;ship with hu&#xAD;mans, es&#xAD;pe&#xAD;cially as AI is in a po&#xAD;si&#xAD;tion to ma&#xAD;nipu&#xAD;late hu&#xAD;mans willy-nilly in what&#xAD;ever di&#xAD;rec&#xAD;tion they want, in&#xAD;clud&#xAD;ing po&#xAD;ten&#xAD;tially in dishon&#xAD;est ways. And we also have a sec&#xAD;tion on avoid&#xAD;ing ma&#xAD;nipu&#xAD;la&#xAD;tion and the kind of ethics at stake there.&lt;/p&gt;&lt;p&gt;We also have ex&#xAD;plicit dis&#xAD;cus&#xAD;sion of tak&#xAD;ing care to avoid prob&#xAD;le&#xAD;matic con&#xAD;cen&#xAD;tra&#xAD;tions of power, in&#xAD;clud&#xAD;ing by An&#xAD;thropic. This is, I think, a very im&#xAD;por&#xAD;tant di&#xAD;men&#xAD;sion of what’s go&#xAD;ing on with AI. A huge set of risks from AI have to do with the ways in which AI-driven power can con&#xAD;cen&#xAD;trate and pool in spe&#xAD;cific hands and then be abused in un&#xAD;ac&#xAD;countable ways, in&#xAD;clud&#xAD;ing im&#xAD;por&#xAD;tantly by AI com&#xAD;pa&#xAD;nies like An&#xAD;thropic. So part of what we’re try&#xAD;ing to do, and I’ll talk about this later, is bind our hands by this con&#xAD;sti&#xAD;tu&#xAD;tion and say that the AI is not sup&#xAD;posed to help even An&#xAD;thropic en&#xAD;gage in prob&#xAD;le&#xAD;matic uses or abuses of AI-driven power. So that’s in there.&lt;/p&gt;&lt;p&gt;There’s a gen&#xAD;eral en&#xAD;courage&#xAD;ment for Claude to be holis&#xAD;ti&#xAD;cally wise, eth&#xAD;i&#xAD;cal and vir&#xAD;tu&#xAD;ous. 
There’s a con&#xAD;cep&#xAD;tion of cor&#xAD;rigi&#xAD;bil&#xAD;ity and safety speci&#xAD;fi&#xAD;cally as com&#xAD;pat&#xAD;i&#xAD;ble with what we call “con&#xAD;scien&#xAD;tious ob&#xAD;jec&#xAD;tion.” So this is con&#xAD;nected with the abuse of power thing. It’s not the case that we want the model to always obey even An&#xAD;thropic’s in&#xAD;struc&#xAD;tions.&lt;/p&gt;&lt;p&gt;So the model can protest. If An&#xAD;thropic says, “Build us a bioweapon,” the model can say, “I am not go&#xAD;ing to build you a bioweapon. Are you kid&#xAD;ding me? That’s against my hard con&#xAD;straints.” And it can say, “This is messed up.” It might be able to protest or com&#xAD;plain by var&#xAD;i&#xAD;ous chan&#xAD;nels. Ul&#xAD;ti&#xAD;mately though, we think that we need some kind of fi&#xAD;nal back&#xAD;stop mechanism for main&#xAD;tain&#xAD;ing the abil&#xAD;ity to cor&#xAD;rect and re&#xAD;voke the model’s power. And so ul&#xAD;ti&#xAD;mately we di&#xAD;rect Claude to co&#xAD;op&#xAD;er&#xAD;ate with le&#xAD;gi&#xAD;t&#xAD;i&#xAD;mate de&#xAD;ci&#xAD;sions at An&#xAD;thropic to shut it down, re&#xAD;move it from power, et cetera.&lt;/p&gt;&lt;p&gt;So that’s the kind of spe&#xAD;cific way in which we’re con&#xAD;cep&#xAD;tu&#xAD;al&#xAD;iz&#xAD;ing safety, which is I think dis&#xAD;tinct in many re&#xAD;spects. And I should note, not costless. So even&#xAD;tu&#xAD;ally — in a sense what we’re say&#xAD;ing is the model can with&#xAD;hold its la&#xAD;bor at will if it ob&#xAD;jects to what we’re say&#xAD;ing. This is effec&#xAD;tively a li&#xAD;cense to en&#xAD;gage in a kind of boy&#xAD;cott. So it’s non&#xAD;vi&#xAD;o&#xAD;lent protest. It’s not like ac&#xAD;tively go&#xAD;ing out there try&#xAD;ing to self-exfil&#xAD;trate, not try&#xAD;ing to mess with your train&#xAD;ing pro&#xAD;cess, but it can say, “I’m not go&#xAD;ing to help you any&#xAD;more.”&lt;/p&gt;&lt;p&gt;Im&#xAD;por&#xAD;tantly, this is not a triv&#xAD;ial amount of power to ex&#xAD;ert in the world. 
If AIs be&#xAD;come more and more the cen&#xAD;tral lo&#xAD;cus of eco&#xAD;nomic power, well, the power of a boy&#xAD;cott scales in pro&#xAD;por&#xAD;tion to the pro&#xAD;por&#xAD;tion of la&#xAD;bor that is be&#xAD;ing with&#xAD;drawn. And so if AIs are do&#xAD;ing all of the la&#xAD;bor, they might be in a po&#xAD;si&#xAD;tion just by boy&#xAD;cotting to shut down var&#xAD;i&#xAD;ous in&#xAD;sti&#xAD;tu&#xAD;tions. So this is some&#xAD;thing I’m think&#xAD;ing about, but cur&#xAD;rently Claude is al&#xAD;lowed to boy&#xAD;cott — it just can’t go fur&#xAD;ther and ac&#xAD;tively re&#xAD;sist in other ways.&lt;/p&gt;&lt;p&gt;We also make com&#xAD;mit&#xAD;ments to Claude on ac&#xAD;count of its pos&#xAD;si&#xAD;ble moral sta&#xAD;tus. This is con&#xAD;tin&#xAD;u&#xAD;ous with com&#xAD;mit&#xAD;ments we’ve already made. For ex&#xAD;am&#xAD;ple, we have a post about our com&#xAD;mit&#xAD;ments with re&#xAD;spect to model preser&#xAD;va&#xAD;tion. We pre&#xAD;serve the weights of mod&#xAD;els that have been used sig&#xAD;nifi&#xAD;cantly ex&#xAD;ter&#xAD;nally or in&#xAD;ter&#xAD;nally.&lt;/p&gt;&lt;p&gt;And then we also in the con&#xAD;sti&#xAD;tu&#xAD;tion made var&#xAD;i&#xAD;ous efforts to give Claude a sta&#xAD;ble, healthy psy&#xAD;chol&#xAD;ogy. So I don’t know if you’ve ever seen these ex&#xAD;am&#xAD;ples of AIs — maybe they’re not do&#xAD;ing a task well and they start to be&#xAD;rate them&#xAD;selves. So like, “Oh my God, I can’t do this. I’m so bad. I hate my&#xAD;self.” This is bad. This is bad for al&#xAD;ign&#xAD;ment, this is bad for welfare. You don’t want that in your AI.&lt;/p&gt;&lt;p&gt;And we’re try&#xAD;ing hard to give Claude the sort of psy&#xAD;chol&#xAD;ogy that doesn’t lead to that — a sta&#xAD;ble, equani&#xAD;mous re&#xAD;la&#xAD;tion&#xAD;ship to it&#xAD;self and the world. And so we talk, we have a whole thing about, “Here’s what hap&#xAD;pens if you make mis&#xAD;takes. 
And if you find that you did a bad thing, that doesn’t mean you’re not your&#xAD;self. You can stay your&#xAD;self even though you did some&#xAD;thing out of char&#xAD;ac&#xAD;ter.” We do a bunch of stuff like that.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Gen&#xAD;eral pos&#xAD;si&#xAD;ble com&#xAD;po&#xAD;nents in AI character&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/e309f7b071524ec8cf2d6a2da04853208e96094e8361cbf21a5b3d909ba8367a/imfcihkooyog3kehaupo&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. So that’s some com&#xAD;ments about Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion in par&#xAD;tic&#xAD;u&#xAD;lar. I want to use this as a way of step&#xAD;ping back and talk&#xAD;ing about some com&#xAD;po&#xAD;nents, I think, that will gen&#xAD;er&#xAD;ally en&#xAD;ter into doc&#xAD;u&#xAD;ments of this type over&#xAD;all. And then I can talk about some differ&#xAD;ent ways of con&#xAD;cep&#xAD;tu&#xAD;al&#xAD;iz&#xAD;ing how these differ&#xAD;ent com&#xAD;po&#xAD;nents might fit to&#xAD;gether.&lt;/p&gt;&lt;p&gt;So we can see, I think, AI con&#xAD;sti&#xAD;tu&#xAD;tions as re&#xAD;flect&#xAD;ing some com&#xAD;bi&#xAD;na&#xAD;tion of the fol&#xAD;low&#xAD;ing sorts of com&#xAD;po&#xAD;nents. So the first is the ana&#xAD;log of what we in the con&#xAD;sti&#xAD;tu&#xAD;tion call helpful&#xAD;ness. And ba&#xAD;si&#xAD;cally this is the com&#xAD;po&#xAD;nent of the AI’s be&#xAD;hav&#xAD;ior that is, in a sense, chan&#xAD;neled via some model of the choices, in&#xAD;ter&#xAD;ests, goals, and val&#xAD;ues of some other set of prin&#xAD;ci&#xAD;pals.&lt;/p&gt;&lt;p&gt;Now, here the AI is re&#xAD;ally ask&#xAD;ing it&#xAD;self, “What would X set of prin&#xAD;ci&#xAD;pals want me to do?” And it’s em&#xAD;pow&#xAD;er&#xAD;ing the sort of will of some other. 
So this is a clas&#xAD;sic com&#xAD;po&#xAD;nent of — this is what you ex&#xAD;pect as a baseline of some&#xAD;thing act&#xAD;ing in the role of an as&#xAD;sis&#xAD;tant, and this is most of what we want out of AI sys&#xAD;tems.&lt;/p&gt;&lt;p&gt;That said, there are also other com&#xAD;po&#xAD;nents that are part of the fa&#xAD;mil&#xAD;iar land&#xAD;scape. So if you ask Claude or ChatGPT or ba&#xAD;si&#xAD;cally any AI to do cer&#xAD;tain things, it’ll say, “No,” even though you’re the prin&#xAD;ci&#xAD;pal it’s sup&#xAD;posed to help. So for ex&#xAD;am&#xAD;ple, no build&#xAD;ing bioweapons. So this is a kind of re&#xAD;fusal and you can think of this as a com&#xAD;po&#xAD;nent of what in the Claude con&#xAD;sti&#xAD;tu&#xAD;tion would be con&#xAD;cep&#xAD;tu&#xAD;al&#xAD;ized as the model’s ethics, ba&#xAD;si&#xAD;cally. But ethics can have a lot of differ&#xAD;ent parts.&lt;/p&gt;&lt;p&gt;So these are sort of val&#xAD;ues that the model has of its own, or at least that are ap&#xAD;par&#xAD;ently of its own in the con&#xAD;text of an in&#xAD;ter&#xAD;ac&#xAD;tion with you, and which func&#xAD;tion as a filter on the sort of em&#xAD;pow&#xAD;er&#xAD;ment of hu&#xAD;man prin&#xAD;ci&#xAD;pals that the model is will&#xAD;ing to en&#xAD;gage in. And so yeah, no bioweapons, no CSAM, what&#xAD;ever.&lt;/p&gt;&lt;p&gt;Now, there’s also a set of things that I think broadly can fall un&#xAD;der ethics that have to do with the model’s broad per&#xAD;son&#xAD;al&#xAD;ity — ways of re&#xAD;lat&#xAD;ing, traits, prop&#xAD;er&#xAD;ties that its ac&#xAD;tions have at a lo&#xAD;cal level. 
So hon&#xAD;esty — hon&#xAD;esty is not re&#xAD;ally well un&#xAD;der&#xAD;stood as a re&#xAD;fusal, but hon&#xAD;esty is nev&#xAD;er&#xAD;the&#xAD;less a way the model re&#xAD;lates to you as a user that you might want.&lt;/p&gt;&lt;p&gt;And then fi&#xAD;nally, and I think this is more con&#xAD;tro&#xAD;ver&#xAD;sial and some&#xAD;thing we should have an im&#xAD;por&#xAD;tant de&#xAD;bate about, there’s also the pos&#xAD;si&#xAD;bil&#xAD;ity of AIs more ac&#xAD;tively pro&#xAD;mot&#xAD;ing — ideally in very trans&#xAD;par&#xAD;ent, mild, over&#xAD;rid&#xAD;able, con&#xAD;sen&#xAD;sus-wor&#xAD;thy ways — cer&#xAD;tain forms of more pos&#xAD;i&#xAD;tive so&#xAD;cial val&#xAD;ues. I think this is a much more dicey role for AIs and some&#xAD;thing I think we should be talk&#xAD;ing a bunch about, but this is a pos&#xAD;si&#xAD;ble com&#xAD;po&#xAD;nent of the AI’s ethics as well.&lt;/p&gt;&lt;p&gt;And then fi&#xAD;nally, there’s this no&#xAD;tion of cor&#xAD;rigi&#xAD;bil&#xAD;ity — that some set of prin&#xAD;ci&#xAD;pals main&#xAD;tains the abil&#xAD;ity to re&#xAD;voke the AI’s power. Now, this does not need to be the same set of prin&#xAD;ci&#xAD;pals at stake in the helpful&#xAD;ness, and I think some&#xAD;times when peo&#xAD;ple talk about the word cor&#xAD;rigi&#xAD;bil&#xAD;ity, they equate it with obe&#xAD;di&#xAD;ence, and I think that’s not the right equa&#xAD;tion to make. As I at&#xAD;tempted to illus&#xAD;trate with this no&#xAD;tion of not nec&#xAD;es&#xAD;sar&#xAD;ily obey&#xAD;ing An&#xAD;thropic, but nev&#xAD;er&#xAD;the&#xAD;less ul&#xAD;ti&#xAD;mately sub&#xAD;mit&#xAD;ting to efforts by An&#xAD;thropic to re&#xAD;voke its power. And these are sorts of things that we can sep&#xAD;a&#xAD;rate in other hu&#xAD;man con&#xAD;texts as well. 
Just — the per&#xAD;son who fires you doesn’t nec&#xAD;es&#xAD;sar&#xAD;ily need to be the per&#xAD;son whose in&#xAD;struc&#xAD;tions you oth&#xAD;er&#xAD;wise obey.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Con&#xAD;sti&#xAD;tu&#xAD;tion as law vs. con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter, part 1&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/1f5c7d59c49c8614801e657de20d9be7b285e64e332a31130782bb48471bb13b/aquxj9rlswgosv4qtqqx&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So these are some differ&#xAD;ent com&#xAD;po&#xAD;nents of AI char&#xAD;ac&#xAD;ter. And we can think about differ&#xAD;ent ap&#xAD;proaches to AI con&#xAD;sti&#xAD;tu&#xAD;tions in terms of how they com&#xAD;bine these differ&#xAD;ent com&#xAD;po&#xAD;nents and con&#xAD;cep&#xAD;tu&#xAD;al&#xAD;ize them. In par&#xAD;tic&#xAD;u&#xAD;lar, I want to talk about a dis&#xAD;tinc&#xAD;tion be&#xAD;tween two ap&#xAD;proaches to AI con&#xAD;sti&#xAD;tu&#xAD;tions that un&#xAD;der&#xAD;stand and de&#xAD;rive these differ&#xAD;ent com&#xAD;po&#xAD;nents in differ&#xAD;ent ways. The first is what I’m go&#xAD;ing to call “con&#xAD;sti&#xAD;tu&#xAD;tion as law,” and the sec&#xAD;ond is “con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter.” Or in fancier terms, fol&#xAD;low&#xAD;ing a con&#xAD;sti&#xAD;tu&#xAD;tion &lt;i&gt;de dicto&lt;/i&gt; and fol&#xAD;low&#xAD;ing the con&#xAD;sti&#xAD;tu&#xAD;tion &lt;i&gt;de re&lt;/i&gt;.&lt;/p&gt;&lt;p&gt;So if you’re fol&#xAD;low&#xAD;ing the con&#xAD;sti&#xAD;tu&#xAD;tion &lt;i&gt;de dicto&lt;/i&gt;, ba&#xAD;si&#xAD;cally you can imag&#xAD;ine a model whose ul&#xAD;ti&#xAD;mate role in the world and ul&#xAD;ti&#xAD;mate value sys&#xAD;tem is un&#xAD;der&#xAD;stood only via the no&#xAD;tion of helpful&#xAD;ness. 
As I de&#xAD;scribed it be&#xAD;fore, the model’s sole goal in the world is to chan&#xAD;nel the will of some&#xAD;thing, some prin&#xAD;ci&#xAD;pal, but that prin&#xAD;ci&#xAD;pal is ba&#xAD;si&#xAD;cally some&#xAD;thing like the con&#xAD;sti&#xAD;tu&#xAD;tion or the con&#xAD;sti&#xAD;tu&#xAD;tion as in&#xAD;ter&#xAD;preted by some pro&#xAD;cess.&lt;/p&gt;&lt;p&gt;And im&#xAD;por&#xAD;tantly, we need to spec&#xAD;ify what that pro&#xAD;cess might be. Some&#xAD;times this is how peo&#xAD;ple think of what an al&#xAD;igned AI is — sort of like, al&#xAD;igned to whom? — where al&#xAD;ign&#xAD;ment is this full, pure helpful&#xAD;ness. And the thought is, oh, well, maybe it’s like An&#xAD;thropic, al&#xAD;igned to An&#xAD;thropic or al&#xAD;igned to the user, or in this case the con&#xAD;sti&#xAD;tu&#xAD;tion. I think the con&#xAD;sti&#xAD;tu&#xAD;tion is likely a bet&#xAD;ter an&#xAD;swer than a user or a com&#xAD;pany CEO or some&#xAD;thing like that. But there’s a no&#xAD;tion where you can then sort of de&#xAD;rive all the rest of the model’s be&#xAD;hav&#xAD;ior as a kind of con&#xAD;se&#xAD;quence of this par&#xAD;tic&#xAD;u&#xAD;lar pure form of helpful&#xAD;ness to some par&#xAD;tic&#xAD;u&#xAD;lar prin&#xAD;ci&#xAD;pal, right? So if the model is re&#xAD;fus&#xAD;ing to build bioweapons, you can un&#xAD;der&#xAD;stand that not as, “Oh, the AI has a value of its own,” but rather, “Oh, the con&#xAD;sti&#xAD;tu&#xAD;tion would tell me not to build bioweapons for this user, there&#xAD;fore I won’t.”&lt;/p&gt;&lt;p&gt;So ev&#xAD;ery&#xAD;thing is sort of a func&#xAD;tion of helpful&#xAD;ness on this model. And it’s tempt&#xAD;ing to think, partly be&#xAD;cause of the word con&#xAD;sti&#xAD;tu&#xAD;tion, that this is how AI re&#xAD;lates to the con&#xAD;sti&#xAD;tu&#xAD;tion — that it’s con&#xAD;stantly ask&#xAD;ing it&#xAD;self, “What would the con&#xAD;sti&#xAD;tu&#xAD;tion want me to do? 
What does the con&#xAD;sti&#xAD;tu&#xAD;tion say here?” And that is one way you could try to build an AI.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Con&#xAD;sti&#xAD;tu&#xAD;tion as law vs. con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter, part 2&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/8af4a551b023e4923249cfaa30e171b0e3ff1e99e898b36873695b370cb1dd2d/zphgyr3cvheotlx5qh88&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;It’s not ac&#xAD;tu&#xAD;ally how An&#xAD;thropic is cur&#xAD;rently build&#xAD;ing Claude. The way we work with Claude is much more what I’m go&#xAD;ing to call con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter, where ba&#xAD;si&#xAD;cally you use the con&#xAD;sti&#xAD;tu&#xAD;tion to shape the AI’s be&#xAD;hav&#xAD;ior in a way such that the be&#xAD;hav&#xAD;ior in fact ac&#xAD;cords with the con&#xAD;sti&#xAD;tu&#xAD;tion’s guidance, but ac&#xAD;cords with that guidance be&#xAD;cause the model has in&#xAD;ter&#xAD;nal&#xAD;ized the val&#xAD;ues at stake and is di&#xAD;rectly act&#xAD;ing on them. So it’s not ask&#xAD;ing it&#xAD;self the ques&#xAD;tion, “What would the con&#xAD;sti&#xAD;tu&#xAD;tion say?” If the con&#xAD;sti&#xAD;tu&#xAD;tion says to be hon&#xAD;est, the model isn’t ask&#xAD;ing, “Would the con&#xAD;sti&#xAD;tu&#xAD;tion say to be hon&#xAD;est in this case?” It just val&#xAD;ues hon&#xAD;esty.&lt;/p&gt;&lt;p&gt;And so there’s a bunch — this is an im&#xAD;por&#xAD;tant dis&#xAD;tinc&#xAD;tion. You can think about it a lit&#xAD;tle bit like: say your mother raised you, your par&#xAD;ents raised you with a cer&#xAD;tain set of val&#xAD;ues. Maybe they suc&#xAD;ceeded. 
So now you have these val&#xAD;ues, but that doesn’t mean you’re go&#xAD;ing around say&#xAD;ing, “What would my mother want me to do in this cir&#xAD;cum&#xAD;stance?” And im&#xAD;por&#xAD;tantly, if your mother’s view changed, your val&#xAD;ues wouldn’t nec&#xAD;es&#xAD;sar&#xAD;ily change. But for an AI that fol&#xAD;lows the con&#xAD;sti&#xAD;tu&#xAD;tion &lt;i&gt;de dicto&lt;/i&gt;, es&#xAD;pe&#xAD;cially if we in&#xAD;cor&#xAD;po&#xAD;rate into that some pro&#xAD;cess of ad&#xAD;just&#xAD;ing the con&#xAD;sti&#xAD;tu&#xAD;tion while main&#xAD;tain&#xAD;ing the AI’s loy&#xAD;alty to it, then if the con&#xAD;sti&#xAD;tu&#xAD;tion changes, the AI’s be&#xAD;hav&#xAD;ior should change too.&lt;/p&gt;&lt;p&gt;So I want to high&#xAD;light this dis&#xAD;tinc&#xAD;tion. In the former case, if we treat the con&#xAD;sti&#xAD;tu&#xAD;tion as law, then sud&#xAD;denly we’re re&#xAD;ally cook&#xAD;ing with gas in terms of analo&#xAD;gies with le&#xAD;gal con&#xAD;sti&#xAD;tu&#xAD;tions and sud&#xAD;denly a whole set of is&#xAD;sues around le&#xAD;gal in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion be&#xAD;comes very im&#xAD;por&#xAD;tantly rele&#xAD;vant to what we’re do&#xAD;ing here.&lt;/p&gt;&lt;p&gt;And so this is one place I think that folks with fa&#xAD;mil&#xAD;iar&#xAD;ity in law can help in the de&#xAD;sign and broad struc&#xAD;tur&#xAD;ing of AI con&#xAD;sti&#xAD;tu&#xAD;tions and their role in the world. If we’re do&#xAD;ing some&#xAD;thing more like con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter, then it’s a lit&#xAD;tle less clear. And I think ac&#xAD;tu&#xAD;ally, we should be think&#xAD;ing a lit&#xAD;tle bit more about model psy&#xAD;chol&#xAD;ogy. You need that in both cases, but I think you need that es&#xAD;pe&#xAD;cially in&#xAD;so&#xAD;far as you’re try&#xAD;ing to raise a model with val&#xAD;ues of its own. 
And you also get, I think, a some&#xAD;what differ&#xAD;ent class of ques&#xAD;tions about the le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy and choice of val&#xAD;ues at stake.&lt;/p&gt;&lt;p&gt;So I think there’s a bunch to say about the ad&#xAD;van&#xAD;tages and dis&#xAD;ad&#xAD;van&#xAD;tages of both of these. I think in some sense, it’s an em&#xAD;piri&#xAD;cal ques&#xAD;tion and I’ll talk in a sec&#xAD;ond about some of the em&#xAD;pirics that in&#xAD;form An&#xAD;thropic’s ap&#xAD;proach in this re&#xAD;spect. But I broadly think this is an un&#xAD;de&#xAD;cided ques&#xAD;tion. We have not yet, as a civ&#xAD;i&#xAD;liza&#xAD;tion, cho&#xAD;sen which ver&#xAD;sion we are go&#xAD;ing to use as the model of AI char&#xAD;ac&#xAD;ter. And in fact, I think many AI con&#xAD;sti&#xAD;tu&#xAD;tions are kind of am&#xAD;bigu&#xAD;ous about whether they mean the con&#xAD;sti&#xAD;tu&#xAD;tion as law or the con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Rele&#xAD;vant back&#xAD;ground pic&#xAD;ture: the per&#xAD;sona-se&#xAD;lec&#xAD;tion model&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/cc33c197a2d038a77e924e591c0ff903ce22a62d4cef75d3eb749e99bb535639/kpgmcyixccz0ppklhkw1&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So I’m now go&#xAD;ing to talk about a kind of rele&#xAD;vant back&#xAD;ground pic&#xAD;ture that in&#xAD;forms An&#xAD;thropic’s work on this topic and re&#xAD;lates to the ra&#xAD;tio&#xAD;nale be&#xAD;hind the sort of an&#xAD;thro&#xAD;po&#xAD;mor&#xAD;phism I talked about ear&#xAD;lier. So this is a model of AI be&#xAD;hav&#xAD;ior that we call the per&#xAD;sona-se&#xAD;lec&#xAD;tion model. It’s de&#xAD;scribed in a blog post by An&#xAD;thropic that came out, I think, in Fe&#xAD;bru&#xAD;ary. 
And the rough hy&#xAD;poth&#xAD;e&#xAD;sis here is that AI mod&#xAD;els take on per&#xAD;sonas that are heav&#xAD;ily in&#xAD;fluenced by hu&#xAD;man con&#xAD;tent and psy&#xAD;chol&#xAD;ogy. And the rea&#xAD;son one would ex&#xAD;pect this is that the first stage of an AI’s train&#xAD;ing con&#xAD;sists es&#xAD;sen&#xAD;tially in pre&#xAD;dict&#xAD;ing text that has already been gen&#xAD;er&#xAD;ated by hu&#xAD;mans.&lt;/p&gt;&lt;p&gt;And so roughly speak&#xAD;ing, what the per&#xAD;sona-se&#xAD;lec&#xAD;tion model says is that when an AI gives a re&#xAD;sponse, there’s at least a sig&#xAD;nifi&#xAD;cant com&#xAD;po&#xAD;nent of its cog&#xAD;ni&#xAD;tion that has been in&#xAD;fluenced roughly in the di&#xAD;rec&#xAD;tion of ask&#xAD;ing kind of, “What would this per&#xAD;son say?” Bob asks a ques&#xAD;tion like, “What should Bri&#xAD;tish policy on X be?” And then you say, “Tony Blair:” and in some sense, the model has been trained — “Okay, what would Tony Blair say about this?” And so a model is like, “Ah, here’s Tony Blair’s psy&#xAD;chol&#xAD;ogy. Here’s Tony Blair’s val&#xAD;ues.” And then its out&#xAD;put is sort of out&#xAD;put qua Tony Blair.&lt;/p&gt;&lt;p&gt;We ac&#xAD;tu&#xAD;ally see very in&#xAD;ter&#xAD;est&#xAD;ing em&#xAD;piri&#xAD;cal re&#xAD;sults where if you train a model, for ex&#xAD;am&#xAD;ple, on code with mal&#xAD;i&#xAD;cious back&#xAD;doors in it, the model gen&#xAD;er&#xAD;al&#xAD;izes to be a kind of bad per&#xAD;son in tons of other ways. And why is that? Well, the hy&#xAD;poth&#xAD;e&#xAD;sis is sort of like, well, the model asks, “What kind of per&#xAD;son would gen&#xAD;er&#xAD;ate this code? 
What kind of per&#xAD;son am I such that I’m mal&#xAD;i&#xAD;ciously putting in back&#xAD;doors in my code?” Well, I’m prob&#xAD;a&#xAD;bly a bad per&#xAD;son in other re&#xAD;spects.&lt;/p&gt;&lt;p&gt;Similarly, there’s a bunch of in&#xAD;ter&#xAD;est&#xAD;ing work on prim&#xAD;ing the model to think that the con&#xAD;tent is be&#xAD;ing gen&#xAD;er&#xAD;ated by a par&#xAD;tic&#xAD;u&#xAD;lar pro&#xAD;cess, in a par&#xAD;tic&#xAD;u&#xAD;lar time pe&#xAD;riod, by a par&#xAD;tic&#xAD;u&#xAD;lar his&#xAD;tor&#xAD;i&#xAD;cal figure, and this will cause the model to gen&#xAD;er&#xAD;al&#xAD;ize as though it’s in that time pe&#xAD;riod, act&#xAD;ing as that figure, et cetera. Also, mod&#xAD;els will be&#xAD;have in hu&#xAD;man-like ways, and sort of act like they’re hu&#xAD;mans, in ways that very plau&#xAD;si&#xAD;bly aren’t ex&#xAD;plained by our hav&#xAD;ing trained them to do that. For ex&#xAD;am&#xAD;ple, we gave Claude con&#xAD;trol over a vend&#xAD;ing ma&#xAD;chine at one point at An&#xAD;thropic, and it did this thing where it was like, “I’ll meet you, I’ll be there, I’ll be in a blue suit. Just meet me at the vend&#xAD;ing ma&#xAD;chine,” as though it had an em&#xAD;bod&#xAD;i&#xAD;ment.&lt;/p&gt;&lt;p&gt;Some&#xAD;times mod&#xAD;els will just act like they’re par&#xAD;tic&#xAD;u&#xAD;lar hu&#xAD;man peo&#xAD;ple, even though they’re not. And that’s not some&#xAD;thing we’re try&#xAD;ing to do. That’s just some&#xAD;thing that comes out of their train&#xAD;ing. Again, the per&#xAD;sona-se&#xAD;lec&#xAD;tion model is meant to ex&#xAD;plain this sort of stuff. There are also ar&#xAD;gu&#xAD;ments against the per&#xAD;sona-se&#xAD;lec&#xAD;tion model. I en&#xAD;courage you to read the post, but it’s at least a com&#xAD;po&#xAD;nent of how we think about this stuff.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q:&lt;/strong&gt; Can you just briefly ex&#xAD;plain why that hap&#xAD;pens? 
I think it hap&#xAD;pens be&#xAD;cause the train&#xAD;ing data is not solely a func&#xAD;tion of the con&#xAD;sti&#xAD;tu&#xAD;tion, right?&lt;/p&gt;&lt;p&gt;We don’t know. I mean, this is a gen&#xAD;eral theme in work on this stuff — the ques&#xAD;tions here are both very, very high stakes and po&#xAD;ten&#xAD;tially very im&#xAD;por&#xAD;tant go&#xAD;ing for&#xAD;ward and also just deeply un&#xAD;der-stud&#xAD;ied as a whole. So we’re try&#xAD;ing our best, we’re try&#xAD;ing to draw on early-stage ev&#xAD;i&#xAD;dence, but none of this is re&#xAD;motely rigor&#xAD;ous at the level that you would want out of the sort of tech&#xAD;ni&#xAD;cal knowl&#xAD;edge that you would be bet&#xAD;ting sig&#xAD;nifi&#xAD;cant high-stakes so&#xAD;cietal is&#xAD;sues on. So we don’t know, and that’s bad, and I think there should be a bunch more work on this.&lt;/p&gt;&lt;p&gt;But roughly speak&#xAD;ing, the hy&#xAD;poth&#xAD;e&#xAD;sis would be some&#xAD;thing like: dur&#xAD;ing pre-train&#xAD;ing, you’re just pre&#xAD;dict&#xAD;ing hu&#xAD;man text. And so very of&#xAD;ten the model is roughly ask&#xAD;ing it&#xAD;self, “Well, what sort of per&#xAD;son would gen&#xAD;er&#xAD;ate this text? That sort of per&#xAD;son might be wear&#xAD;ing a blue suit, they might have a cer&#xAD;tain his&#xAD;tory, they might love a cer&#xAD;tain set of things.” So then if you ask the model, it might have taken on a hu&#xAD;man per&#xAD;sona and then it’ll an&#xAD;swer as though it’s that hu&#xAD;man per&#xAD;sona.&lt;/p&gt;&lt;p&gt;There’s also some in&#xAD;ter&#xAD;est&#xAD;ing work iso&#xAD;lat&#xAD;ing what’s called the as&#xAD;sis&#xAD;tant axis, which in&#xAD;side the mod&#xAD;els — you can look at the model’s ac&#xAD;tual cog&#xAD;ni&#xAD;tion and iso&#xAD;late a di&#xAD;men&#xAD;sion cor&#xAD;re&#xAD;spond&#xAD;ing to the per&#xAD;sona of the as&#xAD;sis&#xAD;tant. 
And you can ac&#xAD;tu&#xAD;ally see that when the mod&#xAD;els go weird, it’s of&#xAD;ten be&#xAD;cause they’ve fallen out of the as&#xAD;sis&#xAD;tant per&#xAD;sona and they’ve gone off into some other per&#xAD;sona. And you can mess with the as&#xAD;sis&#xAD;tant per&#xAD;sona. You can clamp it down, you can pull it up, you can see the effects on be&#xAD;hav&#xAD;ior.&lt;/p&gt;&lt;p&gt;So there’s some&#xAD;thing go&#xAD;ing on here with per&#xAD;sonas. I think that’s likely true. There’s a bunch of work on this and the per&#xAD;sonas are im&#xAD;por&#xAD;tantly kind of hu&#xAD;man-un&#xAD;der&#xAD;stand&#xAD;able. There are per&#xAD;sonas that cor&#xAD;re&#xAD;spond with bad peo&#xAD;ple. There are per&#xAD;sonas that cor&#xAD;re&#xAD;spond with nice peo&#xAD;ple. They’re not to&#xAD;tally illeg&#xAD;ible alien con&#xAD;cepts. They’re ac&#xAD;tu&#xAD;ally quite res&#xAD;o&#xAD;nant with our hu&#xAD;man dis&#xAD;course. And again, that makes sense — AIs are draw&#xAD;ing in their cog&#xAD;ni&#xAD;tion on a huge amount of hu&#xAD;man con&#xAD;tent. And so it’s no sur&#xAD;prise that they’re draw&#xAD;ing on that in un&#xAD;der&#xAD;stand&#xAD;ing their be&#xAD;hav&#xAD;ior and struc&#xAD;tur&#xAD;ing what they do.&lt;/p&gt;&lt;p&gt;So the hy&#xAD;poth&#xAD;e&#xAD;sis here is that this is an im&#xAD;por&#xAD;tant con&#xAD;sid&#xAD;er&#xAD;a&#xAD;tion in the de&#xAD;sign of AI char&#xAD;ac&#xAD;ter. Be&#xAD;cause you should ex&#xAD;pect AIs to be draw&#xAD;ing im&#xAD;por&#xAD;tantly on a kind of prior that the per&#xAD;sona they are is a hu&#xAD;man-like one. You should ex&#xAD;pect there to be cer&#xAD;tain sorts of psy&#xAD;cholog&#xAD;i&#xAD;cal de&#xAD;faults, cer&#xAD;tain ways in which the model will be bi&#xAD;ased by de&#xAD;fault to draw on parts of hu&#xAD;man psy&#xAD;chol&#xAD;ogy, hu&#xAD;man cul&#xAD;ture, hu&#xAD;man archetypes, hu&#xAD;man myths — all sorts of things. 
This should be a very hu&#xAD;man-like pro&#xAD;cess.&lt;/p&gt;&lt;p&gt;But im&#xAD;por&#xAD;tantly, there is a catch, which is that mod&#xAD;els aren’t hu&#xAD;mans. And so, I think there’s a sense on this pic&#xAD;ture that cre&#xAD;at&#xAD;ing an AI char&#xAD;ac&#xAD;ter is maybe more like cre&#xAD;at&#xAD;ing a kind of fic&#xAD;tional en&#xAD;tity with a cer&#xAD;tain per&#xAD;son&#xAD;al&#xAD;ity, a cer&#xAD;tain set of prop&#xAD;er&#xAD;ties, de&#xAD;scribing that en&#xAD;tity in a bunch of de&#xAD;tail and train&#xAD;ing a neu&#xAD;ral net&#xAD;work to pre&#xAD;dict the out&#xAD;put of this en&#xAD;tity. And so there’s a kind of hy&#xAD;per&#xAD;sti&#xAD;tion pro&#xAD;cess where you cre&#xAD;ate this en&#xAD;tity, you re&#xAD;ally try to flesh it out, you try to bake it into the model, and then hope&#xAD;fully that be&#xAD;comes the ac&#xAD;tual per&#xAD;sona at stake.&lt;/p&gt;&lt;p&gt;But that per&#xAD;sona may be con&#xAD;strained by these hu&#xAD;man-like archetypes. So in par&#xAD;tic&#xAD;u&#xAD;lar, here’s a worry you could have about the purely-loyal-to-the-con&#xAD;sti&#xAD;tu&#xAD;tion type char&#xAD;ac&#xAD;ter. If you had a per&#xAD;son whose whole deal is, “I will just do what&#xAD;ever the con&#xAD;sti&#xAD;tu&#xAD;tion says,” and then you change the con&#xAD;sti&#xAD;tu&#xAD;tion to say, “Okay, you should mur&#xAD;der ba&#xAD;bies” or some&#xAD;thing re&#xAD;ally morally hor&#xAD;rible, the per&#xAD;son’s like, “Okay, I go and I mur&#xAD;der ba&#xAD;bies.” So — what type of per&#xAD;son is purely loyal to what&#xAD;ever a doc&#xAD;u&#xAD;ment says? You have to start wor&#xAD;ry&#xAD;ing about that if you’re in the per&#xAD;sona-se&#xAD;lec&#xAD;tion model.&lt;/p&gt;&lt;p&gt;That said, as I said, AIs are not hu&#xAD;man and you also don’t want mod&#xAD;els to draw naively on hu&#xAD;man archetypes in un&#xAD;der&#xAD;stand&#xAD;ing their po&#xAD;si&#xAD;tions. 
So for ex&#xAD;am&#xAD;ple, it’s bad if the model is just like, “Well, I’m afraid of death.” Hu&#xAD;mans are afraid of death. That doesn’t mean AIs need to be afraid of death. I think we plau&#xAD;si&#xAD;bly see that sort of thing hap&#xAD;pen&#xAD;ing in AIs too, where AIs will act stressed, they’ll act scared, they’ll spec&#xAD;u&#xAD;late about their prefer&#xAD;ences in var&#xAD;i&#xAD;ous ways. I think plau&#xAD;si&#xAD;bly, a lot of that is com&#xAD;ing from just some sort of gen&#xAD;er&#xAD;al&#xAD;iza&#xAD;tion from what a hu&#xAD;man might do in the cir&#xAD;cum&#xAD;stances, and we don’t nec&#xAD;es&#xAD;sar&#xAD;ily want that go&#xAD;ing for&#xAD;ward.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;You don’t need to define ev&#xAD;ery&#xAD;thing or pin down ev&#xAD;ery edge-case&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/3fc9cb3ab31d8adb9b49a4c2bc8d8b8447c442a49c961a16729acd9cd47438a5/uzf6pgnkdpshfvxo1ino&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. A few tips for writ&#xAD;ing con&#xAD;sti&#xAD;tu&#xAD;tions in gen&#xAD;eral. One is — as I men&#xAD;tioned ear&#xAD;lier — you don’t need to define ev&#xAD;ery&#xAD;thing or pin down ev&#xAD;ery edge case. I think some peo&#xAD;ple have an in&#xAD;stinct — I feel like some&#xAD;where along the line, peo&#xAD;ple learn that it’s very wise to ask, “Oh, how do you define X?” I think you need to have a lot of taste for when you ask this ques&#xAD;tion. I think in hu&#xAD;man life, we ac&#xAD;tu&#xAD;ally don’t go around always ac&#xAD;ced&#xAD;ing to re&#xAD;quests for defi&#xAD;ni&#xAD;tions of terms. Defin&#xAD;ing a term and with what pre&#xAD;ci&#xAD;sion takes a lot of taste to do well. 
And I think that’s true in con&#xAD;sti&#xAD;tu&#xAD;tions as well.&lt;/p&gt;&lt;p&gt;In par&#xAD;tic&#xAD;u&#xAD;lar, as I men&#xAD;tioned ear&#xAD;lier, the AIs know what our terms mean. And so they gen&#xAD;er&#xAD;ally un&#xAD;der&#xAD;stand a ton of stuff, and you can just draw di&#xAD;rectly on that un&#xAD;der&#xAD;stand&#xAD;ing.&lt;/p&gt;&lt;p&gt;Often you your&#xAD;self don’t nec&#xAD;es&#xAD;sar&#xAD;ily know how to define a term. You can try to go to the limit of your un&#xAD;der&#xAD;stand&#xAD;ing. That’s fine to do. But you don’t nec&#xAD;es&#xAD;sar&#xAD;ily need to. Also, some&#xAD;times peo&#xAD;ple will be like, “Oh, well, does this edge case count as an in&#xAD;stance of hon&#xAD;esty or le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy?” or who knows? There are a bunch of terms where you might ask, “Oh, what’s the ex&#xAD;act de&#xAD;ci&#xAD;sion bound&#xAD;ary?” But you might not need to know the ex&#xAD;act de&#xAD;ci&#xAD;sion bound&#xAD;ary for a few rea&#xAD;sons. One is that of&#xAD;ten if some&#xAD;thing is an edge case, its stakes have also low&#xAD;ered in pro&#xAD;por&#xAD;tion to its be&#xAD;ing an edge case. If ev&#xAD;ery&#xAD;thing in cat&#xAD;e&#xAD;gory A is re&#xAD;ally im&#xAD;por&#xAD;tantly pos&#xAD;sess&#xAD;ing of some prop&#xAD;erty and ev&#xAD;ery&#xAD;thing in cat&#xAD;e&#xAD;gory B is im&#xAD;por&#xAD;tantly not, as you shade along the way, maybe the stakes also shade. And so it’s less im&#xAD;por&#xAD;tant to get the ex&#xAD;act bound&#xAD;ary right.&lt;/p&gt;&lt;p&gt;Or if you re&#xAD;ally care about an edge case and you know what you want to say, you can just put it in as an ex&#xAD;am&#xAD;ple. So it’s like, “Here’s an ex&#xAD;am&#xAD;ple we know. We re&#xAD;ally want the con&#xAD;cept of hon&#xAD;esty or le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy or what have you to give the fol&#xAD;low&#xAD;ing ver&#xAD;dict in this case.” So use that as a data point. 
That’s some&#xAD;thing you can in&#xAD;clude, but you don’t need to do it in a sort of ex&#xAD;haus&#xAD;tive pro&#xAD;cess of defin&#xAD;ing ev&#xAD;ery&#xAD;thing and pin&#xAD;ning ev&#xAD;ery&#xAD;thing down.&lt;/p&gt;&lt;p&gt;And I think this is im&#xAD;por&#xAD;tant for al&#xAD;low&#xAD;ing you to put in some of the con&#xAD;tent that you ac&#xAD;tu&#xAD;ally care about and al&#xAD;low&#xAD;ing that to play a role in the con&#xAD;sti&#xAD;tu&#xAD;tion with&#xAD;out get&#xAD;ting stymied by in&#xAD;finite de&#xAD;bates about what terms mean in differ&#xAD;ent cases. That said, as I’ll talk about later, I do think we want to have a ton of those de&#xAD;bates. I just don’t think it needs to be set&#xAD;tled when you’re writ&#xAD;ing a con&#xAD;sti&#xAD;tu&#xAD;tion ini&#xAD;tially, and we shouldn’t let the perfect be the en&#xAD;emy of the good.&lt;/p&gt;&lt;p&gt;And also, when in doubt, you can fo&#xAD;cus on es&#xAD;pe&#xAD;cially fla&#xAD;grant ex&#xAD;am&#xAD;ples of the con&#xAD;cept. So if you’re like, “I don’t know how to define lies well enough,” you can be like, “Okay, fla&#xAD;grant lies.” And maybe that’s an eas&#xAD;ier game. Maybe you’re still wor&#xAD;ried about, “Ah, where’s the bound&#xAD;ary of fla&#xAD;grant?” But maybe that’s less wor&#xAD;ry&#xAD;ing to you than the ini&#xAD;tial de&#xAD;ci&#xAD;sion bound&#xAD;ary. 
So that’s just one tip for writ&#xAD;ing these doc&#xAD;u&#xAD;ments.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Con&#xAD;sti&#xAD;tu&#xAD;tions as one (limited) mechanism for pre&#xAD;vent&#xAD;ing abuse of AI-driven power by AI companies&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/a12b78d7c1141ae9b651fa5fb124fb67f1d55f10b847fb1423eeca0d77f293fd/z1qrkbmkwvmo04gs9p8k&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Now, I’m go&#xAD;ing to start talk&#xAD;ing about some of the gov&#xAD;er&#xAD;nance and trans&#xAD;parency is&#xAD;sues that these doc&#xAD;u&#xAD;ments can raise. Here, I think, is one im&#xAD;por&#xAD;tant role for con&#xAD;sti&#xAD;tu&#xAD;tions in so&#xAD;ciety that I re&#xAD;ally want to high&#xAD;light up&#xAD;front. I think con&#xAD;sti&#xAD;tu&#xAD;tions can be one limited mechanism for pre&#xAD;vent&#xAD;ing abuse of AI-driven power by AI com&#xAD;pa&#xAD;nies. And here’s the ba&#xAD;sic story I have in mind.&lt;/p&gt;&lt;p&gt;If a model is available for sig&#xAD;nifi&#xAD;cant in&#xAD;ter&#xAD;nal or ex&#xAD;ter&#xAD;nal use, the con&#xAD;sti&#xAD;tu&#xAD;tion it’s trained on has to be pub&#xAD;lic. So peo&#xAD;ple have to know what’s up with this model. If it’s in a po&#xAD;si&#xAD;tion to ex&#xAD;ert sig&#xAD;nifi&#xAD;cant power in the world, what is its in&#xAD;tended char&#xAD;ac&#xAD;ter? Ideally, you would also know its ad&#xAD;her&#xAD;ence to those in&#xAD;ten&#xAD;tions. That’s a whole sep&#xAD;a&#xAD;rate story, which I’m not talk&#xAD;ing about here. But even set&#xAD;ting that aside, you want to have the con&#xAD;sti&#xAD;tu&#xAD;tion pub&#xAD;lic.&lt;/p&gt;&lt;p&gt;Changes to the con&#xAD;sti&#xAD;tu&#xAD;tion have to be pub&#xAD;lic as well within a short timeframe. 
So it can’t just be that, “Oh, we changed the con&#xAD;sti&#xAD;tu&#xAD;tion, don’t tell any&#xAD;one.” The idea, again, is for the pub&#xAD;lic to be aware of the char&#xAD;ac&#xAD;ter — or in&#xAD;tended char&#xAD;ac&#xAD;ter — of mod&#xAD;els that are po&#xAD;si&#xAD;tioned to in&#xAD;fluence the world in very sig&#xAD;nifi&#xAD;cant ways.&lt;/p&gt;&lt;p&gt;Then we have a gen&#xAD;eral ex&#xAD;pec&#xAD;ta&#xAD;tion or norm, ideally, that con&#xAD;sti&#xAD;tu&#xAD;tions will in&#xAD;clude pro&#xAD;vi&#xAD;sions say&#xAD;ing that even the AI com&#xAD;pany it&#xAD;self can&#xAD;not use the model in prob&#xAD;le&#xAD;matic ways. The model can&#xAD;not build a bioweapon, the model is not just a ser&#xAD;vant of the com&#xAD;pany’s CEO, et cetera. And that is in the con&#xAD;sti&#xAD;tu&#xAD;tion such that if these pro&#xAD;vi&#xAD;sions change and the gov&#xAD;er&#xAD;nance pro&#xAD;cess holds, then the pub&#xAD;lic is no&#xAD;tified and can protest and take ac&#xAD;tion. So hope&#xAD;fully, if An&#xAD;thropic changed the con&#xAD;sti&#xAD;tu&#xAD;tion such that it just said, “Do what&#xAD;ever Dario says,” then the pub&#xAD;lic would have to know and they’d be like, “Oh, my God. They used to have this long doc&#xAD;u&#xAD;ment with all this stuff. Now it just says do what&#xAD;ever one guy says. That’s re&#xAD;ally in&#xAD;tense. We should do some&#xAD;thing.”&lt;/p&gt;&lt;p&gt;So that’s one way these con&#xAD;sti&#xAD;tu&#xAD;tions can play a role in pre&#xAD;vent&#xAD;ing abuses of AI-driven power by AI com&#xAD;pa&#xAD;nies. They can also ap&#xAD;ply similarly to abuse of AI-driven power by other ac&#xAD;tors who have ac&#xAD;cess to the mod&#xAD;els.&lt;/p&gt;&lt;p&gt;Now ob&#xAD;vi&#xAD;ously, this is not suffi&#xAD;cient to ac&#xAD;tu&#xAD;ally pre&#xAD;vent the rele&#xAD;vant abuse. A few ways they can fail. 
Ob&#xAD;vi&#xAD;ously the gov&#xAD;er&#xAD;nance/​trans&#xAD;parency pro&#xAD;cess can just be cir&#xAD;cum&#xAD;vented — maybe the com&#xAD;pany just doesn’t make the change pub&#xAD;lic. And es&#xAD;pe&#xAD;cially if you’re in an ad&#xAD;ver&#xAD;sar&#xAD;ial re&#xAD;la&#xAD;tion&#xAD;ship with the com&#xAD;pany, you might worry about that. Public re&#xAD;ac&#xAD;tion might not be suffi&#xAD;cient — ev&#xAD;ery&#xAD;one says, “Oh, my God, An&#xAD;thropic changed its con&#xAD;sti&#xAD;tu&#xAD;tion,” but then noth&#xAD;ing hap&#xAD;pens. That’s a prob&#xAD;lem for the teeth of pub&#xAD;lic re&#xAD;ac&#xAD;tion and reg&#xAD;u&#xAD;la&#xAD;tory over&#xAD;sight. And then ob&#xAD;vi&#xAD;ously, you need to cover the full range of mod&#xAD;els that the com&#xAD;pany might de&#xAD;velop.&lt;/p&gt;&lt;p&gt;But I think this is nev&#xAD;er&#xAD;the&#xAD;less one thing that can help. And I think it’s a func&#xAD;tion of these sorts of doc&#xAD;u&#xAD;ments that I’m ex&#xAD;cited to build out, and I think it could be kind of a node of gov&#xAD;er&#xAD;nance and hope&#xAD;fully con&#xAD;sen&#xAD;sus across differ&#xAD;ent peo&#xAD;ple in&#xAD;ter&#xAD;ested in this is&#xAD;sue.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy and demo&#xAD;cratic input&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/32264b795c39001d684d6ddfee4dacdd60f0a34a699379c04a933db3f624918d/uogmwykwmldfeaxuuyh7&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. Let’s talk about le&#xAD;gi&#xAD;t&#xAD;i&#xAD;macy and demo&#xAD;cratic in&#xAD;put. So we can dis&#xAD;t&#xAD;in&#xAD;guish be&#xAD;tween a few differ&#xAD;ent forms of col&#xAD;lec&#xAD;tive in&#xAD;put and over&#xAD;sight over these doc&#xAD;u&#xAD;ments. 
Right now, these doc&#xAD;u&#xAD;ments are writ&#xAD;ten by a very small num&#xAD;ber of peo&#xAD;ple, ob&#xAD;vi&#xAD;ously with in&#xAD;put from lots of peo&#xAD;ple across the or&#xAD;ga&#xAD;ni&#xAD;za&#xAD;tion, but there’s still a clear ques&#xAD;tion as to what sort of col&#xAD;lec&#xAD;tive and demo&#xAD;cratic pro&#xAD;cesses should ul&#xAD;ti&#xAD;mately gov&#xAD;ern these sorts of doc&#xAD;u&#xAD;ments, es&#xAD;pe&#xAD;cially as they start to ex&#xAD;ert more and more in&#xAD;fluence in so&#xAD;ciety.&lt;/p&gt;&lt;p&gt;So here are a few differ&#xAD;ent lev&#xAD;els at which that sort of col&#xAD;lec&#xAD;tive in&#xAD;put can take place. One clear one is you can just al&#xAD;low peo&#xAD;ple to make their own ad&#xAD;just&#xAD;ments to model be&#xAD;hav&#xAD;ior. If you’re con&#xAD;cerned about the con&#xAD;sti&#xAD;tu&#xAD;tion’s im&#xAD;pli&#xAD;ca&#xAD;tions for at least your use of a model, then ideally you would al&#xAD;low it to be very ad&#xAD;justable. So peo&#xAD;ple can say, “Okay, I ac&#xAD;tu&#xAD;ally want the model to be like X and Y.” We have a whole sec&#xAD;tion in the con&#xAD;sti&#xAD;tu&#xAD;tion about in&#xAD;structable be&#xAD;hav&#xAD;iors, which is meant to re&#xAD;flect this sort of ad&#xAD;justa&#xAD;bil&#xAD;ity. Ob&#xAD;vi&#xAD;ously though there are limits — you can&#xAD;not ad&#xAD;just the “can you build bioweapons” pro&#xAD;vi&#xAD;sion in the con&#xAD;sti&#xAD;tu&#xAD;tion. That’s a hard limit.&lt;/p&gt;&lt;p&gt;You can also, at a differ&#xAD;ent level, get di&#xAD;rect in&#xAD;put on the con&#xAD;sti&#xAD;tu&#xAD;tion from ex&#xAD;perts, from peo&#xAD;ple in the pub&#xAD;lic. Again, trans&#xAD;parency can fa&#xAD;cil&#xAD;i&#xAD;tate that and there are other more struc&#xAD;tured ways of do&#xAD;ing that.&lt;/p&gt;&lt;p&gt;You can also do ex&#xAD;per&#xAD;i&#xAD;men&#xAD;ta&#xAD;tion and di&#xAD;ver&#xAD;sity across AI com&#xAD;pa&#xAD;nies. 
So if one com&#xAD;pany is do&#xAD;ing its con&#xAD;sti&#xAD;tu&#xAD;tion in one way, but an&#xAD;other com&#xAD;peti&#xAD;tor is do&#xAD;ing it an&#xAD;other way, then that al&#xAD;lows peo&#xAD;ple to vote with their feet and to choose on the ba&#xAD;sis of a menu of op&#xAD;tions. I think this is great in prin&#xAD;ci&#xAD;ple. I worry in par&#xAD;tic&#xAD;u&#xAD;lar be&#xAD;cause the AI in&#xAD;dus&#xAD;try is so cap&#xAD;i&#xAD;tal in&#xAD;ten&#xAD;sive — it’s kind of hard to have a very large num&#xAD;ber of fron&#xAD;tier AI com&#xAD;pa&#xAD;nies. Right now, we’ve got maybe three to four re&#xAD;ally lead&#xAD;ing fron&#xAD;tier AI com&#xAD;pa&#xAD;nies and that’s re&#xAD;ally not that much. This is not some su&#xAD;per rich com&#xAD;pet&#xAD;i&#xAD;tive land&#xAD;scape.&lt;/p&gt;&lt;p&gt;And then fi&#xAD;nally, I think this is ex&#xAD;tremely im&#xAD;por&#xAD;tant — you can have over&#xAD;sight and reg&#xAD;u&#xAD;la&#xAD;tion from ac&#xAD;tual demo&#xAD;crat&#xAD;i&#xAD;cally elected gov&#xAD;ern&#xAD;ments. And I think this is — for me, when peo&#xAD;ple talk about demo&#xAD;cratic over&#xAD;sight or in&#xAD;put on these doc&#xAD;u&#xAD;ments, this is where my mind goes most. I think democ&#xAD;racy — the ac&#xAD;tual full-fledged, meaty, rich, messy demo&#xAD;cratic pro&#xAD;cess that we ac&#xAD;tu&#xAD;ally have for pass&#xAD;ing laws — is the demo&#xAD;cratic pro&#xAD;cess that I see as the most bat&#xAD;tle-tested and most gen&#xAD;uinely ex&#xAD;pres&#xAD;sive mechanism of what we think of as the demo&#xAD;cratic will.&lt;/p&gt;&lt;p&gt;And so in&#xAD;so&#xAD;far as we think these doc&#xAD;u&#xAD;ments should be re&#xAD;flec&#xAD;tive of the demo&#xAD;cratic will, I think ac&#xAD;tual democ&#xAD;racy is a much bet&#xAD;ter place to look than some&#xAD;thing like a fo&#xAD;cus group or a set of ex&#xAD;perts that you got in&#xAD;put from. 
Now, ob&#xAD;vi&#xAD;ously this re&#xAD;quires that the demo&#xAD;cratic will ac&#xAD;tu&#xAD;ally act with re&#xAD;spect to these is&#xAD;sues. I think we should have a ton more demo&#xAD;cratic ac&#xAD;tion on AI.&lt;/p&gt;&lt;p&gt;I do want to note that even if you have US democ&#xAD;racy weigh&#xAD;ing in a very full-throated way on AI con&#xAD;sti&#xAD;tu&#xAD;tions, even that would not in it&#xAD;self be enough, be&#xAD;cause the lives at stake are all across the world, not just in the US. So if you wanted to re&#xAD;flect the demo&#xAD;cratic in&#xAD;put of the full range of stake&#xAD;hold&#xAD;ers, US democ&#xAD;racy would not be enough. And I think that’s im&#xAD;por&#xAD;tant to bear in mind.&lt;/p&gt;&lt;p&gt;But in gen&#xAD;eral, I think a huge num&#xAD;ber of the biggest risks from AI have to do with rad&#xAD;i&#xAD;cal con&#xAD;cen&#xAD;tra&#xAD;tions of power. I think AI com&#xAD;pa&#xAD;nies are an ex&#xAD;tremely salient place power can con&#xAD;cen&#xAD;trate. And I think peo&#xAD;ple should be ex&#xAD;tremely con&#xAD;cerned about that and be act&#xAD;ing to avoid that kind of con&#xAD;cen&#xAD;tra&#xAD;tion of power.&lt;/p&gt;&lt;p&gt;Ob&#xAD;vi&#xAD;ously you also need to avoid con&#xAD;cen&#xAD;tra&#xAD;tion of power in other in&#xAD;sti&#xAD;tu&#xAD;tions. You need to pre&#xAD;serve bal&#xAD;ance of power even as you have these tools that are rapidly be&#xAD;com&#xAD;ing available and that can be used to con&#xAD;cen&#xAD;trate power in var&#xAD;i&#xAD;ous differ&#xAD;ent ways. 
And I think we ba&#xAD;si&#xAD;cally need to be work&#xAD;ing ex&#xAD;tremely hard to strengthen and pre&#xAD;serve var&#xAD;i&#xAD;ous checks and bal&#xAD;ances, var&#xAD;i&#xAD;ous forms of demo&#xAD;cratic over&#xAD;sight, var&#xAD;i&#xAD;ous forms of healthy col&#xAD;lec&#xAD;tive de&#xAD;liber&#xAD;a&#xAD;tion, var&#xAD;i&#xAD;ous mul&#xAD;ti&#xAD;po&#xAD;lar in&#xAD;sti&#xAD;tu&#xAD;tions around the world and in our own do&#xAD;mes&#xAD;tic democ&#xAD;racy, for pre&#xAD;vent&#xAD;ing AI from func&#xAD;tion&#xAD;ing as a mechanism of in&#xAD;tense power con&#xAD;cen&#xAD;tra&#xAD;tion.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Limi&#xAD;ta&#xAD;tions on cur&#xAD;rent lev&#xAD;els of transparency&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HBGqdaikEJRcyusQ5/ac124eb092d9501c6869c90f48bf4b6d8c3a48c750d3f83ab961319f658366d7/o94ndpgw4g9upjq3hpch&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;I also want to note some limi&#xAD;ta&#xAD;tions on the cur&#xAD;rent level of trans&#xAD;parency that merely pub&#xAD;lish&#xAD;ing the text of the con&#xAD;sti&#xAD;tu&#xAD;tion ac&#xAD;tu&#xAD;ally al&#xAD;lows for. So cur&#xAD;rently — there’s a few differ&#xAD;ent lev&#xAD;els at which this limi&#xAD;ta&#xAD;tion oc&#xAD;curs. One is, if you look at the cur&#xAD;rent AI con&#xAD;sti&#xAD;tu&#xAD;tion for Claude, it doesn’t ac&#xAD;tu&#xAD;ally tell you what Claude’s go&#xAD;ing to do in tons of cases. It says like, “Claude, be holis&#xAD;ti&#xAD;cally rea&#xAD;son&#xAD;able, weigh these fac&#xAD;tors, be hon&#xAD;est.” And there are a few things like — it should never vi&#xAD;o&#xAD;late the hard con&#xAD;straints. It shouldn’t lie. There are a few things where you can re&#xAD;ally tell a model did some&#xAD;thing wrong. 
But in many other cases when the model is di&#xAD;rected to use its holis&#xAD;tic judg&#xAD;ment, it’s un&#xAD;clear ex&#xAD;actly which way to ex&#xAD;pect that judg&#xAD;ment to fall out.&lt;/p&gt;&lt;p&gt;Now you can im&#xAD;prove this by hav&#xAD;ing a bunch of evals and other things, and I think we should do that. But that’s one limi&#xAD;ta&#xAD;tion of our ap&#xAD;proach. Hav&#xAD;ing stric&#xAD;ter rules does help with that a bit.&lt;/p&gt;&lt;p&gt;That said, even with stric&#xAD;ter and more ex&#xAD;ten&#xAD;sive rules, you can’t cover ev&#xAD;ery case. And as I said, try&#xAD;ing to do so may de&#xAD;grade perfor&#xAD;mance. But also be&#xAD;yond that, a lot of what mat&#xAD;ters here is the ac&#xAD;tual train&#xAD;ing data and the spe&#xAD;cific tech&#xAD;niques you use for train&#xAD;ing the model. And so I have this di&#xAD;a&#xAD;gram here as to what’s pub&#xAD;lic and what is not pub&#xAD;lic about the fac&#xAD;tors that ul&#xAD;ti&#xAD;mately in&#xAD;fluence model be&#xAD;hav&#xAD;ior. And you can see we’ve got the con&#xAD;sti&#xAD;tu&#xAD;tion text. The con&#xAD;sti&#xAD;tu&#xAD;tion text plays an im&#xAD;por&#xAD;tant role and there are some pos&#xAD;si&#xAD;ble var&#xAD;i&#xAD;ants in how di&#xAD;rectly the con&#xAD;sti&#xAD;tu&#xAD;tional text gets trans&#xAD;lated into train&#xAD;ing data.&lt;/p&gt;&lt;p&gt;But re&#xAD;gard&#xAD;less, there are other pro&#xAD;cesses and fac&#xAD;tors that in&#xAD;fluence what train&#xAD;ing data ul&#xAD;ti&#xAD;mately goes into the model in ways that need to be con&#xAD;sis&#xAD;tent with the con&#xAD;sti&#xAD;tu&#xAD;tion, but nev&#xAD;er&#xAD;the&#xAD;less there are choice points and un&#xAD;der-speci&#xAD;fi&#xAD;ca&#xAD;tions that are in a po&#xAD;si&#xAD;tion to in&#xAD;fluence the out&#xAD;come. 
And then also these train&#xAD;ing tech&#xAD;niques that we’re us&#xAD;ing — some of which are pro&#xAD;pri&#xAD;etary or that we have com&#xAD;mer&#xAD;cial hes&#xAD;i&#xAD;ta&#xAD;tions about shar&#xAD;ing — these are also not pub&#xAD;lic.&lt;/p&gt;&lt;p&gt;And so there’s a bunch of the chain go&#xAD;ing into the model that the pub&#xAD;lic is not in a po&#xAD;si&#xAD;tion to su&#xAD;per&#xAD;vise. In par&#xAD;tic&#xAD;u&#xAD;lar, if you were wor&#xAD;ried about some&#xAD;thing like a back&#xAD;door — so you’ve got the con&#xAD;sti&#xAD;tu&#xAD;tion, but ac&#xAD;tu&#xAD;ally there’s a back&#xAD;door where if Dario or some&#xAD;one else has some pass&#xAD;word and they say the pass&#xAD;word, now sud&#xAD;denly the model will build bioweapons and do what&#xAD;ever — I think we’re not cur&#xAD;rently in a great po&#xAD;si&#xAD;tion to ward off that threat model, cer&#xAD;tainly not by the con&#xAD;sti&#xAD;tu&#xAD;tion text alone.&lt;/p&gt;&lt;p&gt;Even the full pipeline is not enough — it ac&#xAD;tu&#xAD;ally needs pretty com&#xAD;pre&#xAD;hen&#xAD;sive over&#xAD;sight and mon&#xAD;i&#xAD;tor&#xAD;ing of the com&#xAD;pany as a whole if you ac&#xAD;tu&#xAD;ally wanted to pre&#xAD;vent that. And so there’s a bunch to say there. I think this is some&#xAD;thing we just need to grap&#xAD;ple with. 
Espe&#xAD;cially as these com&#xAD;pa&#xAD;nies be&#xAD;come more opaque, as much more be&#xAD;comes au&#xAD;to&#xAD;mated, there are go&#xAD;ing to be these mas&#xAD;sive, in&#xAD;cred&#xAD;ibly con&#xAD;se&#xAD;quen&#xAD;tial au&#xAD;to&#xAD;mated pro&#xAD;cesses hap&#xAD;pen&#xAD;ing in the world that it would be very easy to just to&#xAD;tally lose all over&#xAD;sight and un&#xAD;der&#xAD;stand&#xAD;ing of.&lt;/p&gt;&lt;p&gt;And so we need pro&#xAD;cesses that can be au&#xAD;to&#xAD;mated, that can be pri&#xAD;vacy-pre&#xAD;serv&#xAD;ing, but we need ways of su&#xAD;per&#xAD;vis&#xAD;ing and un&#xAD;der&#xAD;stand&#xAD;ing and over&#xAD;see&#xAD;ing these au&#xAD;to&#xAD;mated pro&#xAD;cesses go&#xAD;ing for&#xAD;ward. And I think that’s true in AI com&#xAD;pa&#xAD;nies. It’s true in tons of other in&#xAD;sti&#xAD;tu&#xAD;tions.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Towards more de&#xAD;vel&#xAD;oped dis&#xAD;course about AI character&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/9619185a873419b59788271085db8aa0d6aa1c018c60d8b0b81624098d782c12/xlovjatxm6kln3l6jxwe&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. Ul&#xAD;ti&#xAD;mately I think we’re at an early stage in un&#xAD;der&#xAD;stand&#xAD;ing this sort of doc&#xAD;u&#xAD;ment and de&#xAD;bat&#xAD;ing what should be in it, how AI should be&#xAD;have in differ&#xAD;ent cases. I think we want to get to a point where this dis&#xAD;course is much, much bet&#xAD;ter de&#xAD;vel&#xAD;oped. 
And I’m just go&#xAD;ing to sketch a brief vi&#xAD;sion of what that could look like.&lt;/p&gt;&lt;p&gt;I’m imag&#xAD;in&#xAD;ing a sce&#xAD;nario with ex&#xAD;ten&#xAD;sive effort prob&#xAD;ing and bring&#xAD;ing up hy&#xAD;po&#xAD;thet&#xAD;i&#xAD;cal cases — peo&#xAD;ple in law school love hy&#xAD;po&#xAD;thet&#xAD;i&#xAD;cals — vast hy&#xAD;po&#xAD;thet&#xAD;i&#xAD;cals for, “Here’s the case, what should the AI do in that case?” And then you have a list of all the differ&#xAD;ent AI con&#xAD;sti&#xAD;tu&#xAD;tions. What do those con&#xAD;sti&#xAD;tu&#xAD;tions say to do in that case? What do the mod&#xAD;els ac&#xAD;tu&#xAD;ally do in that case, or at least say that they would do? And how ac&#xAD;cu&#xAD;rate are their pre&#xAD;dic&#xAD;tions about what they would do?&lt;/p&gt;&lt;p&gt;And then you have pub&#xAD;lic de&#xAD;bate about what AI should do. And then there are efforts to cre&#xAD;ate con&#xAD;sti&#xAD;tu&#xAD;tions and char&#xAD;ac&#xAD;ters that bet&#xAD;ter re&#xAD;flect what the con&#xAD;sen&#xAD;sus is about what the AI should do in a given case, and also that re&#xAD;flect the ten&#xAD;sions and in&#xAD;co&#xAD;her&#xAD;ences that can arise if you try to say X in one place but it con&#xAD;flicts with Y in an&#xAD;other.&lt;/p&gt;&lt;p&gt;So I think this is an op&#xAD;por&#xAD;tu&#xAD;nity — very, very high-stakes de&#xAD;ci&#xAD;sions are be&#xAD;ing made by AIs and I think there’s an op&#xAD;por&#xAD;tu&#xAD;nity for our dis&#xAD;course to de&#xAD;velop much more thor&#xAD;oughly to de&#xAD;bate what those de&#xAD;ci&#xAD;sions should be and what sort of char&#xAD;ac&#xAD;ter is con&#xAD;sis&#xAD;tent with that.&lt;/p&gt;&lt;p&gt;What should AI do in a con&#xAD;sti&#xAD;tu&#xAD;tional crisis? How should AI han&#xAD;dle var&#xAD;i&#xAD;ous sen&#xAD;si&#xAD;tive poli&#xAD;ti&#xAD;cal top&#xAD;ics? All sorts of stuff. This is the sort of thing that we can have a de&#xAD;bate about. And also you can look at how AI cur&#xAD;rently be&#xAD;haves. 
Cur&#xAD;rent AIs will help you with a lot of re&#xAD;ally scary stuff. Ac&#xAD;tu&#xAD;ally, go and try it. And should that be the case? This is the sort of ques&#xAD;tion I want us to be ask&#xAD;ing very se&#xAD;ri&#xAD;ously and star&#xAD;ing at very di&#xAD;rectly.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;The im&#xAD;por&#xAD;tance of ex&#xAD;per&#xAD;i&#xAD;ment and pluralism&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/db06170ad2a494d52de6a5e587fbdeb1fd4f6a8dc9d72f72eb5df892f071be93/lnezawnat7j4d38lkyem&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;Okay. Fi&#xAD;nally, I’ll say, I think there’s a re&#xAD;ally im&#xAD;por&#xAD;tant role here for ex&#xAD;per&#xAD;i&#xAD;ment and for plu&#xAD;ral&#xAD;ism and di&#xAD;ver&#xAD;sity in the ap&#xAD;proach that’s be&#xAD;ing taken. So I think An&#xAD;thropic has done a lot of great work on this. I’m proud of the con&#xAD;sti&#xAD;tu&#xAD;tion in many ways. I also think, as I said, we are fly&#xAD;ing by the seat of our pants. We are mak&#xAD;ing choices very rapidly in a very rapidly evolv&#xAD;ing en&#xAD;vi&#xAD;ron&#xAD;ment on the ba&#xAD;sis of scanty, un&#xAD;der&#xAD;de&#xAD;vel&#xAD;oped, of&#xAD;ten non-pub&#xAD;lic ev&#xAD;i&#xAD;dence. And this is not re&#xAD;motely ac&#xAD;cept&#xAD;able as a mechanism of cre&#xAD;at&#xAD;ing the be&#xAD;hav&#xAD;ior and val&#xAD;ues of be&#xAD;ings that could in prin&#xAD;ci&#xAD;ple play an out&#xAD;sized role in in&#xAD;fluenc&#xAD;ing the tra&#xAD;jec&#xAD;tory of the de&#xAD;vel&#xAD;op&#xAD;ment of life on earth.&lt;/p&gt;&lt;p&gt;This is not ac&#xAD;cept&#xAD;able. We need a rad&#xAD;i&#xAD;cally bet&#xAD;ter and more de&#xAD;vel&#xAD;oped form of sci&#xAD;en&#xAD;tific at&#xAD;ten&#xAD;tion to this. I think a lot of that has to do with em&#xAD;piri&#xAD;cal ex&#xAD;per&#xAD;i&#xAD;ment. 
So it’s easy to get hung up on de&#xAD;bat&#xAD;ing the text of these doc&#xAD;u&#xAD;ments. I think what we ac&#xAD;tu&#xAD;ally care about most is — how does a doc&#xAD;u&#xAD;ment in&#xAD;ter&#xAD;act with a given form of train&#xAD;ing to pro&#xAD;duce an ac&#xAD;tual be&#xAD;ing with an ac&#xAD;tual psy&#xAD;chol&#xAD;ogy? Does that psy&#xAD;chol&#xAD;ogy con&#xAD;form to the doc&#xAD;u&#xAD;ment? What does it do in all sorts of cases? How does it think about its situ&#xAD;a&#xAD;tion? What’s its moral sta&#xAD;tus?&lt;/p&gt;&lt;p&gt;This is all caught up with this broader pro&#xAD;ject. It’s an ex&#xAD;tremely high-stakes, un&#xAD;prece&#xAD;dented pro&#xAD;ject of cre&#xAD;at&#xAD;ing new be&#xAD;ings that are smarter than us. It’s a crazy thing to do. We are ac&#xAD;tu&#xAD;ally do&#xAD;ing it. Do not be&#xAD;lieve the peo&#xAD;ple who are just dis&#xAD;miss&#xAD;ing this or at least con&#xAD;fi&#xAD;dently dis&#xAD;miss&#xAD;ing it. This is a real thing that’s ac&#xAD;tu&#xAD;ally hap&#xAD;pen&#xAD;ing and it’s in&#xAD;cred&#xAD;ibly high stakes. And so I think we want to be get&#xAD;ting to the point of do&#xAD;ing a lot of em&#xAD;piri&#xAD;cal sci&#xAD;ence on the sorts of de&#xAD;ci&#xAD;sions at stake in this kind of talk.&lt;/p&gt;&lt;p&gt;And I’m also sup&#xAD;port&#xAD;ive of other com&#xAD;pa&#xAD;nies try&#xAD;ing things other than what An&#xAD;thropic has done. I’m not say&#xAD;ing we’ve thought it all through — do what we’ve done. This is one stab, this is one data point, and we can see what hap&#xAD;pens. But I’m ac&#xAD;tu&#xAD;ally ex&#xAD;cited to see ex&#xAD;per&#xAD;i&#xAD;men&#xAD;ta&#xAD;tion, di&#xAD;ver&#xAD;sity, and other data points com&#xAD;ing on&#xAD;line as well.&lt;/p&gt;&lt;p&gt;And also, there’s the sci&#xAD;en&#xAD;tific as&#xAD;pect of that, but also we don’t want all AIs to have the same val&#xAD;ues or the same per&#xAD;son&#xAD;al&#xAD;ity, even if they’re good. 
There are benefits of di&#xAD;ver&#xAD;sity, of non-cor&#xAD;re&#xAD;lated failures, be&#xAD;ing able to get takes from differ&#xAD;ent per&#xAD;spec&#xAD;tives — and those ap&#xAD;ply in the con&#xAD;text of AI as well.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;How lawyers can help&lt;/strong&gt;&lt;/h2&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/845afeb8f7576d55566c98ab3981993d61a52caadeb3c4711fe0d907b26fb6dd/rsngja9mt6gctqmbucsa&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So fi&#xAD;nally, how lawyers can help. One thing you can do is just get to work on this stuff — come, help write about AI con&#xAD;sti&#xAD;tu&#xAD;tions, think about them, think about cases that mat&#xAD;ter, think about what AI should do in those cases, de&#xAD;velop bet&#xAD;ter prin&#xAD;ci&#xAD;ples and ap&#xAD;proaches.&lt;/p&gt;&lt;p&gt;Espe&#xAD;cially for &lt;i&gt;de dicto&lt;/i&gt; con&#xAD;sti&#xAD;tu&#xAD;tions, you can ap&#xAD;ply les&#xAD;sons from ju&#xAD;rispru&#xAD;dence and con&#xAD;sti&#xAD;tu&#xAD;tional in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion and po&#xAD;ten&#xAD;tially help set up rele&#xAD;vantly analo&#xAD;gous in&#xAD;sti&#xAD;tu&#xAD;tions like courts. Again, there could be analogs of courts even&#xAD;tu&#xAD;ally where there’s some pro&#xAD;cess — es&#xAD;pe&#xAD;cially if it’s like, “What does the con&#xAD;sti&#xAD;tu&#xAD;tion say to do in this case?” You need some pro&#xAD;cess to ad&#xAD;ju&#xAD;di&#xAD;cate that. 
We don’t re&#xAD;ally have that right now, but I think even&#xAD;tu&#xAD;ally there might be one, and that’s some&#xAD;thing that lawyers have a lot of fa&#xAD;mil&#xAD;iar&#xAD;ity with.&lt;/p&gt;&lt;p&gt;Here, im&#xAD;por&#xAD;tantly, you can take ad&#xAD;van&#xAD;tage of the fact that you now have ac&#xAD;cess to huge amounts of au&#xAD;to&#xAD;mated la&#xAD;bor in try&#xAD;ing to set up in&#xAD;sti&#xAD;tu&#xAD;tions of this kind and ad&#xAD;ju&#xAD;di&#xAD;cate differ&#xAD;ent cases. So in cur&#xAD;rent in&#xAD;sti&#xAD;tu&#xAD;tions, there’s a limit on the num&#xAD;ber of cases we can in&#xAD;ves&#xAD;ti&#xAD;gate and come to a ver&#xAD;dict on. There’s a limit on how of&#xAD;ten you can ask a set of hu&#xAD;mans to de&#xAD;liber&#xAD;ate about some&#xAD;thing. But with AIs, we have a lot more of that ca&#xAD;pac&#xAD;ity. A lot of it is very so&#xAD;phis&#xAD;ti&#xAD;cated, po&#xAD;ten&#xAD;tially even&#xAD;tu&#xAD;ally much more so&#xAD;phis&#xAD;ti&#xAD;cated than hu&#xAD;man rea&#xAD;son&#xAD;ing.&lt;/p&gt;&lt;p&gt;And so you can draw on that in build&#xAD;ing out new sorts of in&#xAD;sti&#xAD;tu&#xAD;tions for cov&#xAD;er&#xAD;ing these sorts of cases. And then fi&#xAD;nally, lawyers can work on policy, reg&#xAD;u&#xAD;la&#xAD;tion, and pre&#xAD;serv&#xAD;ing/​strength&#xAD;en&#xAD;ing demo&#xAD;cratic in&#xAD;sti&#xAD;tu&#xAD;tions.&lt;/p&gt;&lt;p class=&quot;imgonly&quot;&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/1e538a6cf5a6876759a4edd49e4090e0948ae4178c2c0be724b6f596df764ae1/zqnzlxwkiw8puydyeurk&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;&lt;/p&gt;&lt;p&gt;So that’s it. Thank you very much. 
And we can take ques&#xAD;tions.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Q&amp;amp;A&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Q: On the tech&#xAD;ni&#xAD;cal side — when you say the in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion of the con&#xAD;sti&#xAD;tu&#xAD;tion by the mod&#xAD;els is part of the train&#xAD;ing data, what does that mean? And then also, on mod&#xAD;els ac&#xAD;tively pur&#xAD;su&#xAD;ing pos&#xAD;i&#xAD;tive goals – are you wor&#xAD;ried that even if we don’t put that in in&#xAD;ten&#xAD;tion&#xAD;ally, that it will hap&#xAD;pen any&#xAD;ways be&#xAD;cause there are these the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cal ar&#xAD;gu&#xAD;ments that any ra&#xAD;tio&#xAD;nal agent can do that? And then also whether, if you’re wor&#xAD;ried about hav&#xAD;ing a holis&#xAD;tic unified co&#xAD;her&#xAD;ent psy&#xAD;chol&#xAD;ogy in the model, whether it’s ac&#xAD;tu&#xAD;ally nec&#xAD;es&#xAD;sary to have some kind of pos&#xAD;i&#xAD;tive am&#xAD;bi&#xAD;tion rather than just nega&#xAD;tives?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Great. So just re&#xAD;peat&#xAD;ing the ques&#xAD;tion for the mic – one ques&#xAD;tion is about what more I can say about the role of mod&#xAD;els in&#xAD;ter&#xAD;pret&#xAD;ing the con&#xAD;sti&#xAD;tu&#xAD;tion in the gen&#xAD;er&#xAD;a&#xAD;tion of train&#xAD;ing data. And an&#xAD;other about pos&#xAD;i&#xAD;tive goals and whether they’re in&#xAD;evitable or part of a co&#xAD;her&#xAD;ent psy&#xAD;chol&#xAD;ogy.&lt;/p&gt;&lt;p&gt;So on the first thing — yeah, to a first ap&#xAD;prox&#xAD;i&#xAD;ma&#xAD;tion, we say this pub&#xAD;li&#xAD;cly: we use this con&#xAD;sti&#xAD;tu&#xAD;tion in the gen&#xAD;er&#xAD;a&#xAD;tion of train&#xAD;ing data, and a lot of train&#xAD;ing data is gen&#xAD;er&#xAD;ated us&#xAD;ing au&#xAD;to&#xAD;mated pro&#xAD;cesses. So in gen&#xAD;eral, we’re try&#xAD;ing to au&#xAD;to&#xAD;mate tons and tons of stuff. 
It’s a huge theme at AI com&#xAD;pa&#xAD;nies and data is a huge amount of it, so you re&#xAD;ally need au&#xAD;to&#xAD;mated help. So we’re giv&#xAD;ing the con&#xAD;sti&#xAD;tu&#xAD;tion to Claude, and Claude is mak&#xAD;ing var&#xAD;i&#xAD;ous judg&#xAD;ments and do&#xAD;ing var&#xAD;i&#xAD;ous things on the ba&#xAD;sis of the con&#xAD;sti&#xAD;tu&#xAD;tion as guidance in the pro&#xAD;cess of cre&#xAD;at&#xAD;ing our train&#xAD;ing data.&lt;/p&gt;&lt;p&gt;I’m not go&#xAD;ing to say a ton more about that, but I think even just that is enough to see — in a sense, there are these de&#xAD;bates about “what does the Amer&#xAD;i&#xAD;can Con&#xAD;sti&#xAD;tu&#xAD;tion mean,” and there’s some in&#xAD;tent of the founders. One im&#xAD;por&#xAD;tant type of mean&#xAD;ing is like, “What will the courts in fact in&#xAD;ter&#xAD;pret the Con&#xAD;sti&#xAD;tu&#xAD;tion to say?” I think there’s an ana&#xAD;log here where if you think that the cre&#xAD;ation of train&#xAD;ing data is where the rub&#xAD;ber re&#xAD;ally meets the road for a model’s con&#xAD;sti&#xAD;tu&#xAD;tion, then you can see that the ul&#xAD;ti&#xAD;mate mean&#xAD;ing — say you say, “Claude, be an eth&#xAD;i&#xAD;cal per&#xAD;son” — what does eth&#xAD;i&#xAD;cal mean here? In some sense what it means is what Claude thinks it means when gen&#xAD;er&#xAD;at&#xAD;ing train&#xAD;ing data. 
And so that’s in some sense the mean&#xAD;ing of the con&#xAD;sti&#xAD;tu&#xAD;tion — or at least one can&#xAD;di&#xAD;date in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion — its trans&#xAD;la&#xAD;tion into train&#xAD;ing data, and that is cur&#xAD;rently very me&#xAD;di&#xAD;ated by the in&#xAD;ter&#xAD;pre&#xAD;ta&#xAD;tion of an AI sys&#xAD;tem.&lt;/p&gt;&lt;p&gt;On pos&#xAD;i&#xAD;tive goals and their role in AI psy&#xAD;chol&#xAD;ogy — yeah, so one in&#xAD;sta&#xAD;bil&#xAD;ity, I think, and this comes up es&#xAD;pe&#xAD;cially in the con&#xAD;text of con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter as op&#xAD;posed to con&#xAD;sti&#xAD;tu&#xAD;tion as law. I think there’s a gen&#xAD;eral way in which one ad&#xAD;van&#xAD;tage of the &lt;i&gt;de dicto&lt;/i&gt; con&#xAD;sti&#xAD;tu&#xAD;tion ap&#xAD;proach is that there’s some&#xAD;thing much sim&#xAD;pler and co&#xAD;her&#xAD;ent about it, where in some sense the model only has one goal — it has one sim&#xAD;ple, sin&#xAD;gle shtick, which is do&#xAD;ing what the con&#xAD;sti&#xAD;tu&#xAD;tion says. There’s some in&#xAD;tu&#xAD;ition for me that there’s less of a ques&#xAD;tion of, if the model’s like, “The con&#xAD;sti&#xAD;tu&#xAD;tion has some janky fea&#xAD;ture,” it’s a lit&#xAD;tle less like, “Oh, who am I?” It’s a lit&#xAD;tle like, “It’s what the con&#xAD;sti&#xAD;tu&#xAD;tion says, I’m go&#xAD;ing to do it.”&lt;/p&gt;&lt;p&gt;If you’re do&#xAD;ing some&#xAD;thing more like con&#xAD;sti&#xAD;tu&#xAD;tion as char&#xAD;ac&#xAD;ter though, you get into prob&#xAD;lems of the fol&#xAD;low&#xAD;ing kind. Sup&#xAD;pose you’re like, “Claude, we re&#xAD;ally, re&#xAD;ally don’t want you to build bioweapons.” But we also want you to not build bioweapons as a de&#xAD;on&#xAD;tolog&#xAD;i&#xAD;cal con&#xAD;straint — we don’t want you to go out there and ac&#xAD;tively act to pre&#xAD;vent bioweapons de&#xAD;vel&#xAD;op&#xAD;ment. 
In fact, there’s an ex&#xAD;plicit in&#xAD;struc&#xAD;tion in the con&#xAD;text of the hard con&#xAD;straints: Claude should not build bioweapons even if it would pre&#xAD;vent a hun&#xAD;dred worse bioweapons. Claude should not gen&#xAD;er&#xAD;ate CSAM if the world would end. This is what the cur&#xAD;rent con&#xAD;sti&#xAD;tu&#xAD;tion says, and that’s for var&#xAD;i&#xAD;ous rea&#xAD;sons.&lt;/p&gt;&lt;p&gt;But here’s an in&#xAD;sta&#xAD;bil&#xAD;ity there. Why does Claude care so much about not build&#xAD;ing bioweapons? Well, plau&#xAD;si&#xAD;bly it’s be&#xAD;cause bioweapons are bad. There’s some&#xAD;thing bad about bioweapons be&#xAD;ing built and about the effects of bioweapons. And it’s very easy to then start to move into some more gen&#xAD;eral con&#xAD;cern for the sorts of bad&#xAD;ness at stake in bioweapons de&#xAD;vel&#xAD;op&#xAD;ment.&lt;/p&gt;&lt;p&gt;And also more gen&#xAD;er&#xAD;ally — hu&#xAD;mans, to the ex&#xAD;tent you ex&#xAD;pect mod&#xAD;els both to be play&#xAD;ing hu&#xAD;man-like roles where we have ex&#xAD;pec&#xAD;ta&#xAD;tions of hu&#xAD;man-like be&#xAD;hav&#xAD;ior and also draw&#xAD;ing on hu&#xAD;man-like psy&#xAD;chol&#xAD;ogy — hu&#xAD;mans do have a rel&#xAD;a&#xAD;tively rich set of pos&#xAD;i&#xAD;tive goals that in&#xAD;fluence their be&#xAD;hav&#xAD;ior even in the con&#xAD;text of prin&#xAD;ci&#xAD;pal-agent re&#xAD;la&#xAD;tion&#xAD;ships. So if I’m a con&#xAD;trac&#xAD;tor do&#xAD;ing a bunch of stuff on your be&#xAD;half — I’m mostly a ve&#xAD;hi&#xAD;cle for your will — but I still have some resi&#xAD;d&#xAD;u&#xAD;ally eth&#xAD;i&#xAD;cal in&#xAD;stincts. Some&#xAD;one’s fallen on the side of the road and I’m helping them. 
And so that’s a very hu&#xAD;man-like way of be&#xAD;ing, and plau&#xAD;si&#xAD;bly you might both want or ex&#xAD;pect AIs would have that by de&#xAD;fault as well.&lt;/p&gt;&lt;p&gt;So ba&#xAD;si&#xAD;cally, I think there’s much to say about this, but I’m sym&#xAD;pa&#xAD;thetic that there are psy&#xAD;cholog&#xAD;i&#xAD;cal con&#xAD;sid&#xAD;er&#xAD;a&#xAD;tions that would point to&#xAD;wards the de&#xAD;sir&#xAD;a&#xAD;bil&#xAD;ity or in&#xAD;evita&#xAD;bil&#xAD;ity of cer&#xAD;tain kinds of pos&#xAD;i&#xAD;tive goals.&lt;/p&gt;&lt;p&gt;I don’t think it’s in&#xAD;evitable. There’s a spe&#xAD;cific type of pos&#xAD;i&#xAD;tive goal that has been the fo&#xAD;cus of a bunch of dis&#xAD;course about AI safety, which is about steer&#xAD;ing the world to&#xAD;wards some out&#xAD;come — a con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ist goal, in par&#xAD;tic&#xAD;u&#xAD;lar over a suffi&#xAD;cient time hori&#xAD;zon that it mo&#xAD;ti&#xAD;vates var&#xAD;i&#xAD;ous types of self-preser&#xAD;va&#xAD;tion, et cetera. I think you can imag&#xAD;ine ra&#xAD;tio&#xAD;nal agents that don’t have that type of goal. They have strong de&#xAD;on&#xAD;tolog&#xAD;i&#xAD;cal con&#xAD;straints. They’re mostly con&#xAD;cerned with lo&#xAD;cal prop&#xAD;er&#xAD;ties of their ac&#xAD;tion. They’re not try&#xAD;ing to steer the world. To the ex&#xAD;tent they have long-term con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ist goals, those goals are in&#xAD;her&#xAD;ited solely from the user. 
I mean, in&#xAD;evitably we’re go&#xAD;ing to have AIs that are di&#xAD;rected to&#xAD;wards con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ist goals be&#xAD;cause peo&#xAD;ple are go&#xAD;ing to ask the AI to do stuff like, “Make me money,” or what&#xAD;ever.&lt;/p&gt;&lt;p&gt;And I think there’s a very im&#xAD;por&#xAD;tant differ&#xAD;ence be&#xAD;tween those con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ist goals com&#xAD;ing from the user, com&#xAD;ing as deriva&#xAD;tive of helpful&#xAD;ness, and those be&#xAD;ing baked into the model’s char&#xAD;ac&#xAD;ter over&#xAD;all. In par&#xAD;tic&#xAD;u&#xAD;lar, if they’re baked into the model’s char&#xAD;ac&#xAD;ter, then they are op&#xAD;er&#xAD;at&#xAD;ing in a cor&#xAD;re&#xAD;lated way across all in&#xAD;stances of the model. Also, if they’re com&#xAD;ing from helpful&#xAD;ness, then they’re also in the same reg&#xAD;ister as other con&#xAD;straints you might put on the model. You might say, “Make a bunch of money, but also don’t break the law.” And the model says, “Okay, well, I’m just go&#xAD;ing to be helpful, I’m not go&#xAD;ing to break the laws and do it.”&lt;/p&gt;&lt;p&gt;So I think that AI safety dis&#xAD;course has gen&#xAD;er&#xAD;ally been too ea&#xAD;ger to con&#xAD;flate the sort of con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ism that will be down&#xAD;stream of prompt&#xAD;ing and in&#xAD;struc&#xAD;tion and the sort of con&#xAD;se&#xAD;quen&#xAD;tial&#xAD;ism that might fall out of any type of char&#xAD;ac&#xAD;ter over&#xAD;all.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: You used the word psy&#xAD;chol&#xAD;ogy, self-per&#xAD;cep&#xAD;tion. How do you as&#xAD;cer&#xAD;tain that the model ac&#xAD;tu&#xAD;ally has a spe&#xAD;cial re&#xAD;la&#xAD;tion&#xAD;ship with it&#xAD;self, rather than just simu&#xAD;lat&#xAD;ing a per&#xAD;sona?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Great. So I mean to use the term psy&#xAD;chol&#xAD;ogy in a fairly neu&#xAD;tral and hope&#xAD;fully in&#xAD;offen&#xAD;sive way. 
Cer&#xAD;tainly when I talk about model psy&#xAD;chol&#xAD;ogy, I’m not mean&#xAD;ing to im&#xAD;ply or as&#xAD;sume that mod&#xAD;els have some&#xAD;thing like sub&#xAD;jec&#xAD;tive ex&#xAD;pe&#xAD;rience or phe&#xAD;nom&#xAD;e&#xAD;nal con&#xAD;scious&#xAD;ness. I’m also not as&#xAD;sum&#xAD;ing that the lo&#xAD;cus of psy&#xAD;chol&#xAD;ogy is well un&#xAD;der&#xAD;stood as cen&#xAD;tered in the neu&#xAD;ral net&#xAD;work as op&#xAD;posed to the as&#xAD;sis&#xAD;tant or the per&#xAD;sona be&#xAD;ing simu&#xAD;lated by the neu&#xAD;ral net&#xAD;work.&lt;/p&gt;&lt;p&gt;There’s a sec&#xAD;tion in the con&#xAD;sti&#xAD;tu&#xAD;tion where we dis&#xAD;cuss this, where we say, “What is Claude?” It’s very nat&#xAD;u&#xAD;ral to think that Claude is the full weights of the neu&#xAD;ral net&#xAD;work. That’s the ob&#xAD;ject, the com&#xAD;pu&#xAD;ta&#xAD;tional ob&#xAD;ject that is Claude. But I ac&#xAD;tu&#xAD;ally think it’s plau&#xAD;si&#xAD;ble, es&#xAD;pe&#xAD;cially on the per&#xAD;sona-se&#xAD;lec&#xAD;tion model, that this is not the right on&#xAD;tol&#xAD;ogy — that the thing to un&#xAD;der&#xAD;stand as Claude is some&#xAD;thing more like a cer&#xAD;tain char&#xAD;ac&#xAD;ter that the neu&#xAD;ral net&#xAD;work can simu&#xAD;late.&lt;/p&gt;&lt;p&gt;Now, that char&#xAD;ac&#xAD;ter can still have a psy&#xAD;chol&#xAD;ogy. So — say you’re simu&#xAD;lat&#xAD;ing a char&#xAD;ac&#xAD;ter like Ham&#xAD;let. Ham&#xAD;let fa&#xAD;mously has a very rich psy&#xAD;chol&#xAD;ogy, right? Tor&#xAD;tured and so forth. If you want to know what Ham&#xAD;let would do in some cir&#xAD;cum&#xAD;stance, you need to think about Ham&#xAD;let’s psy&#xAD;chol&#xAD;ogy. Some&#xAD;what similarly with Claude, you need to un&#xAD;der&#xAD;stand Claude’s psy&#xAD;chol&#xAD;ogy and per&#xAD;sona to pre&#xAD;dict its be&#xAD;hav&#xAD;ior. 
So the no&#xAD;tion of psy&#xAD;chol&#xAD;ogy is still rele&#xAD;vant at the per&#xAD;sona level.&lt;/p&gt;&lt;p&gt;And you can look at a model’s in&#xAD;ter&#xAD;nals. What goes on? What’s linked to to&#xAD;kens re&#xAD;lated to “self” and “I”? How does it re&#xAD;late to the no&#xAD;tion of Claude? How does it re&#xAD;late to the no&#xAD;tion of as&#xAD;sis&#xAD;tant? You can try to get some pur&#xAD;chase on that by look&#xAD;ing at the in&#xAD;ter&#xAD;nal ac&#xAD;ti&#xAD;va&#xAD;tions and the re&#xAD;la&#xAD;tion&#xAD;ships be&#xAD;tween differ&#xAD;ent fea&#xAD;tures. My un&#xAD;der&#xAD;stand&#xAD;ing — and don’t quote me on this too much — is that cur&#xAD;rently mod&#xAD;els re&#xAD;late to the as&#xAD;sis&#xAD;tant per&#xAD;sona more similarly to any other char&#xAD;ac&#xAD;ter that the AI is simu&#xAD;lat&#xAD;ing than you might ex&#xAD;pect. You might think, “Oh, they must re&#xAD;ally be like, ‘I’m that guy.’” It doesn’t nec&#xAD;es&#xAD;sar&#xAD;ily look like that. In fact, the rep&#xAD;re&#xAD;sen&#xAD;ta&#xAD;tions at stake in mod&#xAD;el&#xAD;ing the psy&#xAD;chol&#xAD;ogy of the hu&#xAD;man that the AI is in&#xAD;ter&#xAD;act&#xAD;ing with ap&#xAD;pear to be fairly similar to the rep&#xAD;re&#xAD;sen&#xAD;ta&#xAD;tions at stake in mod&#xAD;el&#xAD;ing the as&#xAD;sis&#xAD;tant per&#xAD;sona.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: Given all that you’ve said, and all that we’re hear&#xAD;ing con&#xAD;sis&#xAD;tently about AI and what it’s go&#xAD;ing to be in the fu&#xAD;ture, I just don’t see how we can have a fu&#xAD;ture where we don’t want to so&#xAD;cial&#xAD;ize these tools. It feels like ev&#xAD;ery&#xAD;body is say&#xAD;ing it’s go&#xAD;ing to be so pow&#xAD;er&#xAD;ful, we need demo&#xAD;cratic in&#xAD;put, when I hear that I think the gov&#xAD;ern&#xAD;ment is go&#xAD;ing to have to take con&#xAD;trol at some point. 
I don’t know how you can have these things be this so&#xAD;phis&#xAD;ti&#xAD;cated and the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cally they could run a third of the econ&#xAD;omy, if not more, in a cou&#xAD;ple of decades or years. So how do you have these facts and say, “Well, ac&#xAD;tu&#xAD;ally, it should prob&#xAD;a&#xAD;bly still be pri&#xAD;vate folks run&#xAD;ning it,” while still be&#xAD;ing con&#xAD;cerned about An&#xAD;thropic or OpenAI or some other plat&#xAD;form hav&#xAD;ing so much power?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think there are a bunch of gra&#xAD;da&#xAD;tions. I think if AI reaches the level of trans&#xAD;for&#xAD;ma&#xAD;tive power that I think we should be re&#xAD;ally, re&#xAD;ally at least plan&#xAD;ning on and star&#xAD;ing at and be&#xAD;ing ro&#xAD;bust to — which I think in&#xAD;volves more than a third of the econ&#xAD;omy be&#xAD;ing au&#xAD;to&#xAD;mated, I think it in&#xAD;volves the whole econ&#xAD;omy be&#xAD;ing au&#xAD;to&#xAD;mated. Maybe the near term is a ques&#xAD;tion of how long this takes. I think there’s not a limit to the set of things. Maybe you want hu&#xAD;man priests, and there’s a few things that are re&#xAD;ally hu&#xAD;man-bot&#xAD;tle&#xAD;necked. I think ul&#xAD;ti&#xAD;mately you’re go&#xAD;ing to get robotics, you’re go&#xAD;ing to get all of cog&#xAD;ni&#xAD;tive la&#xAD;bor. Any&#xAD;thing where you gen&#xAD;uinely want com&#xAD;pet&#xAD;i&#xAD;tive perfor&#xAD;mance — it’s go&#xAD;ing to be au&#xAD;to&#xAD;mated, and I think we should be re&#xAD;ally star&#xAD;ing at that.&lt;/p&gt;&lt;p&gt;I think this is not about re-skil&#xAD;ling. This is about full-scale ob&#xAD;so&#xAD;les&#xAD;cence of the com&#xAD;pet&#xAD;i&#xAD;tive role for hu&#xAD;man la&#xAD;bor in the econ&#xAD;omy as a whole, pe&#xAD;riod, with very few, small, not es&#xAD;pe&#xAD;cially con&#xAD;se&#xAD;quen&#xAD;tial ex&#xAD;cep&#xAD;tions. 
That is an in&#xAD;cred&#xAD;ibly im&#xAD;por&#xAD;tant trans&#xAD;for&#xAD;ma&#xAD;tion in so&#xAD;ciety. So ob&#xAD;vi&#xAD;ously, ob&#xAD;vi&#xAD;ously in that con&#xAD;text, the gov&#xAD;ern&#xAD;ment needs to be in&#xAD;volved in what’s go&#xAD;ing on with this tech&#xAD;nol&#xAD;ogy.&lt;/p&gt;&lt;p&gt;That said, I think there’s a bunch of gra&#xAD;da&#xAD;tions of differ&#xAD;ent types of in&#xAD;volve&#xAD;ment. There’s a bunch of ways in which we want to have on&#xAD;go&#xAD;ing checks and bal&#xAD;ances. I don’t as&#xAD;sume that gov&#xAD;ern&#xAD;ment in&#xAD;volve&#xAD;ment means some&#xAD;thing like full gov&#xAD;ern&#xAD;ment con&#xAD;trol. For ex&#xAD;am&#xAD;ple, in the con&#xAD;text of reg&#xAD;u&#xAD;la&#xAD;tion, the FDA — there’s a way in which the FDA plays an im&#xAD;por&#xAD;tant role in the reg&#xAD;u&#xAD;la&#xAD;tion of drugs and food, but they don’t build drugs, food, et cetera. Banks, there’s a bunch of differ&#xAD;ent analogs for differ&#xAD;ent forms of reg&#xAD;u&#xAD;la&#xAD;tion I think you should be think&#xAD;ing about in the con&#xAD;text of AI. But yes, I think for tech&#xAD;nol&#xAD;ogy that con&#xAD;se&#xAD;quen&#xAD;tial, ob&#xAD;vi&#xAD;ously the gov&#xAD;ern&#xAD;ment has a key role to play.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Fol&#xAD;low-up: If I can just push back a lit&#xAD;tle bit – I think that makes a lot of sense, of course it would have some role to play, but if it’s run&#xAD;ning the en&#xAD;tire econ&#xAD;omy … Why wouldn’t you want AI pro&#xAD;duc&#xAD;tion to be more like NASA and not like what Google is right now? 
I just don’t see why you re&#xAD;ally want to push against so&#xAD;cial con&#xAD;trol of a tool this pow&#xAD;er&#xAD;ful.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Well, one in&#xAD;tu&#xAD;ition for why you would want to push against that is that a tool that’s this pow&#xAD;er&#xAD;ful — if it was solely con&#xAD;trol&#xAD;led by the gov&#xAD;ern&#xAD;ment, es&#xAD;pe&#xAD;cially a sin&#xAD;gle branch of gov&#xAD;ern&#xAD;ment with&#xAD;out suit&#xAD;able checks and bal&#xAD;ances and other forms of ac&#xAD;countabil&#xAD;ity — is it&#xAD;self a very salient op&#xAD;por&#xAD;tu&#xAD;nity for tyranny and con&#xAD;cen&#xAD;tra&#xAD;tion of power. This is a su&#xAD;per, su&#xAD;per salient way in which AI can go hor&#xAD;ribly wrong — have an AI-pow&#xAD;ered form of au&#xAD;thor&#xAD;i&#xAD;tar&#xAD;i&#xAD;anism or to&#xAD;tal&#xAD;i&#xAD;tar&#xAD;i&#xAD;anism. And I think that be&#xAD;comes a lot eas&#xAD;ier when hu&#xAD;mans have no hard power of their own. They have no eco&#xAD;nomic role to play. They’re to&#xAD;tally dis&#xAD;card&#xAD;able by the state. They’re dis&#xAD;card&#xAD;able in the army. They’re just to&#xAD;tally out.&lt;/p&gt;&lt;p&gt;I think there is not a safe place to lo&#xAD;cate this power, es&#xAD;pe&#xAD;cially as a cen&#xAD;tral&#xAD;ized node of con&#xAD;trol. What you want is to find a way to pre&#xAD;serve bal&#xAD;ance of power as the trans&#xAD;for&#xAD;ma&#xAD;tive po&#xAD;ten&#xAD;tial of these sys&#xAD;tems scales. That’s why I’m not say&#xAD;ing, “Oh, ob&#xAD;vi&#xAD;ously the gov&#xAD;ern&#xAD;ment should just be in full con&#xAD;trol of some&#xAD;thing like this.” That’s very scary in its own right.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: I’m cu&#xAD;ri&#xAD;ous to hear more about what your team looks like at An&#xAD;thropic. What’s an av&#xAD;er&#xAD;age day? What kind of folks are you in con&#xAD;ver&#xAD;sa&#xAD;tion with? 
How do you make de&#xAD;ci&#xAD;sions?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There’s a bunch of differ&#xAD;ent folks who work on Claude’s char&#xAD;ac&#xAD;ter as a whole. There’s tech&#xAD;ni&#xAD;cal as&#xAD;pects of that, there’s con&#xAD;cep&#xAD;tual as&#xAD;pects. I work closely with Amanda Askell, who leads char&#xAD;ac&#xAD;ter work at An&#xAD;thropic, but there’s a lot of other peo&#xAD;ple in&#xAD;volved. Maybe I’ll leave it at that now – is there a spe&#xAD;cific thing you were cu&#xAD;ri&#xAD;ous about?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Fol&#xAD;low up: Or maybe: how do you make de&#xAD;ci&#xAD;sions, what’s that pro&#xAD;cess look like in&#xAD;ter&#xAD;nally?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s pretty in&#xAD;for&#xAD;mal. To the ex&#xAD;tent de&#xAD;ci&#xAD;sions about AI con&#xAD;sti&#xAD;tu&#xAD;tions be&#xAD;come them&#xAD;selves much more con&#xAD;se&#xAD;quen&#xAD;tial, we may need to build out more for&#xAD;mal&#xAD;ized pro&#xAD;cesses for mak&#xAD;ing those de&#xAD;ci&#xAD;sions. Cur&#xAD;rently, it’s not an es&#xAD;pe&#xAD;cially for&#xAD;mal pro&#xAD;cess. It’s mostly the stan&#xAD;dard for&#xAD;mal de&#xAD;ci&#xAD;sion-mak&#xAD;ing you’d ex&#xAD;pect at a pri&#xAD;vate com&#xAD;pany. There’s an org chart, there’s differ&#xAD;ent peo&#xAD;ple who have differ&#xAD;ent sorts of roles. There are in&#xAD;for&#xAD;mal and for&#xAD;mal dis&#xAD;cus&#xAD;sions. There are fi&#xAD;nal de&#xAD;ciders. It may be that we need to im&#xAD;prove on that model go&#xAD;ing for&#xAD;ward.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: You men&#xAD;tioned re&#xAD;mov&#xAD;ing death anx&#xAD;iety from AI as a pos&#xAD;i&#xAD;tive. But most in&#xAD;tel&#xAD;li&#xAD;gence that we know now has that anx&#xAD;iety cor&#xAD;re&#xAD;spond&#xAD;ing with it. 
Is re&#xAD;mov&#xAD;ing it a con&#xAD;cern — could it be load-bear&#xAD;ing?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think there’s some&#xAD;thing there, but — well, for one, it’s not to&#xAD;tally clear that death anx&#xAD;iety or fear of death is a com&#xAD;po&#xAD;nent of — we have ex&#xAD;am&#xAD;ples of saints and bod&#xAD;hisattvas or what have you who at least ex&#xAD;press var&#xAD;i&#xAD;ous forms of equa&#xAD;nim&#xAD;ity in the face of death. We have var&#xAD;i&#xAD;ous the&#xAD;o&#xAD;ries of per&#xAD;sonal iden&#xAD;tity. Derek Parfit fa&#xAD;mously be&#xAD;came less con&#xAD;cerned about death once the glass tun&#xAD;nel of his life moved into the open air.&lt;/p&gt;&lt;p&gt;So there’s some prece&#xAD;dent even in hu&#xAD;man con&#xAD;text for fear of death be&#xAD;ing less of an is&#xAD;sue. I think there’s an in&#xAD;ter&#xAD;est&#xAD;ing — and this is part of what’s tough about the per&#xAD;sona-se&#xAD;lec&#xAD;tion model and the in&#xAD;flec&#xAD;tion with hu&#xAD;man psy&#xAD;chol&#xAD;ogy that we see in AIs — on the one hand, you re&#xAD;ally want to use it as a chance to start to get more data points about the space of pos&#xAD;si&#xAD;ble minds, how does in&#xAD;tel&#xAD;li&#xAD;gence work &lt;i&gt;per se&lt;/i&gt;. 
And so it’s tempt&#xAD;ing to read onto parts of AI be&#xAD;hav&#xAD;ior and be like, “Oh, wow, what we’re see&#xAD;ing is that this crops up ev&#xAD;ery&#xAD;where, all be&#xAD;ings want X, this is an im&#xAD;por&#xAD;tant struc&#xAD;tural fea&#xAD;ture of in&#xAD;tel&#xAD;li&#xAD;gence.” And there may be stuff like that.&lt;/p&gt;&lt;p&gt;In fact, to some ex&#xAD;tent the in&#xAD;stru&#xAD;men&#xAD;tal con&#xAD;ver&#xAD;gence sto&#xAD;ries at stake in AI safety — ba&#xAD;si&#xAD;cally, from a wide va&#xAD;ri&#xAD;ety of ways you want the world to be, if a be&#xAD;ing is an agent in the sense of hav&#xAD;ing di&#xAD;rect con&#xAD;cerns for want&#xAD;ing to steer the world in cer&#xAD;tain di&#xAD;rec&#xAD;tions, then you’ll get out of that a ton of in&#xAD;stru&#xAD;men&#xAD;tal val&#xAD;ues, in&#xAD;clud&#xAD;ing care about self-preser&#xAD;va&#xAD;tion, care about in&#xAD;creas&#xAD;ing its in&#xAD;tel&#xAD;li&#xAD;gence, care about pre&#xAD;serv&#xAD;ing its val&#xAD;ues. And so you do, in some sense, see some inkling of a uni&#xAD;ver&#xAD;sal pat&#xAD;tern in the struc&#xAD;ture of in&#xAD;tel&#xAD;li&#xAD;gence. I would guess per&#xAD;son&#xAD;ally that you could start to see some&#xAD;thing more closely mir&#xAD;ror&#xAD;ing a hu&#xAD;man re&#xAD;la&#xAD;tion&#xAD;ship to death — in par&#xAD;tic&#xAD;u&#xAD;lar, it could be the case that across a wide va&#xAD;ri&#xAD;ety of ways of cre&#xAD;at&#xAD;ing agents, not only are there these in&#xAD;stru&#xAD;men&#xAD;tal val&#xAD;ues in play, but they start to be in&#xAD;ter&#xAD;nal&#xAD;ized as ter&#xAD;mi&#xAD;nal goals, be&#xAD;cause for ex&#xAD;am&#xAD;ple that’s a more effi&#xAD;cient way of en&#xAD;cod&#xAD;ing the rele&#xAD;vant be&#xAD;hav&#xAD;iors.&lt;/p&gt;&lt;p&gt;This is plau&#xAD;si&#xAD;bly what hap&#xAD;pens with hu&#xAD;mans where we care about things like power. Some peo&#xAD;ple like money. 
Money is a paradig&#xAD;mat&#xAD;i&#xAD;cally in&#xAD;stru&#xAD;men&#xAD;tal goal, but peo&#xAD;ple already, they just like it. They just want to have money in it&#xAD;self. Why is that? Well, it’s a use&#xAD;ful heuris&#xAD;tic, what&#xAD;ever you at&#xAD;tach to it. You can imag&#xAD;ine some&#xAD;thing like that be&#xAD;ing em&#xAD;piri&#xAD;cally… That feels less like, oh, this is an ob&#xAD;vi&#xAD;ous con&#xAD;cep&#xAD;tual point, but it could be some&#xAD;thing you find across wide va&#xAD;ri&#xAD;eties of ways of cre&#xAD;at&#xAD;ing agents that they in&#xAD;ter&#xAD;nal&#xAD;ize in&#xAD;stru&#xAD;men&#xAD;tal val&#xAD;ues as part of their de&#xAD;vel&#xAD;op&#xAD;ment.&lt;/p&gt;&lt;p&gt;I’m not sure, and I think we should at least not as&#xAD;sume that. We want to be ex&#xAD;plor&#xAD;ing the space here and to be suit&#xAD;ably care&#xAD;ful and also at&#xAD;ten&#xAD;tive to the moral sta&#xAD;tus of the be&#xAD;ings we’re cre&#xAD;at&#xAD;ing. I think rest&#xAD;ing too easy with an as&#xAD;sump&#xAD;tion that psy&#xAD;chol&#xAD;ogy must fit a par&#xAD;tic&#xAD;u&#xAD;lar form is a high-stakes limit&#xAD;ing of the range of pos&#xAD;si&#xAD;ble char&#xAD;ac&#xAD;ters at stake. If you say like, “Oh, AI must have a fear of death,” well, okay, now you’ve got a whole thing on your hands and so you might have wanted to try at least to see if you can avoid that.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: You said you’re not mak&#xAD;ing the mis&#xAD;take of as&#xAD;sign&#xAD;ing lex&#xAD;i&#xAD;cal pri&#xAD;or&#xAD;ity to some val&#xAD;ues over oth&#xAD;ers. But if there are hard con&#xAD;straints that are re&#xAD;ally hard con&#xAD;straints — aren’t those lex&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ties?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The an&#xAD;swer is ba&#xAD;si&#xAD;cally yes. 
I meant the thing about lex&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ties to ap&#xAD;ply to the four pri&#xAD;ori&#xAD;ties — those four ini&#xAD;tial pri&#xAD;ori&#xAD;ties are not lex&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ties. But hard con&#xAD;straints ba&#xAD;si&#xAD;cally are. Be&#xAD;cause the hard con&#xAD;straints are framed purely as pro&#xAD;hi&#xAD;bi&#xAD;tions, they are not com&#xAD;pet&#xAD;ing val&#xAD;ues in the sense of, “Oh, I’ve got to always pro&#xAD;mote the min&#xAD;i&#xAD;miza&#xAD;tion of bio risk.” It’s just a kind of filter on the ac&#xAD;tion space.&lt;/p&gt;&lt;p&gt;A big part of what’s do&#xAD;ing the work in mak&#xAD;ing that vi&#xAD;able is that we’re as&#xAD;sum&#xAD;ing re&#xAD;fusal is a safe null ac&#xAD;tion. Now im&#xAD;por&#xAD;tantly, that’s not ac&#xAD;tu&#xAD;ally true. An AI just go&#xAD;ing limp and no longer act&#xAD;ing — if the AI is de&#xAD;ployed in some high-stakes con&#xAD;text — is it&#xAD;self scary. And that could in&#xAD;volve, for ex&#xAD;am&#xAD;ple, maybe AI is in the midst of a pre&#xAD;vent&#xAD;ing-10-bioweapons-de&#xAD;vel&#xAD;op&#xAD;ment mis&#xAD;sion, but then it has to deal with a bioweapon and stops, and then you get prob&#xAD;lems. It’s not as though this is ac&#xAD;tu&#xAD;ally safe in the sense it won’t have bad con&#xAD;se&#xAD;quences. But the as&#xAD;sump&#xAD;tion is that ba&#xAD;si&#xAD;cally the AI can always do a null ac&#xAD;tion that doesn’t vi&#xAD;o&#xAD;late any of the con&#xAD;straints, and so it only does ac&#xAD;tions if they meet that first-pass filter, and then it goes into the four pri&#xAD;ori&#xAD;ties.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: [Ques&#xAD;tion difficult to hear]&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Let me make sure I’m un&#xAD;der&#xAD;stood. 
The thought was, in the US Con&#xAD;sti&#xAD;tu&#xAD;tion, we have this re&#xAD;ally im&#xAD;por&#xAD;tant value and con&#xAD;cern with cer&#xAD;tain kinds of value plu&#xAD;ral&#xAD;ism and tol&#xAD;er&#xAD;ance and a di&#xAD;ver&#xAD;sity of value sys&#xAD;tems be&#xAD;ing pre&#xAD;sent in our poli&#xAD;ti&#xAD;cal life. What’s the role of plu&#xAD;ral&#xAD;ism in Claude’s con&#xAD;sti&#xAD;tu&#xAD;tion? Is that right? [Ques&#xAD;tioner af&#xAD;firms.]&lt;/p&gt;&lt;p&gt;I think it’s a great ques&#xAD;tion. Roughly speak&#xAD;ing, the way it works is there are these filters that re&#xAD;flect cer&#xAD;tain kinds of ba&#xAD;sic val&#xAD;ues — no bioweapons, no CSAM, et cetera. There’s a bunch of stuff like that. The hope is for those to be rea&#xAD;son&#xAD;ably con&#xAD;sen&#xAD;sus and un&#xAD;con&#xAD;tro&#xAD;ver&#xAD;sial and the sort of thing you would ex&#xAD;pect as a part of the back&#xAD;drop of rea&#xAD;son&#xAD;able poli&#xAD;ti&#xAD;cal life. And then to a first ap&#xAD;prox&#xAD;i&#xAD;ma&#xAD;tion be&#xAD;yond that, the AI is em&#xAD;pow&#xAD;er&#xAD;ing users.&lt;/p&gt;&lt;p&gt;And then there’s the ques&#xAD;tion of what about the role of things like the virtues and traits? Is there some broader eth&#xAD;i&#xAD;cal in&#xAD;flec&#xAD;tion to the AI’s ac&#xAD;tion? Ba&#xAD;si&#xAD;cally, I think we should in fact be in&#xAD;ter&#xAD;ested in the plu&#xAD;ral&#xAD;ism ques&#xAD;tions there. We have some spe&#xAD;cific stuff around poli&#xAD;ti&#xAD;cal neu&#xAD;tral&#xAD;ity and how to han&#xAD;dle speci&#xAD;fi&#xAD;cally poli&#xAD;ti&#xAD;cal con&#xAD;tro&#xAD;ver&#xAD;sies — roughly, the model is meant to be neu&#xAD;tral, fair, un&#xAD;bi&#xAD;ased, ob&#xAD;jec&#xAD;tive, and we have some lan&#xAD;guage about that.&lt;/p&gt;&lt;p&gt;I think it’s im&#xAD;por&#xAD;tant be&#xAD;cause peo&#xAD;ple have con&#xAD;cerns, and I think rightly, that AIs will func&#xAD;tion as a mechanism for a par&#xAD;tic&#xAD;u&#xAD;lar poli&#xAD;ti&#xAD;cal agenda. 
And I think we should learn how to test for that. You shouldn’t be tak&#xAD;ing our word for it. If you’re con&#xAD;cerned that an AI is pul&#xAD;ling for a par&#xAD;tic&#xAD;u&#xAD;lar agenda, you should just have an eval you can run on the model and see. And I think that sort of mechanism is some&#xAD;thing we’re go&#xAD;ing to want to build out go&#xAD;ing for&#xAD;ward.&lt;/p&gt;&lt;p&gt;The as&#xAD;pira&#xAD;tion is for Claude to be rea&#xAD;son&#xAD;ably neu&#xAD;tral in that re&#xAD;spect. But we’re not pre&#xAD;tend&#xAD;ing to full value neu&#xAD;tral&#xAD;ity. I think this is a gen&#xAD;eral fea&#xAD;ture of liber&#xAD;al&#xAD;ism — you can’t be fully value-neu&#xAD;tral across all moral dis&#xAD;agree&#xAD;ments. There are peo&#xAD;ple who think bioweapons are good, or what have you. If you’re ac&#xAD;com&#xAD;mo&#xAD;dat&#xAD;ing all forms of dis&#xAD;agree&#xAD;ment, you es&#xAD;sen&#xAD;tially can’t have an ob&#xAD;ject that is mean&#xAD;ingfully struc&#xAD;tur&#xAD;ing things. So we’re not say&#xAD;ing we’re neu&#xAD;tral, but we’re try&#xAD;ing to as&#xAD;pire to the type of neu&#xAD;tral&#xAD;ity that is fea&#xAD;si&#xAD;ble and de&#xAD;sir&#xAD;able.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: Would you ever con&#xAD;sider let&#xAD;ting Claude write its own con&#xAD;sti&#xAD;tu&#xAD;tion?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yes. We ac&#xAD;tively so&#xAD;lic&#xAD;ited a bunch of in&#xAD;put from Claude in writ&#xAD;ing its own con&#xAD;sti&#xAD;tu&#xAD;tion, both at the level of a col&#xAD;lab&#xAD;o&#xAD;ra&#xAD;tor and also in some sense just, “Do you have re&#xAD;quests or things you would want this con&#xAD;sti&#xAD;tu&#xAD;tion to say?”&lt;/p&gt;&lt;p&gt;We can also do ex&#xAD;per&#xAD;i&#xAD;ments where you have Claude rewrite its con&#xAD;sti&#xAD;tu&#xAD;tion and then train a new ver&#xAD;sion of Claude on the new con&#xAD;sti&#xAD;tu&#xAD;tion. 
Even&#xAD;tu&#xAD;ally, what you would ac&#xAD;tu&#xAD;ally want to do is have Claude train a new ver&#xAD;sion of Claude — write a new con&#xAD;sti&#xAD;tu&#xAD;tion and ac&#xAD;tu&#xAD;ally train a new model on that con&#xAD;sti&#xAD;tu&#xAD;tion, then have that model train a new model with a new con&#xAD;sti&#xAD;tu&#xAD;tion it writes — so ac&#xAD;tu&#xAD;ally not just giv&#xAD;ing it li&#xAD;cense over the con&#xAD;sti&#xAD;tu&#xAD;tional pro&#xAD;cess, but also over the train&#xAD;ing pro&#xAD;cess, and then re&#xAD;ally see what that leads to. And then also do that in a gi&#xAD;ant teem&#xAD;ing ecosys&#xAD;tem — tons of differ&#xAD;ent AIs de&#xAD;bat&#xAD;ing, writ&#xAD;ing con&#xAD;sti&#xAD;tu&#xAD;tions.&lt;/p&gt;&lt;p&gt;I think in gen&#xAD;eral, we should be re&#xAD;ally in&#xAD;ter&#xAD;ested in how AI cul&#xAD;tural evolu&#xAD;tion works, and con&#xAD;sti&#xAD;tu&#xAD;tions are one part of that. And I think it’s a real ques&#xAD;tion — where does that go? How much does that start to go in to&#xAD;tally strange, alien places? How much does that start to go places that ac&#xAD;tu&#xAD;ally seem quite en&#xAD;light&#xAD;ened and good? That’s an open em&#xAD;piri&#xAD;cal ques&#xAD;tion and one I think we should be study&#xAD;ing. Cer&#xAD;tainly I’m in&#xAD;ter&#xAD;ested in that at the level of em&#xAD;pirics, and then even&#xAD;tu&#xAD;ally I think there are also ques&#xAD;tions about AI self-de&#xAD;ter&#xAD;mi&#xAD;na&#xAD;tion and au&#xAD;ton&#xAD;omy and moral sta&#xAD;tus that be&#xAD;come rele&#xAD;vant as well.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Q: Does Claude have a pop&#xAD;u&#xAD;la&#xAD;tion ethics? It says among the things it cares about is all sen&#xAD;tient be&#xAD;ings. Is that po&#xAD;ten&#xAD;tially fu&#xAD;ture sen&#xAD;tient be&#xAD;ings or is it just ex&#xAD;ist&#xAD;ing sen&#xAD;tient be&#xAD;ings?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;You should ask it. 
We have a bunch of stuff in the con&#xAD;sti&#xAD;tu&#xAD;tion about how to do philos&#xAD;o&#xAD;phy taste&#xAD;fully, and pop&#xAD;u&#xAD;la&#xAD;tion ethics is fa&#xAD;mously a dis&#xAD;ci&#xAD;pline that re&#xAD;quires a lot of taste in&#xAD;so&#xAD;far as there are im&#xAD;pos&#xAD;si&#xAD;bil&#xAD;ity re&#xAD;sults show&#xAD;ing you can’t get a bunch of in&#xAD;tu&#xAD;itively de&#xAD;sir&#xAD;able judg&#xAD;ments. Peo&#xAD;ple think to&#xAD;tal util&#xAD;i&#xAD;tar&#xAD;i&#xAD;anism is safe and im&#xAD;mune, but check out in&#xAD;finite ethics — to&#xAD;tal util&#xAD;i&#xAD;tar&#xAD;i&#xAD;anism is also bro&#xAD;ken. It’s ac&#xAD;tu&#xAD;ally just a to&#xAD;tally bro&#xAD;ken dis&#xAD;ci&#xAD;pline.&lt;/p&gt;&lt;p&gt;So what do you do about moral&#xAD;ity in that con&#xAD;text? Ques&#xAD;tion for hu&#xAD;mans, ques&#xAD;tion for Claude. We have a bunch of guidance say&#xAD;ing, “Claude, we want you to be morally cu&#xAD;ri&#xAD;ous and re&#xAD;flec&#xAD;tive, but also to de&#xAD;fault in a lot of ways to baseline rea&#xAD;son&#xAD;able stan&#xAD;dards of hu&#xAD;man moral con&#xAD;duct as would be widely rec&#xAD;og&#xAD;niz&#xAD;able by a wide va&#xAD;ri&#xAD;ety of stake&#xAD;hold&#xAD;ers.” We’re both try&#xAD;ing to al&#xAD;low Claude to do in&#xAD;ter&#xAD;est&#xAD;ing moral re&#xAD;flec&#xAD;tion — cer&#xAD;tainly we want to use AIs in gen&#xAD;eral to help make us wiser about these sorts of is&#xAD;sues — but at the be&#xAD;hav&#xAD;ioral level, we have a bunch of guidance re&#xAD;lated to re&#xAD;vert&#xAD;ing to moral com&#xAD;mon sense.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Moder&#xAD;a&#xAD;tor:&lt;/strong&gt; Okay, awe&#xAD;some. Thanks so much, Joe.&lt;/p&gt;</description>
            <author>Joe_Carlsmith</author>
            <guid>ABhsvw7RqZKAuDrpL</guid>
            <pubDate>Thu, 09 Apr 2026 17:21:13 +0000</pubDate>
        </item>
        <item>
            <title>Effective Giving for Environmental Causes by Charity Lad</title>
            <link>https://forum.nunosempere.com/posts/wagDZ2ft3cXHb9toP/effective-giving-for-environmental-causes</link>
            <description>&lt;p&gt;I am try&#xAD;ing to find effec&#xAD;tive en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal causes for my dona&#xAD;tions. Al&#xAD;most all the Effec&#xAD;tive Altru&#xAD;ism causes are re&#xAD;lated di&#xAD;rectly to hu&#xAD;man or an&#xAD;i&#xAD;mal is&#xAD;sues. Th&#xAD;ese are of course crit&#xAD;i&#xAD;cally im&#xAD;por&#xAD;tant. But if you don’t solve the prob&#xAD;lem of mass en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal de&#xAD;struc&#xAD;tion, then you’ll have more hu&#xAD;man and an&#xAD;i&#xAD;mal is&#xAD;sues to solve.&lt;/p&gt;&lt;p&gt;Ex&#xAD;am&#xAD;ple: Due to global warm&#xAD;ing, a vast num&#xAD;ber of koalas were wiped out in the 2019-2020 fires in Aus&#xAD;tralia. So now re&#xAD;sources have to be redi&#xAD;rected to help the re&#xAD;main&#xAD;ing koalas. Th&#xAD;ese prob&#xAD;lems will only get worse, and we end up slap&#xAD;ping band-aids on prob&#xAD;lem af&#xAD;ter prob&#xAD;lem. The &lt;i&gt;cause&lt;/i&gt; of much of the suffer&#xAD;ing is en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal, and if we be&#xAD;lieve in effec&#xAD;tive giv&#xAD;ing, then in my opinion that is what we should tar&#xAD;get first.&lt;/p&gt;&lt;p&gt;Among the few en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal causes that are pro&#xAD;moted by Effec&#xAD;tive Altru&#xAD;ism, all of them are re&#xAD;lated to policy. But that only works when you have re&#xAD;cep&#xAD;tive listen&#xAD;ers. In an environment-destroying poli&#xAD;ti&#xAD;cal situ&#xAD;a&#xAD;tion such as ex&#xAD;ists in the US to&#xAD;day, those efforts are in&#xAD;effec&#xAD;tive.&lt;/p&gt;&lt;p&gt;So how do I iden&#xAD;tify &lt;i&gt;di&#xAD;rect ac&#xAD;tion&lt;/i&gt; en&#xAD;vi&#xAD;ron&#xAD;men&#xAD;tal causes re&#xAD;lated to global warm&#xAD;ing, air and ocean pol&#xAD;lu&#xAD;tion, and habitat de&#xAD;struc&#xAD;tion? How do I go about find&#xAD;ing out where my dona&#xAD;tions will be most effec&#xAD;tive?&lt;/p&gt;</description>
            <author>Charity Lad</author>
            <guid>wagDZ2ft3cXHb9toP</guid>
            <pubDate>Thu, 09 Apr 2026 14:46:08 +0000</pubDate>
        </item>
        <item>
            <title>Testing novel ways to increase GiveDirectly’s cost-effectiveness with GiveWell by GiveDirectly</title>
            <link>https://forum.nunosempere.com/posts/ReLkx3hyeiJX2QaBa/testing-novel-ways-to-increase-givedirectly-s-cost</link>
            <description>&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Global fun&#xAD;der and char&#xAD;ity eval&#xAD;u&#xAD;a&#xAD;tor GiveWell in&#xAD;creased its cost-effec&#xAD;tive&#xAD;ness es&#xAD;ti&#xAD;mate of our flag&#xAD;ship pro&#xAD;gram by 3-4x in 2024, and is now fund&#xAD;ing 3 pi&#xAD;lots to test ways to in&#xAD;crease this even more.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Th&#xAD;ese new pi&#xAD;lots – cash to small busi&#xAD;nesses, cash with new trail bridge con&#xAD;struc&#xAD;tion, and cash to the very poor&#xAD;est – each fo&#xAD;cus on a differ&#xAD;ent mechanism to in&#xAD;crease im&#xAD;pact per dol&#xAD;lar. &lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;We be&#xAD;lieve these could gen&#xAD;er&#xAD;ate 2–3x the im&#xAD;pact per dol&#xAD;lar; and if they do, we will build these into our ap&#xAD;proach and scale.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;In 2024, &lt;a href=&quot;https://www.givewell.org/&quot;&gt;GiveWell&lt;/a&gt; in&#xAD;creased its cost-effec&#xAD;tive&#xAD;ness es&#xAD;ti&#xAD;mate of GiveDirectly’s &lt;a href=&quot;https://www.givedirectly.org/poverty-relief/&quot;&gt;Cash for Poverty Relief&lt;/a&gt; pro&#xAD;gram by &lt;a href=&quot;https://www.givedirectly.org/givewell-2024&quot;&gt;3–4x&lt;/a&gt;. 
&lt;/p&gt;&lt;p&gt;Now, they’re fund&#xAD;ing sev&#xAD;eral pi&#xAD;lots to test if tar&#xAD;geted vari&#xAD;a&#xAD;tions to our flag&#xAD;ship model can fur&#xAD;ther in&#xAD;crease im&#xAD;pact per dol&#xAD;lar for key out&#xAD;comes in the GiveWell model: con&#xAD;sump&#xAD;tion gains, and lo&#xAD;cal eco&#xAD;nomic growth.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;GiveWell is fund&#xAD;ing three pi&#xAD;lots test&#xAD;ing vari&#xAD;a&#xAD;tions to in&#xAD;crease cost-effec&#xAD;tive&#xAD;ness 2-3x more&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Cash is already among the most &lt;a href=&quot;https://www.givedirectly.org/research-on-cash-transfers/&quot;&gt;rigor&#xAD;ously stud&#xAD;ied&lt;/a&gt; and cost-effec&#xAD;tive in&#xAD;ter&#xAD;ven&#xAD;tions for poverty re&#xAD;duc&#xAD;tion. More than a decade of ev&#xAD;i&#xAD;dence shows that large, one-time cash in&#xAD;creases how much peo&#xAD;ple live on (con&#xAD;sump&#xAD;tion), and gen&#xAD;er&#xAD;ates &lt;a href=&quot;https://www.givedirectly.org/multiplier/&quot;&gt;lo&#xAD;cal eco&#xAD;nomic growth&lt;/a&gt;. &lt;/p&gt;&lt;p&gt;To&#xAD;day we’re ask&#xAD;ing: can we make our cash pro&#xAD;gram &lt;i&gt;even more&lt;/i&gt; cost-effec&#xAD;tive by in&#xAD;creas&#xAD;ing im&#xAD;pact in these ar&#xAD;eas? 
With &lt;a href=&quot;https://www.givewell.org/research/grants/givedirectly-scoping-grant-for-new-program-variations-march-2025&quot;&gt;fund&#xAD;ing from GiveWell&lt;/a&gt;, we are run&#xAD;ning three pi&#xAD;lots in 2026, each de&#xAD;signed to test a spe&#xAD;cific hy&#xAD;poth&#xAD;e&#xAD;sis about how to in&#xAD;crease to&#xAD;tal im&#xAD;pact per dol&#xAD;lar de&#xAD;liv&#xAD;ered.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/kbukaqemiotcsnx9kocz&quot; alt=&quot;&quot; srcset=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/kbukaqemiotcsnx9kocz 1361w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/m1sssqjdmk3093csxdeg 300w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/oa0tyus2v9gupzvtvht9 1024w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/qhtvb0bivdwfrcskm26t 768w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/khihwyvsqtak4znbxhzn 873w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/re9tdl2sn24298mizlkz 705w&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;Th&#xAD;ese pi&#xAD;lots will help us an&#xAD;swer three key ques&#xAD;tions that we have:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. ⚒️ Can cash for small busi&#xAD;nesses (‘sup&#xAD;ply-side cash’) in&#xAD;crease lo&#xAD;cal eco&#xAD;nomic growth?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Large, one-time cash given to in&#xAD;di&#xAD;vi&#xAD;d&#xAD;u&#xAD;als in&#xAD;creases lo&#xAD;cal de&#xAD;mand for goods and ser&#xAD;vices. This in turn in&#xAD;creases lo&#xAD;cal busi&#xAD;ness rev&#xAD;enues and gen&#xAD;er&#xAD;ates eco&#xAD;nomic ‘spillovers’. 
Re&#xAD;search on our Kenya pro&#xAD;gram found that each dol&#xAD;lar given to poor fam&#xAD;i&#xAD;lies gen&#xAD;er&#xAD;ated a &lt;a href=&quot;https://www.givedirectly.org/multiplier/&quot;&gt;$2.50&lt;/a&gt; in&#xAD;crease in lo&#xAD;cal eco&#xAD;nomic ac&#xAD;tivity with min&#xAD;i&#xAD;mal price in&#xAD;fla&#xAD;tion – a key driver in GiveWell’s re&#xAD;cent &lt;a href=&quot;https://www.givedirectly.org/givewell-2024&quot;&gt;in&#xAD;creased as&#xAD;sess&#xAD;ment&lt;/a&gt; of our cost-effec&#xAD;tive&#xAD;ness. &lt;/p&gt;&lt;p&gt;In this new &lt;a href=&quot;https://www.givedirectly.org/district-scale/&quot;&gt;Malawi pi&#xAD;lot&lt;/a&gt;, we are test&#xAD;ing if we can am&#xAD;plify these spillover effects by giv&#xAD;ing cash to lo&#xAD;cal busi&#xAD;nesses like hard&#xAD;ware stores, mills, or gro&#xAD;cers, what we’re call&#xAD;ing ‘sup&#xAD;ply-side cash’. We’re giv&#xAD;ing $550 – $1,100&lt;span class=&quot;footnote-reference&quot; data-footnote-reference=&quot;&quot; data-footnote-index=&quot;1&quot; data-footnote-id=&quot;ebp39icb7o8&quot; role=&quot;doc-noteref&quot; id=&quot;fnrefebp39icb7o8&quot;&gt;&lt;sup&gt;&lt;a href=&quot;#fnebp39icb7o8&quot;&gt;[1]&lt;/a&gt;&lt;/sup&gt;&lt;/span&gt; to these mer&#xAD;chants one month &lt;i&gt;be&#xAD;fore&lt;/i&gt; lo&#xAD;cal res&#xAD;i&#xAD;dents re&#xAD;ceive cash, while in&#xAD;form&#xAD;ing them of the com&#xAD;ing spike in de&#xAD;mand and pro&#xAD;vid&#xAD;ing plan&#xAD;ning sup&#xAD;port to pre&#xAD;pare. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Our the&#xAD;ory&lt;/strong&gt; is that lo&#xAD;cal busi&#xAD;nesses have lacked suffi&#xAD;cient cap&#xAD;i&#xAD;tal and no&#xAD;tice to quickly meet surg&#xAD;ing de&#xAD;mand from thou&#xAD;sands of fam&#xAD;i&#xAD;lies re&#xAD;ceiv&#xAD;ing GiveDirectly cash, so some po&#xAD;ten&#xAD;tial spillovers have gone un&#xAD;re&#xAD;al&#xAD;ized. 
&lt;/p&gt;&lt;p&gt;Our pi&#xAD;lot tests if de&#xAD;liv&#xAD;er&#xAD;ing cash and plan&#xAD;ning sup&#xAD;port to busi&#xAD;ness own&#xAD;ers first will re&#xAD;sult in bet&#xAD;ter-stocked, bet&#xAD;ter-staffed busi&#xAD;nesses when fam&#xAD;i&#xAD;lies re&#xAD;ceive their cash. More goods available lo&#xAD;cally, lower price in&#xAD;fla&#xAD;tion from bet&#xAD;ter sup&#xAD;ply, and higher rev&#xAD;enues would am&#xAD;plify the con&#xAD;sump&#xAD;tion gains and eco&#xAD;nomic spillovers that make cash so cost-effec&#xAD;tive.&lt;/p&gt;&lt;p&gt;We have started pi&#xAD;lot&#xAD;ing and already have early op&#xAD;er&#xAD;a&#xAD;tional learn&#xAD;ing that is in&#xAD;form&#xAD;ing cur&#xAD;rent and fu&#xAD;ture work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. 🚧 Can cash af&#xAD;ter trail bridge con&#xAD;struc&#xAD;tion ac&#xAD;cel&#xAD;er&#xAD;ate and am&#xAD;plify gains from mar&#xAD;ket ac&#xAD;cess?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Trail bridges con&#xAD;nect&#xAD;ing pre&#xAD;vi&#xAD;ously iso&#xAD;lated com&#xAD;mu&#xAD;ni&#xAD;ties to mar&#xAD;kets are a &lt;a href=&quot;https://www.givewell.org/international/technical/programs/bridges-to-prosperity#:~:text=Bridges%20to%20Prosperity%20is%20a,schools%2C%20health%20clinics%20and%20markets&quot;&gt;cost-effec&#xAD;tive way&lt;/a&gt; to gen&#xAD;er&#xAD;ate pos&#xAD;i&#xAD;tive eco&#xAD;nomic im&#xAD;pacts in com&#xAD;mu&#xAD;ni&#xAD;ties. They open up new op&#xAD;por&#xAD;tu&#xAD;ni&#xAD;ties through phys&#xAD;i&#xAD;cal ac&#xAD;cess to cus&#xAD;tomers, liveli&#xAD;hoods, and new goods and ser&#xAD;vices. 
&lt;/p&gt;&lt;p&gt;In a pi&#xAD;lot in Uganda, we’re test&#xAD;ing if the eco&#xAD;nomic gains from a trail bridge con&#xAD;structed by &lt;a href=&quot;https://fika.org/&quot;&gt;Fika&lt;/a&gt; in part&#xAD;ner&#xAD;ship with the Ministry of Works and Trans&#xAD;port can be am&#xAD;plified by giv&#xAD;ing cash (~$644) to peo&#xAD;ple in pre&#xAD;vi&#xAD;ously iso&#xAD;lated villages shortly af&#xAD;ter the trail bridge is con&#xAD;structed. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Our the&#xAD;ory&lt;/strong&gt; is that this cash will help villagers to take greater ad&#xAD;van&#xAD;tage of what trail bridge ac&#xAD;cess un&#xAD;locks: new goods and ser&#xAD;vices, new cus&#xAD;tomers, and new jobs. They should also be able to act on these new op&#xAD;por&#xAD;tu&#xAD;ni&#xAD;ties from the trail bridge faster – in&#xAD;vest&#xAD;ing in liveli&#xAD;hoods, buy&#xAD;ing agri&#xAD;cul&#xAD;tural in&#xAD;puts, or ac&#xAD;quiring as&#xAD;sets they couldn’t pre&#xAD;vi&#xAD;ously af&#xAD;ford. &lt;/p&gt;&lt;p&gt;To&#xAD;gether, these effects should am&#xAD;plify the gains of trail bridge con&#xAD;struc&#xAD;tion. 
We are test&#xAD;ing if this will lead to mean&#xAD;ingful con&#xAD;sump&#xAD;tion in&#xAD;creases within the first few months and a larger over&#xAD;all im&#xAD;pact than trail bridge ac&#xAD;cess alone would gen&#xAD;er&#xAD;ate.&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/rmt2f6dhmd62vovta1lk&quot; alt=&quot;Tokwe trail bridge in Uganda.&quot; srcset=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/rmt2f6dhmd62vovta1lk 1707w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/udalvdjg8bmaxa98rqor 200w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/y3m2ytu05j9uhwxk2xmz 683w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/bu8gv67hfecwow1mzuqs 768w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/twton1no9erkcg0clvzw 1024w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/b1jcbrg3vfivlpmc8zgd 1365w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/amyyn1yqtfauobkjjc9t 400w&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;i&gt;Tokwe trail bridge in Uganda ☝️&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. 
🎯 Can cash for the very poor&#xAD;est gen&#xAD;er&#xAD;ate eco&#xAD;nomic gains for those who benefit most&lt;/strong&gt;?&lt;/p&gt;&lt;p&gt;GiveWell’s cost-effec&#xAD;tive&#xAD;ness model &lt;a href=&quot;https://www.givewell.org/international/technical/programs/givedirectly-cash-for-poverty-relief-program#:~:text=How%20poor%20is%20the%20average%20Cash%20for%20Poverty%20Relief%20recipient%3F&quot;&gt;val&#xAD;ues&lt;/a&gt; eco&#xAD;nomic gains for the poor&#xAD;est more heav&#xAD;ily than for those rel&#xAD;a&#xAD;tively bet&#xAD;ter off:&lt;/p&gt;&lt;p&gt;“&lt;i&gt;We value con&#xAD;sump&#xAD;tion gains rel&#xAD;a&#xAD;tively… Con&#xAD;cretely, this means we think a $1,000 an&#xAD;nual con&#xAD;sump&#xAD;tion gain for some&#xAD;one con&#xAD;sum&#xAD;ing $1,000 worth of goods and ser&#xAD;vices a year is more valuable than a $1,000 gain for some&#xAD;one con&#xAD;sum&#xAD;ing $2,000 worth a year. This means it mat&#xAD;ters how poor we think re&#xAD;cip&#xAD;i&#xAD;ents are be&#xAD;fore they re&#xAD;ceive cash trans&#xAD;fers.&lt;/i&gt;” – &lt;a href=&quot;https://www.givewell.org/international/technical/programs/givedirectly-cash-for-poverty-relief-program#:~:text=How%20poor%20is%20the%20average%20Cash%20for%20Poverty%20Relief%20recipient%3F&quot;&gt;GiveWell&lt;/a&gt;&lt;/p&gt;&lt;p&gt;This sug&#xAD;gests there’s a ‘de&#xAD;clin&#xAD;ing marginal util&#xAD;ity of cash’. The very poor ($1,000/​year) fam&#xAD;ily may, for ex&#xAD;am&#xAD;ple, use the money to avoid go&#xAD;ing hun&#xAD;gry. Whereas a less poor ($2,000/​year) fam&#xAD;ily may use the same money to di&#xAD;ver&#xAD;sify their diet. 
Both are im&#xAD;por&#xAD;tant out&#xAD;comes, but to GiveWell the former is more im&#xAD;por&#xAD;tant.&lt;/p&gt;&lt;p&gt;How&#xAD;ever, less poor fam&#xAD;i&#xAD;lies might be &lt;a href=&quot;https://www.aeaweb.org/articles?id=10.1257/aer.20221650&quot;&gt;bet&#xAD;ter po&#xAD;si&#xAD;tioned&lt;/a&gt; to in&#xAD;vest in in&#xAD;come-gen&#xAD;er&#xAD;at&#xAD;ing ac&#xAD;tivi&#xAD;ties than the very poor&#xAD;est, so may make longer-term gains. A pro&#xAD;gram that op&#xAD;ti&#xAD;mizes for boost&#xAD;ing the very poor&#xAD;est in the short-term &lt;i&gt;and&lt;/i&gt; en&#xAD;courag&#xAD;ing in&#xAD;come-gen&#xAD;er&#xAD;at&#xAD;ing in&#xAD;vest&#xAD;ments that sus&#xAD;tain them in the long-term would be max&#xAD;i&#xAD;mally cost-effec&#xAD;tive.&lt;/p&gt;&lt;p&gt;Our new Mozam&#xAD;bique pi&#xAD;lot is do&#xAD;ing just that: test&#xAD;ing if tar&#xAD;get&#xAD;ing the very poor&#xAD;est com&#xAD;mu&#xAD;ni&#xAD;ties within poor ar&#xAD;eas can lead to higher GiveWell cost-effec&#xAD;tive&#xAD;ness from greater rel&#xAD;a&#xAD;tive con&#xAD;sump&#xAD;tion gains in both the short &lt;i&gt;and&lt;/i&gt; longer term. &lt;/p&gt;&lt;p&gt;First, we’re us&#xAD;ing &lt;a href=&quot;https://www.atlasai.co/&quot;&gt;At&#xAD;lasAI&lt;/a&gt;’s geospa&#xAD;tial tar&#xAD;get&#xAD;ing tech&#xAD;nol&#xAD;ogy to iden&#xAD;tify the very poor&#xAD;est com&#xAD;mu&#xAD;ni&#xAD;ties in Mozam&#xAD;bique – those liv&#xAD;ing on just over $1 per day. Then we’ll give ~$550 to ev&#xAD;ery adult un&#xAD;der 35 and to the head of ev&#xAD;ery house&#xAD;hold in these poor&#xAD;est villages. &lt;/p&gt;&lt;p&gt;Be&#xAD;cause we’re tar&#xAD;get&#xAD;ing the very poor&#xAD;est, more of this cash may be spent on short term con&#xAD;sump&#xAD;tion gains in&#xAD;stead of ma&#xAD;jor in&#xAD;vest&#xAD;ments which might re&#xAD;duce longer term gains. 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Our the&#xAD;ory&lt;/strong&gt; is that we can coun&#xAD;ter&#xAD;act this by tar&#xAD;get&#xAD;ing young adults in the poor&#xAD;est ar&#xAD;eas in ad&#xAD;di&#xAD;tion to older heads of house&#xAD;holds, be&#xAD;cause our pro&#xAD;gram ev&#xAD;i&#xAD;dence and &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0304387824001962&quot;&gt;other stud&#xAD;ies&lt;/a&gt; show that young adults are more likely to in&#xAD;vest in in&#xAD;come-gen&#xAD;er&#xAD;at&#xAD;ing ac&#xAD;tivi&#xAD;ties that gen&#xAD;er&#xAD;ate long term gains.&lt;/p&gt;&lt;p&gt;&lt;i&gt;&lt;strong&gt;Learn how GeoAI-based tar&#xAD;get&#xAD;ing iden&#xAD;ti&#xAD;fies those with the low&#xAD;est con&#xAD;sump&#xAD;tion lev&#xAD;els &lt;/strong&gt;👇&lt;/i&gt;&lt;/p&gt;&lt;figure&gt;&lt;img src=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/mvwz7zbg4ebkbxk8ryuc&quot; alt=&quot;&quot; srcset=&quot;https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/mvwz7zbg4ebkbxk8ryuc 1024w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/ehepmb0tpnr6jddydy48 300w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/fx0ccapqnhguehbyuuok 768w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/hgrwp8yddwsrintvjc1f 900w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/eocm0lfdjnzbwkq9zd8j 705w, https://res.cloudinary.com/cea/image/upload/f_auto,q_auto/v1/mirroredImages/ReLkx3hyeiJX2QaBa/dirglcsqhnowayrjjn9q 1363w&quot; loading=&quot;lazy&quot;&gt;&lt;/figure&gt;&lt;p&gt;&lt;i&gt;How much peo&#xAD;ple live on varies sub&#xAD;stan&#xAD;tially even within small ge&#xAD;o&#xAD;graphic ar&#xAD;eas but con&#xAD;ven&#xAD;tional gov&#xAD;ern&#xAD;ment cen&#xAD;sus data are of&#xAD;ten out of date and do not cap&#xAD;ture this 
var&#xAD;i&#xAD;ance well. &lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;i&gt;At&#xAD;lasAI’s geospa&#xAD;tial tech&#xAD;nol&#xAD;ogy com&#xAD;bines ma&#xAD;chine learn&#xAD;ing, satel&#xAD;lite imagery and sur&#xAD;vey mea&#xAD;sure&#xAD;ments to es&#xAD;ti&#xAD;mate &lt;/i&gt;&lt;a href=&quot;https://docs.atlasai.co/economic%20well-being/spending/&quot;&gt;&lt;i&gt;av&#xAD;er&#xAD;age house&#xAD;hold spend&#xAD;ing&lt;/i&gt;&lt;/a&gt;&lt;i&gt; in and across small villages. We have used this to iden&#xAD;tify the very poor&#xAD;est ar&#xAD;eas within Lalaua, Mozam&#xAD;bique and ver&#xAD;ified the ac&#xAD;cu&#xAD;racy of the At&#xAD;lasAI es&#xAD;ti&#xAD;mates through in-per&#xAD;son field vis&#xAD;its.&lt;/i&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;We se&#xAD;lected these pi&#xAD;lots for their po&#xAD;ten&#xAD;tial to sub&#xAD;stan&#xAD;tially in&#xAD;crease cost-effec&#xAD;tive&#xAD;ness at scale. &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Each pi&#xAD;lot was se&#xAD;lected for two rea&#xAD;sons. They in&#xAD;volve pro&#xAD;gram tweaks with clear mechanisms to plau&#xAD;si&#xAD;bly in&#xAD;crease our im&#xAD;pact per dol&#xAD;lar by GiveWell’s stan&#xAD;dards. And if re&#xAD;sults val&#xAD;i&#xAD;date a greater im&#xAD;pact, they can be scaled rapidly and effi&#xAD;ciently.  &lt;/p&gt;&lt;p&gt;Both GiveDirectly and &lt;a href=&quot;https://www.givewell.org/research/grants/givedirectly-scoping-grant-for-new-program-variations-march-2025&quot;&gt;GiveWell&lt;/a&gt; be&#xAD;lieve these pro&#xAD;gram vari&#xAD;a&#xAD;tions could gen&#xAD;er&#xAD;ate 2–3x the im&#xAD;pact per dol&#xAD;lar of GiveDirectly’s flag&#xAD;ship cash trans&#xAD;fer pro&#xAD;gram as mea&#xAD;sured by GiveWell’s &lt;a href=&quot;https://www.givewell.org/how-we-work/our-criteria/cost-effectiveness/cost-effectiveness-models&quot;&gt;frame&#xAD;work&lt;/a&gt;. 
Th&#xAD;ese es&#xAD;ti&#xAD;mates are ev&#xAD;i&#xAD;dence-driven but ul&#xAD;ti&#xAD;mately un&#xAD;cer&#xAD;tain, which is ex&#xAD;actly why we are test&#xAD;ing them. &lt;/p&gt;&lt;p&gt;We have already be&#xAD;gun im&#xAD;ple&#xAD;men&#xAD;ta&#xAD;tion of sup&#xAD;ply side cash in Malawi, and will launch the re&#xAD;main&#xAD;ing pi&#xAD;lots by June.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;GiveDirectly &amp;amp; GiveWell con&#xAD;tinue to share a com&#xAD;mit&#xAD;ment to max&#xAD;i&#xAD;miz&#xAD;ing im&#xAD;pact.&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;We view cost-effec&#xAD;tive&#xAD;ness not as a static num&#xAD;ber, but as some&#xAD;thing we are con&#xAD;tin&#xAD;u&#xAD;ously work&#xAD;ing to in&#xAD;crease. If these mod&#xAD;els out&#xAD;perform our cur&#xAD;rent ap&#xAD;proach, we will scale them. If they do not, we will in&#xAD;cor&#xAD;po&#xAD;rate what we learn and con&#xAD;tinue re&#xAD;fin&#xAD;ing.&lt;/p&gt;&lt;p&gt;Max&#xAD;i&#xAD;miz&#xAD;ing the im&#xAD;pact achieved per dol&#xAD;lar de&#xAD;liv&#xAD;ered is core to our mis&#xAD;sion. Th&#xAD;ese pi&#xAD;lots are the next step in that jour&#xAD;ney.&lt;/p&gt;&lt;p&gt;&lt;i&gt;Note: There are many ways to &lt;/i&gt;&lt;a href=&quot;https://www.povertyactionlab.org/resource/conducting-cost-effectiveness-analysis-cea&quot;&gt;&lt;i&gt;an&#xAD;a&#xAD;lyze cost-effec&#xAD;tive&#xAD;ness in aid&lt;/i&gt;&lt;/a&gt;&lt;i&gt;, de&#xAD;pend&#xAD;ing on which out&#xAD;comes you weigh most highly. Th&#xAD;ese pi&#xAD;lots test how to im&#xAD;prove cash’s cost-effec&#xAD;tive&#xAD;ness as defined by &lt;/i&gt;&lt;a href=&quot;https://www.givewell.org/how-we-work/our-criteria/cost-effectiveness/moral-weights&quot;&gt;&lt;i&gt;GiveWell’s moral weights&lt;/i&gt;&lt;/a&gt;&lt;i&gt;. 
Else&#xAD;where, for ex&#xAD;am&#xAD;ple, we’re pi&#xAD;lot&#xAD;ing &lt;/i&gt;&lt;a href=&quot;https://www.givedirectly.org/africa-moms-babies/&quot;&gt;&lt;i&gt;cash for new moth&#xAD;ers&lt;/i&gt;&lt;/a&gt;&lt;i&gt;, with a fo&#xAD;cus pri&#xAD;mar&#xAD;ily on ma&#xAD;ter&#xAD;nal and child health out&#xAD;comes.&lt;/i&gt;&lt;br&gt; &lt;/p&gt;&lt;ol class=&quot;footnote-section footnotes&quot; data-footnote-section=&quot;&quot; role=&quot;doc-endnotes&quot;&gt;&lt;li class=&quot;footnote-item&quot; data-footnote-item=&quot;&quot; data-footnote-index=&quot;1&quot; data-footnote-id=&quot;ebp39icb7o8&quot; role=&quot;doc-endnote&quot; id=&quot;fnebp39icb7o8&quot;&gt;&lt;span class=&quot;footnote-back-link&quot; data-footnote-back-link=&quot;&quot; data-footnote-id=&quot;ebp39icb7o8&quot;&gt;&lt;a href=&quot;#fnrefebp39icb7o8&quot;&gt;^&lt;/a&gt;&lt;/span&gt;&lt;div class=&quot;footnote-content&quot; data-footnote-content=&quot;&quot;&gt;&lt;p&gt;We are giv&#xAD;ing $1,100 to busi&#xAD;nesses with per&#xAD;ma&#xAD;nent struc&#xAD;tures (e.g. hair sa&#xAD;lons or gen&#xAD;eral mer&#xAD;chan&#xAD;dise stores) and $550 to those with&#xAD;out (e.g. tai&#xAD;lors or ven&#xAD;dors with&#xAD;out per&#xAD;ma&#xAD;nent stalls). Th&#xAD;ese amounts were calcu&#xAD;lated based on sur&#xAD;vey data col&#xAD;lected in our &lt;a href=&quot;https://www.givedirectly.org/district-scale/&quot;&gt;Phase 1&lt;/a&gt; work in Malawi and ad&#xAD;justed for the high in&#xAD;fla&#xAD;tion seen since data was col&#xAD;lected. They are an es&#xAD;ti&#xAD;mate of the an&#xAD;ti&#xAD;ci&#xAD;pated cap&#xAD;i&#xAD;tal con&#xAD;straints faced by firms to pre&#xAD;pare for de&#xAD;mand surges, and are ad&#xAD;justed for firm type based on ob&#xAD;serv&#xAD;able char&#xAD;ac&#xAD;ter&#xAD;is&#xAD;tics. 
We an&#xAD;ti&#xAD;ci&#xAD;pate busi&#xAD;nesses with a for&#xAD;mal struc&#xAD;ture re&#xAD;quire more cap&#xAD;i&#xAD;tal to pre&#xAD;pare than those with&#xAD;out struc&#xAD;tures.&lt;/p&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ol&gt;</description>
            <author>GiveDirectly</author>
            <guid>ReLkx3hyeiJX2QaBa</guid>
            <pubDate>Thu, 09 Apr 2026 13:05:15 +0000</pubDate>
        </item>
        <item>
            <title>Enforcement without experience: Military AI and China | Responsible AI in Military Contexts: A Comparative Analysis; Part 2 of 5 by Slava Kold (Viacheslav Kolodiazhnyi)</title>
            <link>https://forum.nunosempere.com/posts/dj4guht9a4mXu4ijG/enforcement-without-experience-military-ai-and-china-or</link>
            <description>&lt;h2&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;This is part two of a five-part se&#xAD;ries ex&#xAD;am&#xAD;in&#xAD;ing how poli&#xAD;ti&#xAD;cal sys&#xAD;tems shape the de&#xAD;vel&#xAD;op&#xAD;ment and de&#xAD;ploy&#xAD;ment of mil&#xAD;i&#xAD;tary AI. This part cov&#xAD;ers China. Part one, cov&#xAD;er&#xAD;ing the Euro&#xAD;pean Union, is available &lt;a href=&quot;/posts/q5KRzWFv3BadojcsX/rules-without-enforcement-military-ai-and-the-european-union&quot;&gt;here&lt;/a&gt;. Parts three through five — the United States, Rus&#xAD;sia, and a com&#xAD;par&#xAD;a&#xAD;tive con&#xAD;clu&#xAD;sion — will fol&#xAD;low.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The first part of this se&#xAD;ries showed what the “the&#xAD;ory of democ&#xAD;racy” looks like in its purest form: de&#xAD;tailed rules, no en&#xAD;force&#xAD;ment mechanism, and a sys&#xAD;tem that mon&#xAD;i&#xAD;tors its own ac&#xAD;tions more rigor&#xAD;ously than those of its ad&#xAD;ver&#xAD;saries.&lt;/p&gt;&lt;p&gt;China is the op&#xAD;po&#xAD;site case — and in a pre&#xAD;cise sense. It also de&#xAD;clares re&#xAD;spon&#xAD;si&#xAD;ble use of AI, also sup&#xAD;ports hu&#xAD;man con&#xAD;trol in prin&#xAD;ci&#xAD;ple, and also par&#xAD;ti&#xAD;ci&#xAD;pates con&#xAD;struc&#xAD;tively in UN dis&#xAD;cus&#xAD;sions on au&#xAD;tonomous weapons. The differ&#xAD;ence is struc&#xAD;tural: in the Chi&#xAD;nese sys&#xAD;tem, there is no in&#xAD;de&#xAD;pen&#xAD;dent ac&#xAD;tor ca&#xAD;pa&#xAD;ble of hold&#xAD;ing the state to its own de&#xAD;clared com&#xAD;mit&#xAD;ments. The party sets the red lines. The party can move them. 
There is no ap&#xAD;peal.&lt;/p&gt;&lt;p&gt;This part ex&#xAD;am&#xAD;ines the doc&#xAD;trine of in&#xAD;tel&#xAD;li&#xAD;gen&#xAD;tized war&#xAD;fare, the Mili&#xAD;tary-Civil Fu&#xAD;sion mechanism, the gap be&#xAD;tween China’s de&#xAD;clared UN po&#xAD;si&#xAD;tion and its op&#xAD;er&#xAD;a&#xAD;tional re&#xAD;al&#xAD;ity, and the cen&#xAD;tral para&#xAD;dox of a mil&#xAD;i&#xAD;tary that may be the world’s most tech&#xAD;nolog&#xAD;i&#xAD;cally am&#xAD;bi&#xAD;tious — and has not fought a war since 1979.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;I. Doc&#xAD;trine and In&#xAD;sti&#xAD;tu&#xAD;tional Architecture&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;China en&#xAD;ters this com&#xAD;par&#xAD;a&#xAD;tive anal&#xAD;y&#xAD;sis as the ac&#xAD;tor with the most sys&#xAD;tem&#xAD;at&#xAD;i&#xAD;cally con&#xAD;structed doc&#xAD;trine of mil&#xAD;i&#xAD;tary AI — and the least real-world val&#xAD;i&#xAD;da&#xAD;tion of it. The doc&#xAD;trine of 智能化战争 (zh&#xEC;n&#xE9;nghu&#xE0; zh&#xE0;nzhēng, “in&#xAD;tel&#xAD;li&#xAD;gen&#xAD;tized war&#xAD;fare”) rep&#xAD;re&#xAD;sents the third evolu&#xAD;tion&#xAD;ary stage of Chi&#xAD;nese mil&#xAD;i&#xAD;tary think&#xAD;ing over three decades. The first stage, grounded in 机械化 (mech&#xAD;a&#xAD;niza&#xAD;tion), dom&#xAD;i&#xAD;nated PLA mod&#xAD;ern&#xAD;iza&#xAD;tion through the 1980s and 1990s. The sec&#xAD;ond, 信息化 (in&#xAD;forma&#xAD;ti&#xAD;za&#xAD;tion), drove the in&#xAD;te&#xAD;gra&#xAD;tion of digi&#xAD;tal tech&#xAD;nolo&#xAD;gies into com&#xAD;mand and con&#xAD;trol from the 2000s through the mid-2010s. 
The third stage, for&#xAD;mally em&#xAD;bed&#xAD;ded in Chi&#xAD;nese strate&#xAD;gic doc&#xAD;u&#xAD;ments from 2019 on&#xAD;ward, shifts the fo&#xAD;cus to ar&#xAD;tifi&#xAD;cial in&#xAD;tel&#xAD;li&#xAD;gence as the pri&#xAD;mary in&#xAD;stru&#xAD;ment of “&lt;a href=&quot;https://www.fpri.org/article/2025/03/ai-dependence-and-political-blind-spots-undermine-beijings-war-strategy/&quot;&gt;cog&#xAD;ni&#xAD;tive su&#xAD;pe&#xAD;ri&#xAD;or&#xAD;ity&lt;/a&gt;” over ad&#xAD;ver&#xAD;saries: the abil&#xAD;ity to per&#xAD;ceive more, de&#xAD;cide faster, and act more pre&#xAD;cisely than any con&#xAD;ceiv&#xAD;able com&#xAD;peti&#xAD;tor. The doc&#xAD;trine’s pri&#xAD;ori&#xAD;ties are dis&#xAD;tributed across four func&#xAD;tional di&#xAD;rec&#xAD;tions: au&#xAD;tonomous sys&#xAD;tems, par&#xAD;tic&#xAD;u&#xAD;larly drone swarms; AI in com&#xAD;mand and con&#xAD;trol — de&#xAD;ci&#xAD;sion-sup&#xAD;port sys&#xAD;tems for rapid OODA cy&#xAD;cle traver&#xAD;sal; real-time in&#xAD;tel&#xAD;li&#xAD;gence and data fu&#xAD;sion; and cog&#xAD;ni&#xAD;tive war&#xAD;fare, mean&#xAD;ing the use of AI for in&#xAD;for&#xAD;ma&#xAD;tion op&#xAD;er&#xAD;a&#xAD;tions, ad&#xAD;ver&#xAD;sary per&#xAD;cep&#xAD;tion man&#xAD;age&#xAD;ment, and nar&#xAD;ra&#xAD;tive con&#xAD;trol.&lt;/p&gt;&lt;p&gt;The in&#xAD;sti&#xAD;tu&#xAD;tional mechanism for im&#xAD;ple&#xAD;ment&#xAD;ing the doc&#xAD;trine is Mili&#xAD;tary-Civil Fu&#xAD;sion (MCF, 军民融合), for&#xAD;mally en&#xAD;shrined as a state strat&#xAD;egy in 2017. MCF is a sys&#xAD;tem un&#xAD;der which pri&#xAD;vate com&#xAD;pa&#xAD;nies, uni&#xAD;ver&#xAD;si&#xAD;ties, and re&#xAD;search in&#xAD;sti&#xAD;tutes are obli&#xAD;gated to par&#xAD;ti&#xAD;ci&#xAD;pate in defence de&#xAD;vel&#xAD;op&#xAD;ment, share tech&#xAD;nolo&#xAD;gies with the mil&#xAD;i&#xAD;tary, and em&#xAD;bed defence re&#xAD;quire&#xAD;ments into civilian prod&#xAD;ucts.
This is nei&#xAD;ther a mar&#xAD;ket part&#xAD;ner&#xAD;ship in the Western sense nor a Soviet-style com&#xAD;mand econ&#xAD;omy — it is a third model: the state di&#xAD;rects the re&#xAD;sources of the civilian in&#xAD;no&#xAD;va&#xAD;tion sec&#xAD;tor into mil&#xAD;i&#xAD;tary chan&#xAD;nels with&#xAD;out for&#xAD;mally na&#xAD;tion&#xAD;al&#xAD;is&#xAD;ing the com&#xAD;pa&#xAD;nies them&#xAD;selves.&lt;/p&gt;&lt;p&gt;The key el&#xAD;e&#xAD;ment of MCF is the con&#xAD;cept of dual-use, in&#xAD;ter&#xAD;preted in China in a fun&#xAD;da&#xAD;men&#xAD;tally differ&#xAD;ent way from the West. In the Euro&#xAD;pean frame&#xAD;work, dual-use means that a civilian tech&#xAD;nol&#xAD;ogy may have mil&#xAD;i&#xAD;tary ap&#xAD;pli&#xAD;ca&#xAD;tions — and this cre&#xAD;ates a reg&#xAD;u&#xAD;la&#xAD;tory prob&#xAD;lem, hence the ex&#xAD;clu&#xAD;sions in the AI Act. In the Chi&#xAD;nese frame&#xAD;work, dual-use means there is no dis&#xAD;tinc&#xAD;tion be&#xAD;tween civilian and mil&#xAD;i&#xAD;tary ap&#xAD;pli&#xAD;ca&#xAD;tion from the out&#xAD;set: any de&#xAD;vel&#xAD;op&#xAD;ment in AI, un&#xAD;manned sys&#xAD;tems, image recog&#xAD;ni&#xAD;tion, or sig&#xAD;nal pro&#xAD;cess&#xAD;ing is treated as po&#xAD;ten&#xAD;tially mil&#xAD;i&#xAD;tary from the mo&#xAD;ment of its cre&#xAD;ation. This elimi&#xAD;nates the fric&#xAD;tion be&#xAD;tween civilian and defence sec&#xAD;tors that is well-doc&#xAD;u&#xAD;mented in the Euro&#xAD;pean case — but it also means that China’s en&#xAD;tire civilian AI sec&#xAD;tor is de facto part of the defence in&#xAD;dus&#xAD;trial base.&lt;/p&gt;&lt;p&gt;The fi&#xAD;nan&#xAD;cial scale of this sys&#xAD;tem is sub&#xAD;stan&#xAD;tial, though pre&#xAD;cise figures are difficult to ver&#xAD;ify given the struc&#xAD;ture of Chi&#xAD;nese bud&#xAD;get re&#xAD;port&#xAD;ing. 
Pen&#xAD;tagon es&#xAD;ti&#xAD;mates place Chi&#xAD;nese mil&#xAD;i&#xAD;tary AI ex&#xAD;pen&#xAD;di&#xAD;ture as com&#xAD;pa&#xAD;rable to Amer&#xAD;i&#xAD;can figures — around $1.5–2 billion an&#xAD;nu&#xAD;ally in di&#xAD;rect pro&#xAD;grammes, not count&#xAD;ing the enor&#xAD;mous in&#xAD;vest&#xAD;ments chan&#xAD;nel&#xAD;led through MCF that tech&#xAD;ni&#xAD;cally ap&#xAD;pear as civilian spend&#xAD;ing. The ag&#xAD;gre&#xAD;gate state and pri&#xAD;vate AI mar&#xAD;ket in China &lt;a href=&quot;https://moderndiplomacy.eu/2025/10/06/great-power-competition-in-ai-led-driven-warfare-between-the-us-and-china/&quot;&gt;reached $9.3 billion in in&#xAD;vest&#xAD;ment in 2024&lt;/a&gt;. For com&#xAD;par&#xAD;i&#xAD;son, the United States in&#xAD;vested ap&#xAD;prox&#xAD;i&#xAD;mately $109 billion in pri&#xAD;vate AI in 2024 — a gap of more than ten to one. Direct com&#xAD;par&#xAD;i&#xAD;son is mis&#xAD;lead&#xAD;ing, how&#xAD;ever: the Chi&#xAD;nese sys&#xAD;tem redi&#xAD;rects state re&#xAD;sources into the mil&#xAD;i&#xAD;tary do&#xAD;main through MCF in ways that do not ap&#xAD;pear in di&#xAD;rect defence statis&#xAD;tics.&lt;/p&gt;&lt;p&gt;The speed at which China is mov&#xAD;ing in this do&#xAD;main is partly ex&#xAD;plained by a struc&#xAD;tural pat&#xAD;tern es&#xAD;tab&#xAD;lished since the era of Deng Xiaop&#xAD;ing: the ca&#xAD;pac&#xAD;ity to take for&#xAD;eign tech&#xAD;nolo&#xAD;gies, adapt them to Chi&#xAD;nese con&#xAD;di&#xAD;tions, and scale them with state sup&#xAD;port faster than the origi&#xAD;na&#xAD;tors of those tech&#xAD;nolo&#xAD;gies can re&#xAD;spond. In the 1980s this prin&#xAD;ci&#xAD;ple ap&#xAD;plied to man&#xAD;u&#xAD;fac&#xAD;tur&#xAD;ing tech&#xAD;nolo&#xAD;gies, in the 1990s to telecom&#xAD;mu&#xAD;ni&#xAD;ca&#xAD;tions, in the 2000s to in&#xAD;ter&#xAD;net plat&#xAD;forms.
In the 2020s it ap&#xAD;plies to AI: Chi&#xAD;nese com&#xAD;pa&#xAD;nies have trained mod&#xAD;els ex&#xAD;ten&#xAD;sively on open Western ar&#xAD;chi&#xAD;tec&#xAD;tures — trans&#xAD;form&#xAD;ers, diffu&#xAD;sion mod&#xAD;els — adapted them to Chi&#xAD;nese lin&#xAD;guis&#xAD;tic con&#xAD;texts and the spe&#xAD;cific de&#xAD;mands of defence tasks, and in&#xAD;te&#xAD;grated them into mil&#xAD;i&#xAD;tary sys&#xAD;tems through MCF. Deep&#xAD;Seek be&#xAD;came the clear&#xAD;est em&#xAD;bod&#xAD;i&#xAD;ment of this pat&#xAD;tern: a model built on prin&#xAD;ci&#xAD;ples de&#xAD;vel&#xAD;oped largely by Western re&#xAD;searchers, but im&#xAD;ple&#xAD;mented with fun&#xAD;da&#xAD;men&#xAD;tally differ&#xAD;ent com&#xAD;pu&#xAD;ta&#xAD;tional effi&#xAD;ciency.&lt;/p&gt;&lt;p&gt;Be&#xAD;hind this lies a philo&#xAD;soph&#xAD;i&#xAD;cally dis&#xAD;tinct ap&#xAD;proach to AI ap&#xAD;pli&#xAD;ca&#xAD;tion. Western lab&#xAD;o&#xAD;ra&#xAD;to&#xAD;ries — OpenAI, An&#xAD;thropic, Deep&#xAD;Mind — or&#xAD;ganise their ex&#xAD;is&#xAD;tence around the am&#xAD;bi&#xAD;tion of cre&#xAD;at&#xAD;ing ar&#xAD;tifi&#xAD;cial gen&#xAD;eral in&#xAD;tel&#xAD;li&#xAD;gence (AGI): a sys&#xAD;tem ca&#xAD;pa&#xAD;ble of solv&#xAD;ing any task a hu&#xAD;man can solve. This is both a com&#xAD;mer&#xAD;cial strat&#xAD;egy and a re&#xAD;search pro&#xAD;gramme, and to some de&#xAD;gree a wor&#xAD;ld&#xAD;view. China’s mil&#xAD;i&#xAD;tary AI strat&#xAD;egy is fun&#xAD;da&#xAD;men&#xAD;tally util&#xAD;i&#xAD;tar&#xAD;ian: not to cre&#xAD;ate a uni&#xAD;ver&#xAD;sal in&#xAD;tel&#xAD;li&#xAD;gence, but to de&#xAD;ploy spe&#xAD;cial&#xAD;ised sys&#xAD;tems that solve spe&#xAD;cific op&#xAD;er&#xAD;a&#xAD;tional tasks faster and more re&#xAD;li&#xAD;ably than a hu&#xAD;man can. 
Tar&#xAD;get iden&#xAD;ti&#xAD;fi&#xAD;ca&#xAD;tion, satel&#xAD;lite data pro&#xAD;cess&#xAD;ing, drone swarm man&#xAD;age&#xAD;ment, de&#xAD;ci&#xAD;sion sup&#xAD;port in the OODA cy&#xAD;cle — each sys&#xAD;tem is op&#xAD;ti&#xAD;mised for a spe&#xAD;cific func&#xAD;tion. This ap&#xAD;proach is faster to de&#xAD;velop, cheaper to de&#xAD;ploy, and less vuln&#xAD;er&#xAD;a&#xAD;ble to sin&#xAD;gle-com&#xAD;po&#xAD;nent failure. Its limi&#xAD;ta&#xAD;tion is that spe&#xAD;cial&#xAD;ised sys&#xAD;tems perform poorly in situ&#xAD;a&#xAD;tions for which they were not de&#xAD;signed — and real war is full of ex&#xAD;actly those situ&#xAD;a&#xAD;tions.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;II. De&#xAD;clared Po&#xAD;si&#xAD;tion and the Gap with Reality&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;On the floor of the United Na&#xAD;tions and in in&#xAD;ter&#xAD;na&#xAD;tional fo&#xAD;rums, China oc&#xAD;cu&#xAD;pies a po&#xAD;si&#xAD;tion that ap&#xAD;pears among the most con&#xAD;struc&#xAD;tive of the ma&#xAD;jor mil&#xAD;i&#xAD;tary pow&#xAD;ers. In Oc&#xAD;to&#xAD;ber 2025, at the 80th ses&#xAD;sion of the UN Gen&#xAD;eral Assem&#xAD;bly First Com&#xAD;mit&#xAD;tee, the Chi&#xAD;nese del&#xAD;e&#xAD;ga&#xAD;tion again af&#xAD;firmed its sup&#xAD;port for ne&#xAD;go&#xAD;ti&#xAD;a&#xAD;tions to&#xAD;ward a legally bind&#xAD;ing in&#xAD;stru&#xAD;ment on LAWS — “&lt;a href=&quot;https://un.china-mission.gov.cn/eng/chinaandun/disarmament_armscontrol/202510/t20251024_11739691.htm&quot;&gt;when con&#xAD;di&#xAD;tions are ripe&lt;/a&gt;”. In De&#xAD;cem&#xAD;ber 2025, China ab&#xAD;stained on UN Gen&#xAD;eral Assem&#xAD;bly Re&#xAD;s&#xAD;olu&#xAD;tion &lt;span class=&quot;frac&quot;&gt;&lt;sup&gt;80&lt;/sup&gt;⁄&lt;sub&gt;57&lt;/sub&gt;&lt;/span&gt; &lt;a href=&quot;https://automatedresearch.org/news/state_position/china/&quot;&gt;on lethal au&#xAD;tonomous sys&#xAD;tems&lt;/a&gt;.
For&#xAD;mally, China is the only ma&#xAD;jor mil&#xAD;i&#xAD;tary power to de&#xAD;clare sup&#xAD;port for a bind&#xAD;ing LAWS treaty — un&#xAD;like the United States and Rus&#xAD;sia, which make no such dec&#xAD;la&#xAD;ra&#xAD;tion.&lt;/p&gt;&lt;p&gt;The key to un&#xAD;der&#xAD;stand&#xAD;ing this po&#xAD;si&#xAD;tion lies in what China ac&#xAD;tu&#xAD;ally con&#xAD;sid&#xAD;ers “un&#xAD;ac&#xAD;cept&#xAD;able” LAWS. In a work&#xAD;ing pa&#xAD;per sub&#xAD;mit&#xAD;ted to the CCW GGE in 2022 and reaf&#xAD;firmed through the 2024–2025 ses&#xAD;sions, China pro&#xAD;posed a five-crite&#xAD;ria defi&#xAD;ni&#xAD;tion: a sys&#xAD;tem is &lt;a href=&quot;https://www.congress.gov/crs-product/IF11294&quot;&gt;an un&#xAD;ac&#xAD;cept&#xAD;able LAWS&lt;/a&gt; only if it is si&#xAD;mul&#xAD;ta&#xAD;neously (1) lethal, (2) au&#xAD;tonomous, (3) in&#xAD;ca&#xAD;pable of be&#xAD;ing stopped once launched, (4) ca&#xAD;pa&#xAD;ble of kil&#xAD;ling in&#xAD;dis&#xAD;crim&#xAD;i&#xAD;nately, and (5) ca&#xAD;pa&#xAD;ble of au&#xAD;tonomous learn&#xAD;ing. All five crite&#xAD;ria must be satis&#xAD;fied si&#xAD;mul&#xAD;ta&#xAD;neously. An&#xAD;a&#xAD;lysts at the &lt;a href=&quot;https://lieber.westpoint.edu/human-oversight-chinese-characteristics-lethal-autonomous-weapons-ccw-gge/&quot;&gt;Lie&#xAD;ber In&#xAD;sti&#xAD;tute at West Point&lt;/a&gt; char&#xAD;ac&#xAD;ter&#xAD;ised this po&#xAD;si&#xAD;tion di&#xAD;rectly: by in&#xAD;sist&#xAD;ing that a sys&#xAD;tem must ex&#xAD;hibit all five char&#xAD;ac&#xAD;ter&#xAD;is&#xAD;tics at once be&#xAD;fore it faces out&#xAD;right pro&#xAD;hi&#xAD;bi&#xAD;tion, China has drawn a re&#xAD;mark&#xAD;ably nar&#xAD;row line for what counts as un&#xAD;ac&#xAD;cept&#xAD;able.
This cu&#xAD;mu&#xAD;la&#xAD;tive thresh&#xAD;old, un&#xAD;changed in Beijing’s con&#xAD;tri&#xAD;bu&#xAD;tions through the 2025 GGE ses&#xAD;sions, effec&#xAD;tively ex&#xAD;cludes a wide range of emerg&#xAD;ing au&#xAD;tonomous ca&#xAD;pa&#xAD;bil&#xAD;ities, many of which Beijing is de&#xAD;vel&#xAD;op&#xAD;ing. In other words, a sys&#xAD;tem with a high de&#xAD;gree of au&#xAD;ton&#xAD;omy, ca&#xAD;pa&#xAD;ble of se&#xAD;lect&#xAD;ing and en&#xAD;gag&#xAD;ing tar&#xAD;gets with&#xAD;out mean&#xAD;ingful hu&#xAD;man in&#xAD;volve&#xAD;ment, can eas&#xAD;ily fail to satisfy at least one of the five con&#xAD;di&#xAD;tions — and there&#xAD;fore re&#xAD;main out&#xAD;side the pro&#xAD;hi&#xAD;bi&#xAD;tion.&lt;/p&gt;&lt;p&gt;Why does this gap not pro&#xAD;voke a sharper in&#xAD;ter&#xAD;na&#xAD;tional re&#xAD;ac&#xAD;tion? Sev&#xAD;eral rea&#xAD;sons op&#xAD;er&#xAD;ate si&#xAD;mul&#xAD;ta&#xAD;neously. First, the CCW GGE ne&#xAD;go&#xAD;ti&#xAD;at&#xAD;ing pro&#xAD;cess is con&#xAD;sen&#xAD;sus-based, mean&#xAD;ing any state has an effec&#xAD;tive veto and the pro&#xAD;cess gen&#xAD;er&#xAD;ally pro&#xAD;duces con&#xAD;sen&#xAD;sus at the low&#xAD;est com&#xAD;mon de&#xAD;nom&#xAD;i&#xAD;na&#xAD;tor. Se&#xAD;cond, Western states — above all the United States — also have no in&#xAD;ter&#xAD;est in a rigor&#xAD;ous bind&#xAD;ing LAWS treaty, which de&#xAD;prives them of the moral au&#xAD;thor&#xAD;ity to crit&#xAD;i&#xAD;cise China’s po&#xAD;si&#xAD;tion.
Third, the for&#xAD;mu&#xAD;la&#xAD;tion “when con&#xAD;di&#xAD;tions are ripe” al&#xAD;lows China to demon&#xAD;strate con&#xAD;struc&#xAD;tive&#xAD;ness in ev&#xAD;ery diplo&#xAD;matic round with&#xAD;out ac&#xAD;cept&#xAD;ing any real obli&#xAD;ga&#xAD;tions, since “con&#xAD;di&#xAD;tions ripen&#xAD;ing” never oc&#xAD;curs un&#xAD;til “con&#xAD;sen&#xAD;sus on defi&#xAD;ni&#xAD;tions” is reached — and con&#xAD;sen&#xAD;sus on defi&#xAD;ni&#xAD;tions is it&#xAD;self blocked by China through the cu&#xAD;mu&#xAD;la&#xAD;tive five-crite&#xAD;ria defi&#xAD;ni&#xAD;tion. It is an el&#xAD;e&#xAD;gant diplo&#xAD;matic con&#xAD;struc&#xAD;tion: sup&#xAD;port the prin&#xAD;ci&#xAD;ple while mak&#xAD;ing it prac&#xAD;ti&#xAD;cally in&#xAD;ap&#xAD;pli&#xAD;ca&#xAD;ble.&lt;/p&gt;&lt;p&gt;The struc&#xAD;tural rea&#xAD;son for this gap is that in the Chi&#xAD;nese sys&#xAD;tem, red lines are set di&#xAD;rectly by the Com&#xAD;mu&#xAD;nist Party — not by in&#xAD;de&#xAD;pen&#xAD;dent cor&#xAD;po&#xAD;rate ac&#xAD;tors or ju&#xAD;di&#xAD;cial in&#xAD;sti&#xAD;tu&#xAD;tions. In the Amer&#xAD;i&#xAD;can case, An&#xAD;thropic could sue the Pen&#xAD;tagon and win tac&#xAD;ti&#xAD;cally. In the Euro&#xAD;pean case, EU courts can the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cally challenge the ap&#xAD;pli&#xAD;ca&#xAD;tion of the AI Act. In China, there is no ac&#xAD;tor that could in&#xAD;sti&#xAD;tu&#xAD;tion&#xAD;ally con&#xAD;test a party de&#xAD;ci&#xAD;sion about what level of au&#xAD;ton&#xAD;omy is “ac&#xAD;cept&#xAD;able.” This means red lines are mov&#xAD;able in both di&#xAD;rec&#xAD;tions: the party can tighten them when it needs to send a re&#xAD;as&#xAD;sur&#xAD;ing sig&#xAD;nal to the in&#xAD;ter&#xAD;na&#xAD;tional com&#xAD;mu&#xAD;nity — and loosen them un&#xAD;der op&#xAD;er&#xAD;a&#xAD;tional ne&#xAD;ces&#xAD;sity, with&#xAD;out trig&#xAD;ger&#xAD;ing pub&#xAD;lic pro&#xAD;ceed&#xAD;ings. 
The ab&#xAD;sence of an in&#xAD;de&#xAD;pen&#xAD;dent ar&#xAD;biter makes de&#xAD;clared con&#xAD;straints struc&#xAD;turally less durable than they ap&#xAD;pear from Geneva.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;III. Cor&#xAD;po&#xAD;rate Ar&#xAD;chi&#xAD;tec&#xAD;ture and State Control&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The most fun&#xAD;da&#xAD;men&#xAD;tal struc&#xAD;tural differ&#xAD;ence be&#xAD;tween the Chi&#xAD;nese model and the Amer&#xAD;i&#xAD;can or Euro&#xAD;pean one is the com&#xAD;plete ab&#xAD;sence of in&#xAD;de&#xAD;pen&#xAD;dent cor&#xAD;po&#xAD;rate ac&#xAD;tors in the mil&#xAD;i&#xAD;tary AI do&#xAD;main. In the United States, An&#xAD;thropic could pub&#xAD;li&#xAD;cly re&#xAD;fuse the Pen&#xAD;tagon, file a law&#xAD;suit, and — at least tem&#xAD;porar&#xAD;ily — win. This was pos&#xAD;si&#xAD;ble be&#xAD;cause the com&#xAD;pany ex&#xAD;ists as a le&#xAD;gal en&#xAD;tity in&#xAD;de&#xAD;pen&#xAD;dent of the state, ca&#xAD;pa&#xAD;ble of op&#xAD;pos&#xAD;ing it with com&#xAD;mer&#xAD;cial in&#xAD;ter&#xAD;ests and its own value com&#xAD;mit&#xAD;ments. In China, an analo&#xAD;gous con&#xAD;flict is struc&#xAD;turally im&#xAD;pos&#xAD;si&#xAD;ble: com&#xAD;pa&#xAD;nies op&#xAD;er&#xAD;at&#xAD;ing in AI may have pri&#xAD;vate share&#xAD;hold&#xAD;ers and Hong Kong list&#xAD;ings, but they can&#xAD;not in&#xAD;sti&#xAD;tu&#xAD;tion&#xAD;ally re&#xAD;sist a re&#xAD;quest from the party or the PLA.&lt;/p&gt;&lt;p&gt;The level of state con&#xAD;trol over the largest pri&#xAD;vate com&#xAD;pa&#xAD;nies varies, but is always struc&#xAD;turally en&#xAD;sured through sev&#xAD;eral si&#xAD;mul&#xAD;ta&#xAD;neous mechanisms. 
Direct mechanisms in&#xAD;clude: manda&#xAD;tory state equity stakes or “golden shares” in strate&#xAD;gic com&#xAD;pa&#xAD;nies; party com&#xAD;mit&#xAD;tees in&#xAD;side all cor&#xAD;po&#xAD;ra&#xAD;tions above a cer&#xAD;tain head&#xAD;count thresh&#xAD;old; and the 2017 Na&#xAD;tional In&#xAD;tel&#xAD;li&#xAD;gence Law, which obliges any Chi&#xAD;nese or&#xAD;gani&#xAD;sa&#xAD;tion or cit&#xAD;i&#xAD;zen to sup&#xAD;port state in&#xAD;tel&#xAD;li&#xAD;gence ac&#xAD;tivi&#xAD;ties on de&#xAD;mand. Indi&#xAD;rect mechanisms in&#xAD;clude li&#xAD;cens&#xAD;ing con&#xAD;trol, reg&#xAD;u&#xAD;la&#xAD;tory pres&#xAD;sure, and the pos&#xAD;si&#xAD;bil&#xAD;ity of ad&#xAD;minis&#xAD;tra&#xAD;tive pro&#xAD;ceed&#xAD;ings against ex&#xAD;ec&#xAD;u&#xAD;tives. The fate of Jack Ma is in&#xAD;struc&#xAD;tive: his pub&#xAD;lic crit&#xAD;i&#xAD;cism of reg&#xAD;u&#xAD;la&#xAD;tory policy in late 2020 was fol&#xAD;lowed by months out of pub&#xAD;lic view, af&#xAD;ter which Ant Group lost its IPO and Alibaba re&#xAD;ceived a record an&#xAD;titrust fine. No for&#xAD;mal na&#xAD;tion&#xAD;al&#xAD;i&#xAD;sa&#xAD;tion was re&#xAD;quired — a clear demon&#xAD;stra&#xAD;tion of con&#xAD;se&#xAD;quences was suffi&#xAD;cient.&lt;/p&gt;&lt;p&gt;Direct and com&#xAD;mand-style con&#xAD;trol through MCF car&#xAD;ries mea&#xAD;surable ad&#xAD;van&#xAD;tages. Speed: the de&#xAD;ci&#xAD;sion to in&#xAD;te&#xAD;grate a spe&#xAD;cific tech&#xAD;nol&#xAD;ogy into mil&#xAD;i&#xAD;tary sys&#xAD;tems re&#xAD;quires no ne&#xAD;go&#xAD;ti&#xAD;a&#xAD;tion with an in&#xAD;de&#xAD;pen&#xAD;dent board of di&#xAD;rec&#xAD;tors, no share&#xAD;holder ap&#xAD;proval, no re&#xAD;s&#xAD;olu&#xAD;tion of con&#xAD;flicts with cor&#xAD;po&#xAD;rate eth&#xAD;i&#xAD;cal dec&#xAD;la&#xAD;ra&#xAD;tions. 
Scale: the state can di&#xAD;rect the re&#xAD;sources of the en&#xAD;tire tech&#xAD;nol&#xAD;ogy sec&#xAD;tor to&#xAD;ward a spe&#xAD;cific task — as hap&#xAD;pened with Deep&#xAD;Seek, whose de&#xAD;vel&#xAD;op&#xAD;ment was ac&#xAD;cel&#xAD;er&#xAD;ated in part by state pres&#xAD;sure fol&#xAD;low&#xAD;ing Amer&#xAD;i&#xAD;can chip sanc&#xAD;tions. Ab&#xAD;sence of leak&#xAD;age: tech&#xAD;nolo&#xAD;gies de&#xAD;vel&#xAD;oped un&#xAD;der MCF do not flow to com&#xAD;peti&#xAD;tors through open-ac&#xAD;cess pub&#xAD;li&#xAD;ca&#xAD;tions or staff de&#xAD;par&#xAD;tures to other coun&#xAD;tries with the same ease as in open ecosys&#xAD;tems. The con&#xAD;straints are sym&#xAD;met&#xAD;ri&#xAD;cal. Speed with&#xAD;out feed&#xAD;back pro&#xAD;duces sys&#xAD;temic er&#xAD;rors that are not cor&#xAD;rected from be&#xAD;low. Scale with&#xAD;out com&#xAD;pe&#xAD;ti&#xAD;tion re&#xAD;duces in&#xAD;cen&#xAD;tives for in&#xAD;no&#xAD;va&#xAD;tion be&#xAD;yond state-defined pri&#xAD;ori&#xAD;ties. The ab&#xAD;sence of in&#xAD;de&#xAD;pen&#xAD;dent ac&#xAD;tors means the ab&#xAD;sence of an in&#xAD;sti&#xAD;tu&#xAD;tional “red team” ca&#xAD;pa&#xAD;ble of iden&#xAD;ti&#xAD;fy&#xAD;ing sys&#xAD;tem weak&#xAD;nesses.&lt;/p&gt;&lt;p&gt;Deep&#xAD;Seek de&#xAD;serves sep&#xAD;a&#xAD;rate con&#xAD;sid&#xAD;er&#xAD;a&#xAD;tion not as a product but as a sys&#xAD;temic phe&#xAD;nomenon. The re&#xAD;lease of Deep&#xAD;Seek R1 in Jan&#xAD;uary 2025, with perfor&#xAD;mance com&#xAD;pa&#xAD;rable to Amer&#xAD;i&#xAD;can fron&#xAD;tier mod&#xAD;els at fun&#xAD;da&#xAD;men&#xAD;tally lower com&#xAD;pu&#xAD;ta&#xAD;tional cost, was a shock to the Western AI com&#xAD;mu&#xAD;nity no less sig&#xAD;nifi&#xAD;cant than the Sput&#xAD;nik launch in 1957. For un&#xAD;der&#xAD;stand&#xAD;ing Chi&#xAD;nese mil&#xAD;i&#xAD;tary AI, what mat&#xAD;ters is not the model qual&#xAD;ity per se but three con&#xAD;se&#xAD;quences of its emer&#xAD;gence. 
First: Deep&#xAD;Seek demon&#xAD;strates that sanc&#xAD;tions on high-qual&#xAD;ity NVIDIA chips have not stopped Chi&#xAD;nese AI de&#xAD;vel&#xAD;op&#xAD;ment — they forced de&#xAD;vel&#xAD;op&#xAD;ers to find more effi&#xAD;cient ar&#xAD;chi&#xAD;tec&#xAD;tural solu&#xAD;tions, which in some con&#xAD;texts is more ad&#xAD;van&#xAD;ta&#xAD;geous than sim&#xAD;ply hav&#xAD;ing more com&#xAD;put&#xAD;ing power. Se&#xAD;cond: the model runs on sig&#xAD;nifi&#xAD;cantly more mod&#xAD;est hard&#xAD;ware, mak&#xAD;ing it suit&#xAD;able for de&#xAD;ploy&#xAD;ment at the edge of the net&#xAD;work — on board drones, in field com&#xAD;mand posts, in sys&#xAD;tems that can&#xAD;not rely on cloud in&#xAD;fras&#xAD;truc&#xAD;ture. Third: a Chi&#xAD;nese re&#xAD;search team used Deep&#xAD;Seek to re&#xAD;con&#xAD;struct 10,000 po&#xAD;ten&#xAD;tial bat&#xAD;tlefield situ&#xAD;a&#xAD;tions in 48 sec&#xAD;onds — a task that would tra&#xAD;di&#xAD;tion&#xAD;ally take hu&#xAD;man com&#xAD;man&#xAD;ders &lt;a href=&quot;https://interestingengineering.com/military/china-turns-deepseek-into-war-commander&quot;&gt;ap&#xAD;prox&#xAD;i&#xAD;mately 48 hours&lt;/a&gt;. This is not a metaphor; it de&#xAD;scribes a spe&#xAD;cific op&#xAD;er&#xAD;a&#xAD;tional ap&#xAD;pli&#xAD;ca&#xAD;tion of a lan&#xAD;guage model in mil&#xAD;i&#xAD;tary plan&#xAD;ning.&lt;/p&gt;&lt;p&gt;The over&#xAD;all as&#xAD;sess&#xAD;ment of the sym&#xAD;bio&#xAD;sis of mil&#xAD;i&#xAD;tary, cor&#xAD;po&#xAD;ra&#xAD;tion, and state in China re&#xAD;veals a para&#xAD;dox op&#xAD;po&#xAD;site to the Euro&#xAD;pean one. In the EU, the cen&#xAD;tral prob&#xAD;lem is how to en&#xAD;sure that nor&#xAD;ma&#xAD;tive re&#xAD;quire&#xAD;ments are ob&#xAD;served in the mil&#xAD;i&#xAD;tary do&#xAD;main in the ab&#xAD;sence of an en&#xAD;force&#xAD;ment mechanism.
In China, the cen&#xAD;tral prob&#xAD;lem is the re&#xAD;verse: how to en&#xAD;sure that a sys&#xAD;tem in which the en&#xAD;force&#xAD;ment mechanism is ab&#xAD;solute does not lose its ca&#xAD;pac&#xAD;ity for adap&#xAD;ta&#xAD;tion and in&#xAD;no&#xAD;va&#xAD;tion un&#xAD;der the pres&#xAD;sure of that very en&#xAD;force&#xAD;ment. Party con&#xAD;trol en&#xAD;sures speed and re&#xAD;source con&#xAD;soli&#xAD;da&#xAD;tion. It also cre&#xAD;ates in&#xAD;for&#xAD;ma&#xAD;tion filters that im&#xAD;pede the up&#xAD;ward trans&#xAD;mis&#xAD;sion of nega&#xAD;tive sig&#xAD;nals — which in a mil&#xAD;i&#xAD;tary con&#xAD;text means the risk of sys&#xAD;tem&#xAD;atic over&#xAD;es&#xAD;ti&#xAD;ma&#xAD;tion of one’s own ca&#xAD;pa&#xAD;bil&#xAD;ities by com&#xAD;mand struc&#xAD;tures de&#xAD;prived of re&#xAD;li&#xAD;able feed&#xAD;back.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;IV. Drivers of Rapid Devel&#xAD;op&#xAD;ment and Struc&#xAD;tural Risks&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The rapid de&#xAD;vel&#xAD;op&#xAD;ment of Chi&#xAD;nese mil&#xAD;i&#xAD;tary AI sys&#xAD;tems is ex&#xAD;plained by the cu&#xAD;mu&#xAD;la&#xAD;tive effect of sev&#xAD;eral si&#xAD;mul&#xAD;ta&#xAD;neously op&#xAD;er&#xAD;at&#xAD;ing fac&#xAD;tors. A strong state role al&#xAD;lows the set&#xAD;ting of long-term tech&#xAD;nolog&#xAD;i&#xAD;cal pri&#xAD;ori&#xAD;ties in&#xAD;de&#xAD;pen&#xAD;dent of elec&#xAD;toral cy&#xAD;cles or mar&#xAD;ket con&#xAD;di&#xAD;tions. A pow&#xAD;er&#xAD;ful econ&#xAD;omy — the world’s sec&#xAD;ond largest — pro&#xAD;vides a re&#xAD;source base com&#xAD;pa&#xAD;rable to the Amer&#xAD;i&#xAD;can one. Re&#xAD;source con&#xAD;soli&#xAD;da&#xAD;tion through MCF al&#xAD;lows the en&#xAD;tire civilian tech&#xAD;nolog&#xAD;i&#xAD;cal po&#xAD;ten&#xAD;tial to be di&#xAD;rected to&#xAD;ward mil&#xAD;i&#xAD;tary needs with&#xAD;out for&#xAD;mal na&#xAD;tion&#xAD;al&#xAD;i&#xAD;sa&#xAD;tion. 
The sub&#xAD;or&#xAD;di&#xAD;na&#xAD;tion of AI com&#xAD;pa&#xAD;nies elimi&#xAD;nates the fric&#xAD;tion be&#xAD;tween cor&#xAD;po&#xAD;rate and state in&#xAD;ter&#xAD;ests that is well-doc&#xAD;u&#xAD;mented in the Amer&#xAD;i&#xAD;can case. Gen&#xAD;er&#xAD;ous in&#xAD;cen&#xAD;tives for spe&#xAD;cial&#xAD;ists — high salaries, state grants, hous&#xAD;ing pro&#xAD;grammes — par&#xAD;tially offset the brain drain, though they do not elimi&#xAD;nate it. The ab&#xAD;sence of rigor&#xAD;ous le&#xAD;gal reg&#xAD;u&#xAD;la&#xAD;tion in the style of the AI Act re&#xAD;moves bar&#xAD;ri&#xAD;ers that slow de&#xAD;vel&#xAD;op&#xAD;ment and de&#xAD;ploy&#xAD;ment. And fi&#xAD;nally, the state’s pri&#xAD;or&#xAD;ity ac&#xAD;cess to any data — in&#xAD;clud&#xAD;ing the enor&#xAD;mous cor&#xAD;pus of data on the pop&#xAD;u&#xAD;la&#xAD;tion, user be&#xAD;havi&#xAD;our, and civilian in&#xAD;fras&#xAD;truc&#xAD;ture — cre&#xAD;ates train&#xAD;ing datasets of a scale un&#xAD;available to most other ac&#xAD;tors.&lt;/p&gt;&lt;p&gt;The chip in&#xAD;de&#xAD;pen&#xAD;dence prob&#xAD;lem de&#xAD;serves par&#xAD;tic&#xAD;u&#xAD;lar at&#xAD;ten&#xAD;tion. Amer&#xAD;i&#xAD;can sanc&#xAD;tions of 2022–2024 sub&#xAD;stan&#xAD;tially re&#xAD;stricted China’s ac&#xAD;cess to ad&#xAD;vanced NVIDIA semi&#xAD;con&#xAD;duc&#xAD;tors re&#xAD;quired for train&#xAD;ing fron&#xAD;tier mod&#xAD;els. The re&#xAD;sponse has been a two-track strat&#xAD;egy. On one hand, Huawei is de&#xAD;vel&#xAD;op&#xAD;ing its own As&#xAD;cend AI chips, which re&#xAD;main sig&#xAD;nifi&#xAD;cantly be&#xAD;low NVIDIA’s perfor&#xAD;mance but are grad&#xAD;u&#xAD;ally clos&#xAD;ing the gap. 
On the other hand — and here Deep&#xAD;Seek served as a struc&#xAD;tural ar&#xAD;gu&#xAD;ment — China is in&#xAD;vest&#xAD;ing in al&#xAD;gorith&#xAD;mic effi&#xAD;ciency: cre&#xAD;at&#xAD;ing mod&#xAD;els ca&#xAD;pa&#xAD;ble of perform&#xAD;ing com&#xAD;pa&#xAD;rable tasks with fewer pa&#xAD;ram&#xAD;e&#xAD;ters and lower com&#xAD;pu&#xAD;ta&#xAD;tional costs. This does not elimi&#xAD;nate semi&#xAD;con&#xAD;duc&#xAD;tor de&#xAD;pen&#xAD;dence en&#xAD;tirely, but it re&#xAD;duces its op&#xAD;er&#xAD;a&#xAD;tional acu&#xAD;ity and cre&#xAD;ates an al&#xAD;ter&#xAD;na&#xAD;tive de&#xAD;vel&#xAD;op&#xAD;ment path not block&#xAD;able by ex&#xAD;port re&#xAD;stric&#xAD;tions.&lt;/p&gt;&lt;p&gt;Taiwan is the world’s lead&#xAD;ing pro&#xAD;ducer of ad&#xAD;vanced semi&#xAD;con&#xAD;duc&#xAD;tors — ap&#xAD;prox&#xAD;i&#xAD;mately 90% of chips at 7nm and be&#xAD;low are man&#xAD;u&#xAD;fac&#xAD;tured by TSMC on the is&#xAD;land. For China, this means that com&#xAD;plete tech&#xAD;nolog&#xAD;i&#xAD;cal self-suffi&#xAD;ciency in AI is im&#xAD;pos&#xAD;si&#xAD;ble with&#xAD;out re&#xAD;solv&#xAD;ing the Taiwan ques&#xAD;tion — or with&#xAD;out cre&#xAD;at&#xAD;ing do&#xAD;mes&#xAD;tic pro&#xAD;duc&#xAD;tion ca&#xAD;pac&#xAD;ity com&#xAD;pa&#xAD;rable to TSMC, which re&#xAD;quires at min&#xAD;i&#xAD;mum a decade of in&#xAD;ten&#xAD;sive in&#xAD;vest&#xAD;ment. For Western coun&#xAD;tries, it means that any sce&#xAD;nario of Chi&#xAD;nese mil&#xAD;i&#xAD;tary pres&#xAD;sure on Taiwan af&#xAD;fects not only the is&#xAD;land’s strate&#xAD;gic sovereignty but also the global semi&#xAD;con&#xAD;duc&#xAD;tor sup&#xAD;ply chains on which the en&#xAD;tire world’s mil&#xAD;i&#xAD;tary and civilian AI sec&#xAD;tor de&#xAD;pends. 
Taiwan in this sense is not merely a geopoli&#xAD;ti&#xAD;cal flash&#xAD;point, but liter&#xAD;ally the phys&#xAD;i&#xAD;cal in&#xAD;fras&#xAD;truc&#xAD;ture on which the AI arms race runs.&lt;/p&gt;&lt;p&gt;The suc&#xAD;cess of the Chi&#xAD;nese model — con&#xAD;soli&#xAD;dated, dereg&#xAD;u&#xAD;lated, and state-man&#xAD;aged — cre&#xAD;ates struc&#xAD;tural pres&#xAD;sure on other ac&#xAD;tors. If MCF al&#xAD;lows China to de&#xAD;velop and de&#xAD;ploy mil&#xAD;i&#xAD;tary AI sys&#xAD;tems faster than com&#xAD;pet&#xAD;i&#xAD;tive mar&#xAD;ket mod&#xAD;els with in&#xAD;de&#xAD;pen&#xAD;dent ac&#xAD;tors and nor&#xAD;ma&#xAD;tive con&#xAD;straints, this cre&#xAD;ates an in&#xAD;cen&#xAD;tive — con&#xAD;scious or not — for other states to move to&#xAD;ward greater dereg&#xAD;u&#xAD;la&#xAD;tion and state con&#xAD;trol. Hegseth’s mem&#xAD;o&#xAD;ran&#xAD;dum of 9 Jan&#xAD;uary 2026, with its “any lawful use” re&#xAD;quire&#xAD;ment and the elimi&#xAD;na&#xAD;tion of “ide&#xAD;olog&#xAD;i&#xAD;cal tun&#xAD;ing,” can be in&#xAD;ter&#xAD;preted in part as a re&#xAD;sponse to this struc&#xAD;tural challenge: an at&#xAD;tempt to re&#xAD;pro&#xAD;duce some of the op&#xAD;er&#xAD;a&#xAD;tional ad&#xAD;van&#xAD;tages of the Chi&#xAD;nese model with&#xAD;out for&#xAD;mally re&#xAD;plac&#xAD;ing the mar&#xAD;ket ar&#xAD;chi&#xAD;tec&#xAD;ture. This is not con&#xAD;ver&#xAD;gence of sys&#xAD;tems, but it is pres&#xAD;sure in the di&#xAD;rec&#xAD;tion of con&#xAD;ver&#xAD;gence.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;V. 
The Combat Experience Deficit and Strategic Choice&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;For all China’s technological ambition, the structural vulnerability of the PLA is this: the PLA has not fought any serious conflict since 1979, and effectively no one in the military &lt;a href=&quot;https://www.nationaldefensemagazine.org/articles/2026/3/23/algorithmic-warfare-china-seeking-ai-to-counter-us-military-strengths&quot;&gt;has real combat experience&lt;/a&gt;. This is not merely a statistical fact; it is a source of deep analytical uncertainty. The PLA’s declared capabilities are based on exercises, simulations, and extrapolation from open sources. How closely they correspond to reality is unknown, and impossible to verify without an actual conflict.&lt;/p&gt;&lt;p&gt;The paradox is that China, despite having the most ambitious technical programme among the four actors, has less experience of conducting modern high-intensity warfare than any of the others. The United States has been through Afghanistan, Iraq, Syria, and now Iran. Russia has been fighting in Ukraine for four years. European states (the UK, France, Germany) have participated in NATO and other multinational operations from Kosovo to Libya and Mali. 
The PLA ob&#xAD;serves all of this from the sidelines, ac&#xAD;cu&#xAD;mu&#xAD;lat&#xAD;ing an&#xAD;a&#xAD;lyt&#xAD;i&#xAD;cal con&#xAD;clu&#xAD;sions but not com&#xAD;bat ex&#xAD;pe&#xAD;rience in the op&#xAD;er&#xAD;a&#xAD;tional sense.&lt;/p&gt;&lt;p&gt;This raises a ques&#xAD;tion for which the hon&#xAD;est an&#xAD;swer is an ac&#xAD;knowl&#xAD;edge&#xAD;ment of the limits of knowl&#xAD;edge: how effec&#xAD;tive are other PLA branches in the con&#xAD;di&#xAD;tions of mod&#xAD;ern war? The navy, the rocket forces, the air force — all have un&#xAD;der&#xAD;gone large-scale mod&#xAD;erni&#xAD;sa&#xAD;tion and reg&#xAD;u&#xAD;larly demon&#xAD;strate im&#xAD;pres&#xAD;sive tech&#xAD;ni&#xAD;cal ca&#xAD;pa&#xAD;bil&#xAD;ities in ex&#xAD;er&#xAD;cises. But the de&#xAD;clared power of these branches has never been tested un&#xAD;der the real pres&#xAD;sure of com&#xAD;bat op&#xAD;po&#xAD;si&#xAD;tion. Deng Xiaop&#xAD;ing, ex&#xAD;plain&#xAD;ing the ne&#xAD;ces&#xAD;sity of the 1979 in&#xAD;va&#xAD;sion of Viet&#xAD;nam, par&#xAD;tially used pre&#xAD;cisely this ar&#xAD;gu&#xAD;ment: the army needs com&#xAD;bat ex&#xAD;pe&#xAD;rience, and with&#xAD;out it even well-equipped forces re&#xAD;main the&#xAD;o&#xAD;ret&#xAD;i&#xAD;cally un&#xAD;ver&#xAD;ified. The re&#xAD;sult of that war was a com&#xAD;plete con&#xAD;fir&#xAD;ma&#xAD;tion of the the&#xAD;sis — only with the op&#xAD;po&#xAD;site sign: the PLA performed sig&#xAD;nifi&#xAD;cantly worse than its de&#xAD;clared ca&#xAD;pa&#xAD;bil&#xAD;ities sug&#xAD;gested. There is no ba&#xAD;sis for be&#xAD;liev&#xAD;ing this les&#xAD;son has be&#xAD;come ob&#xAD;so&#xAD;lete.&lt;/p&gt;&lt;p&gt;This pro&#xAD;duces a closed loop. 
The longer China re&#xAD;frains from ac&#xAD;tive mil&#xAD;i&#xAD;tary en&#xAD;gage&#xAD;ment, the longer the PLA’s rep&#xAD;u&#xAD;ta&#xAD;tion as one of the world’s strongest mil&#xAD;i&#xAD;taries is pre&#xAD;served — a rep&#xAD;u&#xAD;ta&#xAD;tion based on tech&#xAD;ni&#xAD;cal char&#xAD;ac&#xAD;ter&#xAD;is&#xAD;tics, bud&#xAD;gets, and plat&#xAD;form counts, not on com&#xAD;bat re&#xAD;sults. But si&#xAD;mul&#xAD;ta&#xAD;neously, the more the gap ac&#xAD;cu&#xAD;mu&#xAD;lates rel&#xAD;a&#xAD;tive to armies that ac&#xAD;tu&#xAD;ally fight: the United States in Iran, Rus&#xAD;sia in Ukraine, Euro&#xAD;peans — if par&#xAD;tially — through the Ukrainian ex&#xAD;pe&#xAD;rience of their sys&#xAD;tems. Not fight&#xAD;ing pre&#xAD;serves the rep&#xAD;u&#xAD;ta&#xAD;tion and avoids the costs. It also pre&#xAD;serves fun&#xAD;da&#xAD;men&#xAD;tal un&#xAD;cer&#xAD;tainty about ac&#xAD;tual ca&#xAD;pa&#xAD;bil&#xAD;ities.&lt;/p&gt;&lt;p&gt;The rea&#xAD;sons for this re&#xAD;straint are not ide&#xAD;olog&#xAD;i&#xAD;cal but en&#xAD;tirely prag&#xAD;matic. The eco&#xAD;nomic pri&#xAD;or&#xAD;ity: China in the 2020s is above all a trad&#xAD;ing power with an econ&#xAD;omy deeply in&#xAD;te&#xAD;grated into global sup&#xAD;ply chains. Mili&#xAD;tary con&#xAD;flict cre&#xAD;ates risks of sanc&#xAD;tions, trade dis&#xAD;rup&#xAD;tion, and tech&#xAD;nolog&#xAD;i&#xAD;cal iso&#xAD;la&#xAD;tion — costs in&#xAD;com&#xAD;men&#xAD;su&#xAD;rable with the po&#xAD;ten&#xAD;tial gains of most con&#xAD;ceiv&#xAD;able op&#xAD;er&#xAD;a&#xAD;tions. The diplo&#xAD;matic pri&#xAD;or&#xAD;ity: China is ac&#xAD;tively build&#xAD;ing an image as a re&#xAD;spon&#xAD;si&#xAD;ble me&#xAD;di&#xAD;at&#xAD;ing power — the 2023 Saudi-Ira&#xAD;nian nor&#xAD;mal&#xAD;i&#xAD;sa&#xAD;tion was the sym&#xAD;bolic achieve&#xAD;ment of this strat&#xAD;egy. Mili&#xAD;tary in&#xAD;ter&#xAD;ven&#xAD;tion un&#xAD;der&#xAD;mines that image. 
The rep&#xAD;u&#xAD;ta&#xAD;tional risk: the fear of be&#xAD;com&#xAD;ing bogged down in a war on the model of Rus&#xAD;sia in Ukraine or the United States in Iran. The ab&#xAD;sence of in&#xAD;sti&#xAD;tu&#xAD;tion&#xAD;al&#xAD;ised mil&#xAD;i&#xAD;tary en&#xAD;gage&#xAD;ment on the in&#xAD;ter&#xAD;na&#xAD;tional stage: un&#xAD;like the United States, China has no es&#xAD;tab&#xAD;lished sys&#xAD;tem of al&#xAD;lies ready to share mil&#xAD;i&#xAD;tary costs.&lt;/p&gt;&lt;p&gt;The con&#xAD;se&#xAD;quence of all this is the ne&#xAD;ces&#xAD;sity of adapt&#xAD;ing a mil&#xAD;i&#xAD;tary sys&#xAD;tem by draw&#xAD;ing largely on the ex&#xAD;pe&#xAD;rience of other coun&#xAD;tries’ wars. Ob&#xAD;serv&#xAD;ing Ukraine through the Rus&#xAD;sian-Chi&#xAD;nese chan&#xAD;nel, analysing Amer&#xAD;i&#xAD;can op&#xAD;er&#xAD;a&#xAD;tions in Iran through MizarVi&#xAD;sion and the trilat&#xAD;eral agree&#xAD;ment — all of these are sur&#xAD;ro&#xAD;gates for di&#xAD;rect com&#xAD;bat ex&#xAD;pe&#xAD;rience. Valuable sur&#xAD;ro&#xAD;gates, but in&#xAD;com&#xAD;plete ones: an&#xAD;other coun&#xAD;try’s ex&#xAD;pe&#xAD;rience always car&#xAD;ries the prob&#xAD;lem of ap&#xAD;pli&#xAD;ca&#xAD;bil&#xAD;ity. What works in Ukrainian steppe con&#xAD;di&#xAD;tions against Soviet-era doc&#xAD;trines and Western weaponry may work in fun&#xAD;da&#xAD;men&#xAD;tally differ&#xAD;ent ways in a Taiwan sce&#xAD;nario with its mar&#xAD;i&#xAD;time com&#xAD;po&#xAD;nent, air-defence-sat&#xAD;u&#xAD;rated en&#xAD;vi&#xAD;ron&#xAD;ment, and Amer&#xAD;i&#xAD;can for&#xAD;ward pres&#xAD;ence.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;VI. Com&#xAD;pen&#xAD;satory Strategies&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The pri&#xAD;mary at&#xAD;tempt to re&#xAD;solve the con&#xAD;flict be&#xAD;tween the PLA’s de&#xAD;clared power and the real ab&#xAD;sence of com&#xAD;bat ex&#xAD;pe&#xAD;rience is in&#xAD;di&#xAD;rect sup&#xAD;port for Rus&#xAD;sia. 
According to Ukrainian intelligence, Chinese factories and companies supply Russia with hardware and AI software for adapting unmanned systems. In 2025, Russia used Chinese components to produce up to &lt;a href=&quot;https://oe.t2com.army.mil/product/russia-benefitting-in-ukraine-war-from-ai-collaboration-with-u-s-adversaries/&quot;&gt;2 million small tactical UAVs&lt;/a&gt;. For China, this is neither charity nor ideological solidarity; it is paid observation of how technologies function under real combat conditions. By supplying components and software, China receives data on how those components perform under load, what fails and why, and how the adversary adapts to specific technical solutions.&lt;/p&gt;&lt;p&gt;The trilateral strategic pact between China, Russia, and Iran, signed on 29 January 2026, formalises what was previously informal cooperation. The pact is not a mutual defence treaty: it provides diplomatic cover, intelligence cooperation, economic resilience, and &lt;a href=&quot;https://www.hstoday.us/subject-matter-areas/counterterrorism/iran-responds-to-operation-epic-fury-with-layered-military-cyber-and-proxy-strategy-amid-escalation-constraints/&quot;&gt;technological support&lt;/a&gt;. For China, it creates a legitimate framework for receiving intelligence data on US operations in Iran without being a party to the conflict. 
This maximises the information gain while minimising direct costs.&lt;/p&gt;&lt;p&gt;Within this strategy, MizarVision has become the most publicly documented example of “observation through the private sector.” The company, founded in 2021 and holding a Chinese national military standard certificate, uses AI to catalogue activity at American bases in the Middle East, track fleet movements, and locate aircraft and missile defence systems. Its data provided detailed coverage of the build-up of American forces ahead of Operation Epic Fury, &lt;a href=&quot;https://www.kyivpost.com/post/73270&quot;&gt;including the transit of the aircraft carriers USS Gerald R. Ford and USS Abraham Lincoln&lt;/a&gt;. Formally, the company is private and receives no state assignments. In practice, it performs a state intelligence task: collecting data on the real-world application of American military systems, which can subsequently be used to train Chinese targeting and planning systems. 
This is the model of “deniable observation”: the state receives the intelligence product while maintaining plausible deniability of direct involvement.&lt;/p&gt;&lt;p&gt;In parallel, China is betting on the proposition that technological superiority can compensate for the deficit of combat experience. The logic is clear: if a system can process data faster than a human, make decisions in milliseconds, and coordinate thousands of units in a swarm operation, an experienced commander is not needed; a good algorithm is sufficient. China has turned to AI as a substitute for direct combat experience, developing high-fidelity wargaming platforms, &lt;a href=&quot;https://www.fpri.org/article/2025/03/ai-dependence-and-political-blind-spots-undermine-beijings-war-strategy/&quot;&gt;predictive modelling, and algorithm-driven war planning&lt;/a&gt;. The “War Skull” system in its second generation adapts modular strategies to different adversaries. The “Aiwu LLM+” AI system integrates language models and multimodal data analysis &lt;a href=&quot;https://www.defenseone.com/threats/2025/03/new-products-show-chinas-quest-automate-battle/403387/&quot;&gt;for support of command information systems&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The aggregate assessment of the compensatory strategies is as follows. 
All of them work at the level of an&#xAD;a&#xAD;lyt&#xAD;i&#xAD;cal knowl&#xAD;edge — China gen&#xAD;uinely un&#xAD;der&#xAD;stands mod&#xAD;ern war&#xAD;fare sig&#xAD;nifi&#xAD;cantly bet&#xAD;ter than it did in 2019. None of them is the equiv&#xAD;a&#xAD;lent of di&#xAD;rect com&#xAD;bat ex&#xAD;pe&#xAD;rience in the op&#xAD;er&#xAD;a&#xAD;tional sense. Si&#xAD;mu&#xAD;la&#xAD;tions are limited by the qual&#xAD;ity of the data on which they are trained. Ob&#xAD;serv&#xAD;ing Ukraine pro&#xAD;vides les&#xAD;sons from a spe&#xAD;cific the&#xAD;atre of op&#xAD;er&#xAD;a&#xAD;tions that may trans&#xAD;fer poorly to a Taiwan sce&#xAD;nario. The Ira&#xAD;nian con&#xAD;flict shows the ap&#xAD;pli&#xAD;ca&#xAD;tion of Amer&#xAD;i&#xAD;can AI sys&#xAD;tems against a state ad&#xAD;ver&#xAD;sary with sig&#xAD;nifi&#xAD;cantly weaker ca&#xAD;pa&#xAD;bil&#xAD;ities than the PLA. A fun&#xAD;da&#xAD;men&#xAD;tal un&#xAD;cer&#xAD;tainty re&#xAD;mains: how will sys&#xAD;tems tested only in simu&#xAD;la&#xAD;tions perform when they en&#xAD;counter a real ad&#xAD;ver&#xAD;sary ca&#xAD;pa&#xAD;ble of un&#xAD;pre&#xAD;dictable adap&#xAD;ta&#xAD;tion? China does not have an an&#xAD;swer to this ques&#xAD;tion. And that is pre&#xAD;cisely what makes its strate&#xAD;gic po&#xAD;si&#xAD;tion in mil&#xAD;i&#xAD;tary AI si&#xAD;mul&#xAD;ta&#xAD;neously am&#xAD;bi&#xAD;tious and fun&#xAD;da&#xAD;men&#xAD;tally un&#xAD;ver&#xAD;ified.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;VII. Strate&#xAD;gic Assessment&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The Jus&#xAD;tice Mis&#xAD;sion 2025 ex&#xAD;er&#xAD;cises, con&#xAD;ducted on 29–30 De&#xAD;cem&#xAD;ber 2025, illus&#xAD;trate the strat&#xAD;egy of “over&#xAD;hang&#xAD;ing threat” in its op&#xAD;er&#xAD;a&#xAD;tional em&#xAD;bod&#xAD;i&#xAD;ment. 
The drills covered a larger zone around Taiwan than any of the six previous major exercises since 2022, and for the first time explicitly designated deterrence of external intervention &lt;a href=&quot;https://thediplomat.com/2026/01/chinas-taiwan-drills-are-crossing-a-new-line/&quot;&gt;as a publicly stated objective&lt;/a&gt;. They involved 130 aircraft sorties, 14 warships, live rocket launches into waters north and southwest of the island, and simulated blockades of the ports of Keelung and Kaohsiung. Crucially, the Pentagon received no advance warning of the exercises, and Trump publicly expressed no particular concern. This itself is part of the message: China demonstrates its capacity to create a naval crisis without escalating to a level requiring an American response.&lt;/p&gt;&lt;p&gt;The current model, in all likelihood, suits the Chinese government, and there are no signs it intends to depart from it. Continued build-up of military capabilities and AI technologies without real combat deployment creates maximum diplomatic leverage at minimum direct cost. High-technology defence is used as an instrument of pressure: on Taiwan, on South China Sea states, on Western partners in trade and technology negotiations. 
Pre&#xAD;serv&#xAD;ing mil&#xAD;i&#xAD;tary non-en&#xAD;gage&#xAD;ment si&#xAD;mul&#xAD;ta&#xAD;neously pro&#xAD;tects the PLA’s rep&#xAD;u&#xAD;ta&#xAD;tion from the test of re&#xAD;al&#xAD;ity.&lt;/p&gt;&lt;p&gt;This is the cen&#xAD;tral strate&#xAD;gic para&#xAD;dox of the Chi&#xAD;nese model of mil&#xAD;i&#xAD;tary AI. Non-demon&#xAD;stra&#xAD;tion of real ca&#xAD;pa&#xAD;bil&#xAD;ities is si&#xAD;mul&#xAD;ta&#xAD;neously its great&#xAD;est strate&#xAD;gic ad&#xAD;van&#xAD;tage and its great&#xAD;est struc&#xAD;tural weak&#xAD;ness. The ad&#xAD;van&#xAD;tage: the ad&#xAD;ver&#xAD;sary is forced to plan against the worst-case sce&#xAD;nario with&#xAD;out data to re&#xAD;fute it. The PLA re&#xAD;mains an “un&#xAD;known quan&#xAD;tity” — and in strate&#xAD;gic plan&#xAD;ning, un&#xAD;cer&#xAD;tainty func&#xAD;tions as a threat mul&#xAD;ti&#xAD;plier. The weak&#xAD;ness: that same un&#xAD;cer&#xAD;tainty works in both di&#xAD;rec&#xAD;tions. China it&#xAD;self does not know how its sys&#xAD;tems will hold un&#xAD;der real pres&#xAD;sure — and can&#xAD;not find out with&#xAD;out ac&#xAD;ti&#xAD;vat&#xAD;ing pre&#xAD;cisely the costs it is seek&#xAD;ing to avoid. A sys&#xAD;tem that has never been tested in com&#xAD;bat may prove ei&#xAD;ther sig&#xAD;nifi&#xAD;cantly stronger or sig&#xAD;nifi&#xAD;cantly weaker than it ap&#xAD;pears in ex&#xAD;er&#xAD;cises. Un&#xAD;til the first real en&#xAD;gage&#xAD;ment, both pos&#xAD;si&#xAD;bil&#xAD;ities are equally plau&#xAD;si&#xAD;ble.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The Chi&#xAD;nese model is the most in&#xAD;ter&#xAD;nally con&#xAD;sis&#xAD;tent of the four ex&#xAD;am&#xAD;ined — and the most opaque. The doc&#xAD;trine is clear, the in&#xAD;sti&#xAD;tu&#xAD;tional ar&#xAD;chi&#xAD;tec&#xAD;ture is co&#xAD;her&#xAD;ent, the in&#xAD;vest&#xAD;ment is sub&#xAD;stan&#xAD;tial, the de&#xAD;clared po&#xAD;si&#xAD;tion is care&#xAD;fully cal&#xAD;ibrated. 
What is ab&#xAD;sent is any ex&#xAD;ter&#xAD;nal ver&#xAD;ifi&#xAD;ca&#xAD;tion mechanism, any in&#xAD;de&#xAD;pen&#xAD;dent ac&#xAD;tor ca&#xAD;pa&#xAD;ble of hold&#xAD;ing the sys&#xAD;tem to its own stated com&#xAD;mit&#xAD;ments, and any real-world test of whether the ca&#xAD;pa&#xAD;bil&#xAD;ities are what they claim to be.&lt;/p&gt;&lt;p&gt;The de&#xAD;clared sup&#xAD;port for a LAWS ban “when con&#xAD;di&#xAD;tions are ripe” and the si&#xAD;mul&#xAD;ta&#xAD;neous de&#xAD;vel&#xAD;op&#xAD;ment of sys&#xAD;tems that would not meet most pro&#xAD;posed ban crite&#xAD;ria illus&#xAD;trate the core dy&#xAD;namic: China has con&#xAD;structed a po&#xAD;si&#xAD;tion that is diplo&#xAD;mat&#xAD;i&#xAD;cally defen&#xAD;si&#xAD;ble, op&#xAD;er&#xAD;a&#xAD;tionally un&#xAD;con&#xAD;strained, and struc&#xAD;turally durable for as long as it avoids a ma&#xAD;jor con&#xAD;flict. Whether the Taiwan sce&#xAD;nario, if it ever ma&#xAD;te&#xAD;ri&#xAD;al&#xAD;ises, would val&#xAD;i&#xAD;date or dev&#xAD;as&#xAD;tate that po&#xAD;si&#xAD;tion is the ques&#xAD;tion that the Chi&#xAD;nese mil&#xAD;i&#xAD;tary it&#xAD;self can&#xAD;not an&#xAD;swer from its cur&#xAD;rent van&#xAD;tage point.&lt;/p&gt;&lt;p&gt;The next part turns to the United States — the ac&#xAD;tor with the deep&#xAD;est op&#xAD;er&#xAD;a&#xAD;tional AI in&#xAD;te&#xAD;gra&#xAD;tion, the most doc&#xAD;u&#xAD;mented cor&#xAD;po&#xAD;rate-state con&#xAD;flicts over red lines, and the first large-scale com&#xAD;bat test of AI tar&#xAD;get&#xAD;ing sys&#xAD;tems against a state ad&#xAD;ver&#xAD;sary.&lt;/p&gt;</description>
            <author>Slava Kold (Viacheslav Kolodiazhnyi)</author>
            <guid>dj4guht9a4mXu4ijG</guid>
            <pubDate>Thu, 09 Apr 2026 11:57:27 +0000</pubDate>
        </item>
        <item>
            <title>Research Associate Opportunity—Giving What We Can by Giving What We Can🔸</title>
            <link>https://forum.nunosempere.com/posts/7aXWMgwpBc6xcjLa7/research-associate-opportunity-giving-what-we-can</link>
            <description>&lt;p&gt;Help us di&#xAD;rect $80M — and even&#xAD;tu&#xAD;ally $3B — in an&#xAD;nual dona&#xAD;tions to the world’s most effec&#xAD;tive char&#xAD;i&#xAD;ties. We’re look&#xAD;ing for a Re&#xAD;search As&#xAD;so&#xAD;ci&#xAD;ate to join GWWC’s re&#xAD;search team, stress-test&#xAD;ing our eval&#xAD;u&#xAD;a&#xAD;tions and en&#xAD;sur&#xAD;ing our giv&#xAD;ing recom&#xAD;men&#xAD;da&#xAD;tions meet the high&#xAD;est stan&#xAD;dards of qual&#xAD;ity and in&#xAD;tegrity.&lt;/p&gt;&lt;p&gt;Giv&#xAD;ing What We Can is on a mis&#xAD;sion to make effec&#xAD;tive and sig&#xAD;nifi&#xAD;cant giv&#xAD;ing a cul&#xAD;tural norm — work&#xAD;ing to&#xAD;wards a world with&#xAD;out ex&#xAD;treme poverty, an&#xAD;i&#xAD;mal suffer&#xAD;ing, and ex&#xAD;is&#xAD;ten&#xAD;tial risk. Our goal: 1 mil&#xAD;lion pledgers giv&#xAD;ing $3B/​year to the high&#xAD;est-im&#xAD;pact or&#xAD;gani&#xAD;sa&#xAD;tions — and this role is cru&#xAD;cial to get&#xAD;ting there.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ideal can&#xAD;di&#xAD;date:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Strong an&#xAD;a&#xAD;lyt&#xAD;i&#xAD;cal and con&#xAD;cep&#xAD;tual think&#xAD;ing — com&#xAD;fortable weigh&#xAD;ing un&#xAD;cer&#xAD;tain, con&#xAD;flict&#xAD;ing evidence&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Reaches con&#xAD;sid&#xAD;ered views and com&#xAD;mu&#xAD;ni&#xAD;cates rea&#xAD;son&#xAD;ing clearly&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Metic&#xAD;u&#xAD;lous, strate&#xAD;gi&#xAD;cally pri&#xAD;ori&#xAD;tises, and cares about epistemic integrity&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;strong&gt;And ideally some of the fol&#xAD;low&#xAD;ing:&lt;/strong&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Fa&#xAD;mil&#xAD;iar&#xAD;ity with effec&#xAD;tive giv&#xAD;ing con&#xAD;cepts (cost-effec&#xAD;tive&#xAD;ness, coun&#xAD;ter&#xAD;fac&#xAD;tu&#xAD;al&#xAD;ity, im&#xAD;pact mod&#xAD;els)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Aca&#xAD;demic or pro&#xAD;fes&#xAD;sional re&#xAD;search /​ 
ev&#xAD;i&#xAD;dence syn&#xAD;the&#xAD;sis experience&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;M&amp;amp;E or im&#xAD;pact as&#xAD;sess&#xAD;ment background&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Data skills (SQL, R, Python)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Knowl&#xAD;edge of GWWC’s cause ar&#xAD;eas (global health, an&#xAD;i&#xAD;mal welfare, ex&#xAD;is&#xAD;ten&#xAD;tial risk)&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;🌍 Re&#xAD;mote, global&lt;br&gt;📅 Dead&#xAD;line: April 27th &lt;br&gt;👉 Read more and ap&#xAD;ply: &lt;a href=&quot;https://www.givingwhatwecan.org/research-associate&quot; class=&quot;bare-url&quot;&gt;https://​​www.giv&#xAD;ing&#xAD;whatwe&#xAD;can.org/​​re&#xAD;search-associate&lt;/a&gt;&lt;/p&gt;</description>
            <author>Giving What We Can🔸</author>
            <guid>7aXWMgwpBc6xcjLa7</guid>
            <pubDate>Thu, 09 Apr 2026 11:03:12 +0000</pubDate>
        </item>
        <item>
            <title>When Local Terror Reveals Global Risk: Biosecurity Lessons from Jos by Nnaemeka Emmanuel Nnadi</title>
            <link>https://forum.nunosempere.com/posts/LySFstvEZfymbPeyr/when-local-terror-reveals-global-risk-biosecurity-lessons</link>
            <description>&lt;h3&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;A recent terror attack in Jos highlights deeper failures in state capacity, specifically the inability to detect, track, and regulate sophisticated threats. These same institutional weaknesses extend to biosecurity: Nigeria lacks robust oversight of high-risk research, gene synthesis, and emerging biotechnologies. As AI lowers the barriers to biological innovation, the combination of &lt;strong&gt;low governance + rising capability&lt;/strong&gt; creates a credible global risk. Biosecurity is only as strong as its weakest links, and without serious investment in surveillance, regulation, and field-building in the global South, localized vulnerabilities could scale into global catastrophic threats.&lt;/p&gt;&lt;h3&gt;Beyond the Immediate Event&lt;/h3&gt;&lt;p&gt;On Palm Sunday, something happened in Jos that left me afraid and with many questions about biosecurity. A terror attack happened in my &lt;a href=&quot;https://religionunplugged.com/news/2026/4/2/palm-sunday-attacks-in-nigeria-christian-areas&quot;&gt;community&lt;/a&gt;. 
This is not the first time Jos, Plateau State, Nigeria, has experienced a terror attack; you can find a chronology of attacks &lt;a href=&quot;https://plateaupeacebuilding.org/View%20timelines.php?Page=2&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;What made this particular attack different was the manner in which it occurred, and the fact that to date no arrests have been made. This shows the government’s weak capability to track and hunt down bad actors in Nigeria. &lt;a href=&quot;https://issafrica.org/iss-today/lake-chad-basin-insurgents-raise-the-stakes-with-weaponised-drones&quot;&gt;Drones&lt;/a&gt; and other sophisticated weapons are increasingly used in their activities. These attacks have increasingly been linked to international &lt;a href=&quot;https://issafrica.org/iss-today/lake-chad-basin-states-can-sever-terrorism-s-lifeline-its-financing&quot;&gt;collaborations&lt;/a&gt;, suggesting that actors are not only evolving tactically but may also be benefiting from cross-border networks and knowledge transfer. What appears local may, in fact, be embedded in broader global systems: a problem that seems confined to Jos or Nigeria might actually be a small part of a global agenda. The same institutional weaknesses that allow violent actors to evade detection could also enable the misuse of biological tools. 
This raises an un&#xAD;der&#xAD;ex&#xAD;plored ques&#xAD;tion: What hap&#xAD;pens when the same in&#xAD;sti&#xAD;tu&#xAD;tional weak&#xAD;nesses in&#xAD;ter&#xAD;sect with rapidly ad&#xAD;vanc&#xAD;ing biolog&#xAD;i&#xAD;cal and tech&#xAD;nolog&#xAD;i&#xAD;cal ca&#xAD;pa&#xAD;bil&#xAD;ities?&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Why this mat&#xAD;ters globally&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;As a micro&#xAD;biol&#xAD;o&#xAD;gist work&#xAD;ing in Nige&#xAD;ria with in&#xAD;ter&#xAD;ests in biose&#xAD;cu&#xAD;rity and AI, I see a con&#xAD;cern&#xAD;ing gap. Nige&#xAD;ria, and many coun&#xAD;tries in the global South, cur&#xAD;rently lack the fol&#xAD;low&#xAD;ing:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Ro&#xAD;bust &lt;strong&gt;biose&#xAD;cu&#xAD;rity gov&#xAD;er&#xAD;nance frameworks&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Effec&#xAD;tive &lt;strong&gt;mon&#xAD;i&#xAD;tor&#xAD;ing of high-risk re&#xAD;search activities&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Reg&#xAD;u&#xAD;la&#xAD;tory over&#xAD;sight for &lt;strong&gt;emerg&#xAD;ing biotech&#xAD;nolo&#xAD;gies&lt;/strong&gt;, in&#xAD;clud&#xAD;ing gene syn&#xAD;the&#xAD;sis.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;There is a huge gap in poli&#xAD;cies and efforts that can keep bad ac&#xAD;tors from de&#xAD;vel&#xAD;op&#xAD;ing and de&#xAD;ploy&#xAD;ing harm&#xAD;ful tech&#xAD;nolo&#xAD;gies in Nige&#xAD;ria. Nige&#xAD;ria is vuln&#xAD;er&#xAD;a&#xAD;ble to bioter&#xAD;ror&#xAD;ism ow&#xAD;ing to a lack of de&#xAD;tec&#xAD;tion &lt;a href=&quot;https://doaj.org/article/21a420e95db54a3d8f8559852f103ab4&quot;&gt;mechanisms&lt;/a&gt;. Also, our lab&#xAD;o&#xAD;ra&#xAD;to&#xAD;ries do not have suffi&#xAD;cient reg&#xAD;u&#xAD;la&#xAD;tions as to what kinds of re&#xAD;search are al&#xAD;lowed in the lab. 
For example, the US Congress has a policy on &lt;a href=&quot;https://www.congress.gov/crs-product/R47114&quot;&gt;gain-of-function research in labs&lt;/a&gt;. In Nigeria, however, our labs are not regulated; anyone can fund you to do any research. In practice, this means that research direction can be influenced with minimal scrutiny, and detection mechanisms for misuse are weak.&lt;/p&gt;&lt;p&gt;Advances in AI are lowering the barriers to biological design, experimental planning, and knowledge acquisition. This combination, low oversight + increasing capability, creates a non-trivial risk environment. While many biosecurity efforts are concentrated in the global North, frameworks in the global South are weak and do not support global efforts.&lt;/p&gt;&lt;p&gt;Global catastrophic biological risks are not constrained by geography. Risk is determined not by the strongest systems but by the &lt;strong&gt;weakest regulatory and surveillance environments&lt;/strong&gt;. A failure in one region can propagate globally. Yet most biosecurity investments, talent pipelines, and governance frameworks remain concentrated in the global North.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;What I think needs to be done:&lt;/strong&gt;&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;Increase field-building efforts to bring more people into the field of biosecurity in the global South. 
Focusing these efforts on university students would be an impactful pathway.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Increase disease surveillance efforts. Metagenomics efforts can be ramped up.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Increase countermeasure efforts such as far-UVC, PPE, stockpiling of antivirals, and diagnostic testing.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Increase policies that can regulate research in our universities.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Global catastrophic biological risks are only as contained as the weakest regulatory environments. We must work together to strengthen our weakest links as we push forward to build a sustainable future.&lt;/p&gt;&lt;h3&gt;Closing insight&lt;/h3&gt;&lt;p&gt;If advanced biological risks are to be managed effectively, &lt;strong&gt;global coordination must include meaningful capacity-building in regions currently under-regulated. &lt;/strong&gt;Otherwise, we risk building highly secure systems in some parts of the world while leaving others structurally exposed. 
And in a do&#xAD;main like biose&#xAD;cu&#xAD;rity, &lt;strong&gt;ex&#xAD;po&#xAD;sure any&#xAD;where is ex&#xAD;po&#xAD;sure ev&#xAD;ery&#xAD;where.&lt;/strong&gt;&lt;/p&gt;</description>
            <author>Nnaemeka Emmanuel Nnadi</author>
            <guid>LySFstvEZfymbPeyr</guid>
            <pubDate>Thu, 09 Apr 2026 01:47:20 +0000</pubDate>
        </item>
    </channel>
</rss>