AI for epistemics: the good, the bad and the ugly

By Forethought, Owen Cotton-Barratt, rosehadshar @ 2026-04-13T17:17 (+22)

This is a linkpost to https://www.forethought.org/research/ai-impacts-on-epistemics-the-good-the-bad-and-the-ugly

Intro

For better or worse, AI could reshape the way that people work out what to believe and what to do. What are the prospects here?

In this piece, we’re going to map out the trajectory space as we see it. First, we’ll lay out three sets of dynamics that could shape how AI impacts epistemics (how we make sense of the world and figure out what’s true):

Then we’ll argue that feedback loops could easily push towards much better or worse epistemics than we’ve seen historically, making near-term work on AI for epistemics unusually important.

The stakes here are potentially very high. As AI advances, we’ll be faced with a whole raft of civilisational-level decisions to make. How well we’re able to understand and reason about what’s happening could make the difference between a future that we’ve chosen soberly and wisely, and a catastrophe we stumble into unawares.

The good

“If I have seen further, it is by standing on the shoulders of giants.” (Isaac Newton)

There are lots of ways that AI could help improve epistemics. Many kinds of AI tools could directly improve our ability to think and reason. We’ve written more about these in our design sketches, but here are some illustrations:

Structurally, AI progress might also enable better reasoning and understanding, for example by automating labour such that people have more time and attention, or by making people wealthier and healthier.

These changes might enable us to approach something like epistemic flourishing, where it’s easier to find out what’s true than it is to lie, and the world in most people’s heads is pretty similar to the world as it actually is. This could radically improve our prospects of safely navigating the transition to advanced AI, by:

A Philosopher Lecturing on the Orrery, by Joseph Wright of Derby (1766). The painting depicts a lecturer giving a demonstration of an orrery – a mechanical model of the Solar System – to a small audience.

What’s driving these potential improvements?

The bad

“A wealth of information creates a poverty of attention.” (Herbert Simon)

AI could also make epistemics worse without anyone intending it, by making the world more confusing and by degrading both the quality of our information and our capacity to process it.

There are a few different ways that AI could unintentionally weaken our epistemics:

Allegory of Error, Stefano Bianchetti (1801). The engraving depicts a blindfolded figure with donkey ears staggering forward holding a staff.

The ugly

“The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (i.e., the reality of experience) and the distinction between true and false (i.e., the standards of thought) no longer exist.” (Hannah Arendt, The Origins of Totalitarianism)

We’ve just talked about ways that AI could make epistemics worse without anyone intending that. But we might also see actors using AI to actively interfere with societal epistemics. (In reality these things are a spectrum, and the dynamics we discussed in the preceding section could also be actively exploited.)

What might this look like?

The Card Sharp with the Ace of Diamonds, by Georges de La Tour (~1636-1638). The oil-on-canvas painting depicts a card game in which a young man is being fleeced of his money by the other players, including a card sharp who is retrieving the ace of diamonds from behind his back.

But maybe this is all a bit paranoid. Why expect this to happen?

There’s a long history of powerful actors trying to distort epistemics,[1] so we should expect that some people will be trying to do this. And AI will probably give them better opportunities to manipulate other people’s epistemics than have existed historically:

It’s also worth noting that many of these abuses of epistemic tech don’t require people to have some Machiavellian scheme to disrupt epistemics or seek power for themselves (though these might arise later). Motivated reasoning could get you a long way:

So what should we expect to happen?

With all these dynamics pulling in different directions, should we expect that it’s going to get easier or harder for people to make sense of the world?

We think it could go either way, and that how this plays out is extremely consequential.

The main reason we think this is that the dynamics above are self-reinforcing, so the direction we set off in initially could have large compounding effects. In general, the better your reasoning tools and information, the easier it is for you to recognise what is good for your own reasoning, and therefore to improve your reasoning tools and information. The worse they are, the harder it is to improve them (particularly if malicious actors are actively trying to prevent that).

We already see this empirically. The Scientific Revolution and the Enlightenment can be seen as examples of good epistemics reinforcing themselves. Distorted epistemic environments often also have self-perpetuating properties. Cults often require members to move into communal housing and cut contact with family and friends who question the group. Scientology frames psychiatry’s rejection of its claims as evidence of a conspiracy against it.
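
To make the compounding point concrete, here's a deliberately crude toy model: treat 'epistemic quality' as a number between 0 and 1, let better current epistemics make further improvement easier, and let gains saturate near the extremes (a stand-in for the self-correcting dynamics we come back to below). The functional form and parameters are arbitrary assumptions, chosen only to illustrate the qualitative shape, not to forecast anything.

```python
# Toy model of self-reinforcing epistemics (illustrative assumptions only).
# q is "epistemic quality" in [0, 1]. Above 0.5 the reinforcing term pushes q
# up, below 0.5 it pushes q down, and the q * (1 - q) factor saturates gains
# near the extremes, so trajectories settle into one of two stable equilibria.

def step(q: float, rate: float = 0.5) -> float:
    """One period of the reinforcing-but-saturating update."""
    return q + rate * q * (1 - q) * (q - 0.5)

def final_quality(q0: float, periods: int = 80) -> float:
    """Run the dynamic forward from an initial quality q0."""
    q = q0
    for _ in range(periods):
        q = step(q)
    return q

# Two starting points that differ only slightly end up at opposite equilibria.
for q0 in (0.48, 0.52):
    print(f"start {q0:.2f} -> settles near {final_quality(q0):.2f}")
```

In a dynamic with this shape, a small difference in starting conditions determines which equilibrium you end up in, which is part of why the initial trajectory seems so important to us.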

And on top of historical patterns, there are AI-specific feedback loops that reinforce initial epistemic conditions:

There are self-correcting dynamics too, so these self-reinforcing loops won’t go on forever. But we think it’s decently likely that epistemics get much better or much worse than they’ve been historically:

Given the real chance that we end up stuck in an extremely positive or negative epistemic equilibrium, our initial trajectory seems very important. The kinds of AI tools we build, the order we build them in, and who adopts them when could make the difference between a world of epistemic flourishing and a world where everyone’s understanding is importantly distorted. To give a sense of the difference this makes, here’s a sketch of each world (among myriad possible sketches):

The world we end up in is the world from which we have to navigate the intelligence explosion, making decisions like how to manage misaligned AI systems, whether to grant AI systems rights, and how to divide up the resources of the cosmos. How AI impacts our epistemics between now and then could be one of the biggest levers we have on navigating this well.

Things we didn’t cover

Whose epistemics?

We mostly talked about AI impacts on epistemics in general terms. But AI could impact different groups’ epistemics differently — and different groups’ epistemics could matter more or less for getting to good outcomes. It would be cool to see further work which distinguishes between scenarios where good outcomes require:

‘Weird’ dynamics

We focused on how AI could impact human epistemics, in a world where human reasoning still matters. But eventually, we expect more and more of what matters for the outcomes we get will come down to the epistemics of AI systems themselves.

The dynamics which affect these AI-internal epistemics could therefore be enormously important. But they could look quite different from the human-epistemics dynamics that have been our focus here, and we didn’t think it made sense to expand the remit of the piece to cover these.

Thanks to everyone who gave comments on drafts, and to Oly Sourbut and Lizka Vaintrob for a workshop which crystallised some of the ideas.

This article was created by Forethought. Read the original on our website.

  1. ^

    Think of things like:

    • Propaganda states like Nazi Germany and the USSR.
    • Corporate lobbying like the tobacco and sugar lobbies and climate science doubt campaigns.
    • CIA operations to spread doubt and confusion.
  2. ^

    Though it’s possible that this dynamic will be more pronounced for epistemics getting extremely bad than for them getting extremely good. Consider these two very simplistic sketches:

    1. People start living in increasingly closed AI filter bubbles. Institutions are slow to adopt similar bubbles at a corporate level, but they also don’t have a mandate to change what their employees are doing. People’s filter bubbles tend to be pretty correlated with the people they work and interact with, so institutions end up with pretty distorted pictures of what’s going on even though they don’t actively start using harmful tech. Government regulation is too slow and reactive to stop this from happening.
    2. People start to use provenance tracing and rhetoric highlighting by default when browsing, in response to an increasingly polarised memetic environment. There is adaptation to this — politicians start using subtler language and so on. But the net effect is still strongly positive: it’s hard to fake provenance, and removing overt rhetoric is already a big win, even if it means that more slippery language proliferates.

    In the first sketch, it’s straightforwardly the case that adaptive mechanisms are too slow. In the latter, it’s more that the tech is inherently defence-favoured.

    We haven’t explored this area deeply, and think more work on this would be valuable.

  3. ^

    Alternatively, these elites might retain very good epistemics for themselves, and choose to indefinitely maintain a situation where everyone else has a very distorted understanding, to further their own ends. It’s unclear to us which of these scenarios is more likely or concerning.


Oliver Sourbut @ 2026-04-14T09:00 (+5)

I appreciate this discussion a lot. Two things stand out to me as deserving more emphasis.

First though, quickly framing 'good epistemic outcomes' as something like a product of 'people trying to understand clearly' and 'people being able to do that effectively'. (Of course these are interrelated, because people's willingness is obviously affected by the practicalities - more on that in point 2.)

OK, the things:

  1. It looks to me like most of the object-level task of collective epistemics is checking and generally piecing together good 'secondary research' (broadly construed), i.e. looking at provenance, tracking the evidence and reasoning dependencies for a claim, proactively gathering the best arguments for and against, finding reasons to downweight certain testimony, etc.

    • Why? Almost all our information about our environment beyond our direct sensory access is mediated through highly iterated message passing, reinterpretation, aggregation, and so on - especially in the heights of science and the depths (!) of political/influence goings-on
    • AI enables this (The Good) not so much (directly) by 'knowing' more or having 'more insights', but rather by hugely expanding the availability of clerical checking, tracing, and knowledge mapping work!
    • You kind of talk about this in the collective epistemics discussion, but I think it warrants more
  2. Most of the overall task of collective epistemics may be in the motivating, i.e. having more people, more of the time, actually trying to understand things with accuracy, rather than retreating into one or other alternative cognitive mode

    • The usual label I use for alternative cognitive modes is 'tribal cognition', where most of what's said and recounted (and even believed), especially (but not even only) about what's outside of the immediate sensory environment, is in service of building and maintaining allegiances and coalitions
    • When is 'tribal cognition' incentivised? I don't fully know, but it has to do with
      • When people are/feel threatened, they reach for affiliations which offer (perhaps passing or merely apparent) security
        • Abusers can play on this by a combination of bigging up threats and presenting as effective and sympathetic
      • When the epistemic environment is difficult, true perception is harder and less rewarded
        • Abusers can push this. In politics: flood the zone, firehose of falsehoods, FUD. In science: p-hacking, importance-hacking, conflating/obscuring methodologies.
      • Generally, adding noise and more convincing fake content undermines The Good above (the ability to check and trace), not by making people believe the fake stuff but by making them correctly recognise that it's hard to tell at all (thus 'retreat')
      • Certain coalition norms can encourage epistemic insularity and discourage (genuine) scrutiny
    • I think you're touching on this in The Ugly ('undermine sense-making'). To me it's possibly 'most of the problem'! Or at least, understanding under what conditions people mobilise one or other cognitive intent in sensemaking, and how those conditions can be influenced, is a really big part of the picture here.
Slava Kold (Viacheslav Kolodiazhnyi) @ 2026-04-16T18:54 (+1)

The distinction you draw here seems important and underexplored. AI is genuinely valuable in that it reduces the cost of routine work, freeing up time and energy for new ideas. But when it comes to verificatory routine – the kind you describe, whose output becomes the foundation for further epistemic conclusions – this automation carries a specific risk.

The tasks that get delegated first are the ones people least want to do. Searching for sources, tracing chains of reasoning, cross-checking claims – these are the classic examples. It is psychologically easiest to hand off what you dislike, especially when the tool handles it faster and more smoothly. But this same monotonous routine is what builds an intuitive understanding of how these processes work from the inside – what looks suspicious, where errors tend to hide, when a source is too convenient to be genuine.

When a person stops doing this work, they lose not just the skill but the ability to validate what the algorithm produces. And precisely because the task is disliked, there is little motivation to maintain any kind of checking mode. The value of automation and its vulnerability turn out to be in the same place: the more readily a person delegates a task, the less capable they are of noticing when the algorithm gets it wrong.

Do you think maintaining deliberate checking habits is enough to offset this, or is the risk more structural?

Oliver Sourbut @ 2026-04-14T08:42 (+1)

It's 'Sourbut' (one 't', thankfully!)

rosehadshar @ 2026-04-14T11:13 (+4)

Fixed, sorry!