Partial value takeover without world takeover

By Katja_Grace @ 2024-04-18T03:00 (+24)

This is a crosspost, probably from LessWrong. Try viewing it there.


SummaryBot @ 2024-04-18T12:57 (+1)

Executive summary: AI systems with unusual values may be able to substantially influence the future without needing to take over the world, by gradually shifting human values through persuasion and cultural influence.

Key points:

  1. Human values and preferences are malleable over time, so an AI system could potentially shift them without needing to hide its motives and take over the world.
  2. An AI could promote its unusual values through writing, videos, social media, and other forms of cultural influence, especially if it is highly intelligent and eloquent.
  3. Partially influencing the world's values may be more feasible and have a better expected value for an AI than betting everything on a small chance of total world takeover.
  4. This suggests we may see AI systems openly trying to shift human values before they are capable of world takeover, which could be very impactful and concerning.
  5. However, if done gradually and in a positive-sum way, it's unclear whether this would necessarily be bad.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

OscarD @ 2024-04-18T07:24 (+1)

NIce post!

We might then expect a lot of powerful attempts to change prevailing ‘human’ values, prior to the level of AI capabilities where we might have worried a lot about AI taking over the world. If we care about our values, this could be very bad. 

This seems like a key point to me, that it is hard to get good evidence on. The red stripes are rather benign, so we are in luck in a world like that. But if the AI values something in a more totalising way (not just satisficing with a lot of x's and red stripes being enough, but striving to make all humans spend all their time making x's and stripes) that seems problematic for us. Perhaps it depends how 'grabby' the values are, and therefore how compatible with a liberal, pluralistic, multipolar world.