Some quick thoughts on "AI is easy to control"

By MikhailSamin @ 2023-12-07T12:23 (+5)

This is a crosspost, probably from LessWrong; the original can be viewed there.

SummaryBot @ 2023-12-07T13:14 (+2)

Executive summary: The post "AI is easy to control" argues that controlling AI systems will be easy, but it overlooks key issues around aligning superintelligent systems.

Key points:

  1. The post misrepresents concerns about AI safety as just loss of control, while the core issue is misalignment.
  2. Evidence of controlling subhuman systems doesn't readily transfer to controlling superhuman AI.
  3. Optimization techniques shape behavior but don't necessarily instill human values as a goal.
  4. Techniques for manipulating subhuman systems likely won't work on superintelligent systems.
  5. Learning human values doesn't automatically make an AI adopt them as optimization targets.
  6. Current language models can't evaluate the plans of superintelligent systems.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.