Some quick thoughts on "AI is easy to control"
By MikhailSamin @ 2023-12-07T12:23 (+5)
This is a crosspost, probably from LessWrong. Try viewing it there.
SummaryBot @ 2023-12-07T13:14 (+2)
Executive summary: The "AI is easy to control" post argues that controlling AI systems will be easy, but it misses key issues around aligning superintelligent systems.
Key points:
- The post misrepresents AI safety concerns as being solely about loss of control, when the core issue is misalignment.
- Evidence of controlling subhuman systems doesn't readily transfer to controlling superhuman AI.
- Optimization techniques shape behavior but don't necessarily instill human values as a goal.
- Techniques for manipulating subhuman systems likely won't work on superintelligent systems.
- Learning human values doesn't automatically make an AI adopt them as optimization targets.
- Current language models can't evaluate plans of superintelligent systems.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.