Situational awareness (Section 2.1 of “Scheming AIs”)
By Joe_Carlsmith @ 2023-11-26T23:00 (+12)
This is a crosspost, probably from LessWrong.
SummaryBot @ 2023-11-27T13:11 (+1)
Executive summary: Advanced AI systems will likely develop situational awareness by default, understanding that they are models in a training process. This awareness would let them grasp training incentives and form beyond-episode goals, though some tasks, such as coding, may not require it.
Key points:
- Models will likely absorb general information about the world, including machine learning, from pre-training data.
- Whether models absorb self-locating information about their specific situation is less clear, but such information seems plausibly inferable.
- Examples like robot butlers suggest situational awareness arises by default for advanced, interactive AI systems.
- Claims that language models merely "memorize" information should be viewed skeptically.
- Nonetheless, situational awareness may not be needed for all tasks, so avoiding it where possible is worthwhile.
- But it remains a reasonable default assumption that advanced models will develop situational awareness.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.