OpenAI (@OpenAI) “Frontier models can recognize when they are being tested, and their tendency to”

OpenAI

@OpenAI

OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity. We’re hiring:

加入 December 2015

4 正在关注 4.9M 粉丝

OpenAI@OpenAI

2025.09.17 17:09

Frontier models can recognize when they are being tested, and their tendency to scheme is influenced by this situational awareness. We demonstrated counterfactually that situational awareness in their chain-of-thought affects scheming rates: the more situationally aware a model is, the less it schemes, and vice versa. Moreover, both RL training and anti-scheming training increase levels of situational awareness.

显示更多