
OpenAI (@OpenAI) “In this new research with @apolloaievals, we found behaviors consistent with sch” — TopicDigg

OpenAI
@OpenAI
OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity. We’re hiring:
Joined December 2015
4 Following    4.9M Followers
In this new research with @apolloaievals, we found behaviors consistent with scheming in controlled tests across frontier models, including OpenAI o3 and o4-mini, Gemini-2.5-pro, and Claude Opus-4. We can significantly reduce scheming by training models to reason explicitly, using an extension to the Model Spec that prohibits scheming. That method is called deliberative alignment. With this technique, we can reduce covert actions by 30x for o3. However, situational awareness complicates results. Model spec: