Jiayi Weng (@Trinkle23897)

MTS @openai, author of the entire post-training RL infra, core contributor of ChatGPT/GPT4/GPT4o etc. 30U30
181 Following    11.8K Followers
Very exciting to see the cool result! Same pattern, different physics: Codex-grown heuristics matching or beating DRL agents in fluid dynamics, while staying readable, maintainable, and transferable. Heuristics were not dead. They were under-maintained.
Codex grew programmatic policies with no neural nets: max score on Breakout, and SOTA-level scores on MuJoCo. Maybe heuristics were not too weak. Maybe they were just too expensive to maintain. Maybe it's the next paradigm.