Phoenix Yin (@Phoenixyin13) “💥 Human energy will be completely freed from trivial work! This Deli …, open-sourced by DeepSeek senior researcher Deli Chen”

2026.06.19 03:04

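💥 Human energy will be completely freed from trivial work! The Deli AutoResearch SKILL project, open-sourced by DeepSeek senior researcher Deli Chen, demonstrates a fully automated research loop and AI self-play, and it left me deeply impressed. By standardizing the logic of long-horizon tasks and setting meta-rules such as Anti-Loop and Heartbeat Watchdog, the large model itself can act as both compiler and executor. Programmers of the future will more often play the role of legislator and process architect. According to Deli, the AI, through self-play and without human intervention, autonomously planned GPU experiments and ran reinforcement learning with the GRPO algorithm, ultimately scoring 8.6/10 in a simulated peer review. This shows that AI is moving from learning existing human knowledge to exploring, through its own trial and error, the boundaries of what humans do not yet know. The efficiency of scientific research may be about to grow exponentially.

In recent months, many people have joked that today's AI agents can only handle a few simple steps, and that anything longer makes them lose track or fall into an infinite loop. Deli's solution has real reference value for industry. He brings concepts from traditional distributed systems and operating systems, such as watchdogs, persistence, and multi-role simulation, into the agent protocol. For an agent to do serious work, especially working continuously for 10 hours and iterating for 60 rounds, mature engineering safeguards are needed to counter the randomness and hallucinations of AI. The AI-native research path is not only feasible; it has already been run end to end.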
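To make the Anti-Loop and Heartbeat Watchdog ideas concrete, here is a minimal Python sketch of what such safeguards could look like. It is not taken from the Deli AutoResearch SKILL project; the class names `HeartbeatWatchdog` and `AntiLoopGuard`, their parameters, and the choice to detect loops by hashing action strings are all assumptions made purely for illustration.

```python
import hashlib
import time
from collections import deque


class HeartbeatWatchdog:
    """Hypothetical watchdog: flags a stall when no heartbeat arrives in time."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds allowed between heartbeats
        self.clock = clock          # injectable so the logic can be checked without sleeping
        self.last_beat = clock()

    def beat(self):
        # The agent would call this after every completed step.
        self.last_beat = self.clock()

    def is_stalled(self):
        # True when more than `timeout` seconds have passed since the last beat.
        return self.clock() - self.last_beat > self.timeout


class AntiLoopGuard:
    """Hypothetical guard: detects the same action repeating in a recent window."""

    def __init__(self, window=5, max_repeats=3):
        self.recent = deque(maxlen=window)   # digests of the most recent actions
        self.max_repeats = max_repeats

    def record(self, action):
        # Returns True once `action` has occurred `max_repeats` times in the window.
        digest = hashlib.sha256(action.encode("utf-8")).hexdigest()
        self.recent.append(digest)
        return self.recent.count(digest) >= self.max_repeats
```

In a setup like this, a supervisor would poll `is_stalled()` and restart the agent from persisted state, while `record()` returning `True` would be a signal to force a change of plan instead of retrying the same step.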


Deli Chen@victor207755822

2026.06.17 14:52

🧵 Deli AutoResearch SKILL is now officially open source! 🎉 Alongside it, we’re dropping our 4th survey paper — this time on Self-play. Inspired by AlphaZero, we got a powerful insight: prior knowledge doesn’t always lift the ceiling. Models can discover more globally optimal solutions just by playing against themselves. The biggest change in this paper? For the first time, the AutoResearch Agent autonomously planned GPU experiments — and submitted actual RL runs on the DeepSeek 285B model. The entire RL pipeline — experiment design, code writing, running, debugging, and conclusion summarization — was 100% automated, with zero human intervention from me. This was incredibly difficult, but an incredibly important step. GRPO is the tool being called by the AutoResearch Agent here. We see this as the beginning of our Continual Learning research journey. 🚀 As always, this is my personal research project, unaffiliated with any organization. All views are my own. #AI# #ReinforcementLearning# #SelfPlay# #OpenSource# #AutoML# #ContinualLearning# #DeepSeek#
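For readers unfamiliar with GRPO (Group Relative Policy Optimization): its core idea is to score each sampled response relative to the other responses drawn for the same prompt, rather than against a learned value baseline. The sketch below shows only that group-relative advantage step. The function name, the `eps` term, and the use of the population standard deviation are illustrative assumptions, and nothing here reflects how the AutoResearch Agent actually configured its runs.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    Assumption: population standard deviation plus a small `eps` to avoid
    division by zero when every response in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, two rewarded and two not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses with above-average reward get a positive advantage and the rest a negative one, so the policy update favors whatever beat its own siblings in the group.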
