Wei Cai 🎮 (@weicaiuw) “layered llm?” — TopicDigg

Wei Cai 🎮
@weicaiuw
css assistant professor @uw; decentralized computing (crypto + DeAI) researcher @SIGCHI @SIGMM @ACM_SIGWEB; indie game dev @HuluCatsGames
Joined April 2009
767 Following    3K Followers
layered llm?
In this paper, a 7B language model trained with reinforcement learning learns to orchestrate larger frontier models like GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro. It does so by writing natural-language subtasks, assigning each to one of the workers, and specifying which previous outputs that worker sees in context. The resulting system outperforms every individual frontier model on benchmarks including GPQA Diamond, LiveCodeBench, and AIME25, while averaging about three model calls per question—fewer than the multi-agent pipelines and self-reflection loops it beats. The work provides evidence that prompt engineering and pipeline design, currently done by hand in commercial AI products, can be learned end-to-end through reward signals alone. Read with an AI tutor: PDF: