lifcc (@mylifcc) “[Small models stun again! The 3B-parameter VibeThinker-3B approaches the frontier in math reasoning] Weibo AI's newly released VibeThi”

2026.06.16 14:51

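[Small models stun again! The 3B-parameter VibeThinker-3B approaches the frontier in math reasoning] Weibo AI just released VibeThinker-3B (based on Qwen2.5-Coder), which scores 94.3 on AIME26 (97.1 with +CLR), 80.2 on LiveCodeBench v6, and a 96.1% pass rate on recent LeetCode contests... These results beat a whole pile of much larger models. The core is the Spectrum-to-Signal post-training strategy: diversity distillation plus RL optimization, focused on verifiable reasoning rather than the old route of piling on parameters for general knowledge. Efficient small models like this are very appealing: they run locally, cost little, and reason well, which makes them especially suited to math- and algorithm-heavy subtasks. HuggingFace: The small-model era really is here; parameters aren't everything, and post-training is what counts. Do you think this 3B model can go straight into production?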


Francesco Bertolotti@f14bertolotti

2026.06.16 05:20

Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5-Coder. The paper doesn't provide many details, but it appears they distill from RL ckpts and then do a final instruct RL stage.
