Serena Ge (Datacurve) (@serenaa_ge) “Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On” — TopicDigg

Serena Ge (Datacurve)
@serenaa_ge
Joined June 2021
2.7K Following    10.7K Followers
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.