Andrej Karpathy (@karpathy) “The non-obvious crux of the shift is an empirical finding, emergent only at scal” — TopicDigg

Andrej Karpathy
@karpathy
I like training large deep neural nets.
Joined April 2009
1.1K Following    3M Followers
The non-obvious crux of the shift is an empirical finding, emergent only at scale, and well-articulated in the GPT-3 paper. Basically, Transformers demonstrate the ability of "in-context" learning. At run-time, in the activations. No weight updates.
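
To make "at run-time, in the activations, no weight updates" concrete, here is a toy sketch. It is not a Transformer and not anything from the GPT-3 paper: it is a single softmax-attention-style lookup over example pairs placed in the context, which is only an analogy for in-context learning. The names `attention_predict` and `temperature` are hypothetical, chosen for illustration.

```python
import math

def attention_predict(context, query, temperature=0.1):
    """Predict a label for `query` using only the (x, y) pairs in `context`.

    The only "parameter" is the fixed temperature; nothing is trained or
    updated. All task-specific information comes from the context itself.
    """
    # Similarity scores: closer context inputs get higher scores.
    scores = [-((x - query) ** 2) / temperature for x, _ in context]
    # Numerically stable softmax over the scores.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    # Output is the attention-weighted average of the context labels.
    return sum(w * y for w, (_, y) in zip(weights, context)) / total

# Two different "tasks", each specified purely by examples in the context.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
double_task = [(x, 2 * x) for x in xs]   # examples of y = 2x
negate_task = [(x, -x) for x in xs]      # examples of y = -x

# Same function, same fixed temperature, different behavior per context.
pred_double = attention_predict(double_task, 2.0)
pred_negate = attention_predict(negate_task, 2.0)
print(pred_double, pred_negate)
```

The same function produces different input-output behavior depending only on which examples are in the context, which is the flavor of the claim in the post. A real Transformer does this with learned attention over token activations rather than a hand-written similarity.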