注册并分享邀请链接,可获得视频播放与邀请奖励。

Andrej Karpathy (@karpathy) “Remember the llm.c repro of the GPT-2 (124M) training run? It took 45 min on 8xH” — TopicDigg

Andrej Karpathy 的个人资料封面
Andrej Karpathy 的头像
Andrej Karpathy
@karpathy
I like training large deep neural nets. MTS @ Anthropic. Previously Director of AI @ Tesla, founding team @ OpenAI, PhD @ Stanford.
加入 April 2009
1.1K 正在关注    2.7M 粉丝
Remember the llm.c repro of the GPT-2 (124M) training run? It took 45 min on 8xH100. Since then, @kellerjordan0 (and by now many others) have iterated on that extensively in the new modded-nanogpt repo that achieves the same result, now in only 5 min! Love this repo 👏 600 LOC
显示更多
0
50
4.2K
391
转发到社区