Andrej Karpathy (@karpathy) “(I cycle through all LLMs over time and all of them seem to do this so it's not”

Andrej Karpathy

@karpathy

I like to train large deep neural nets. Previously Director of AI @ Tesla, founding team @ OpenAI, PhD @ Stanford.

Joined April 2009

1.1K Following 2.5M Followers

Andrej Karpathy @karpathy

2026.03.25 16:22

(I cycle through all LLMs over time and all of them seem to do this so it's not any particular implementation but something deeper, e.g. maybe during training, a lot of the information in the context window is relevant to the task, so the LLMs develop a bias to use what is given, then at test time overfit to anything that happens to RAG its way there via a memory feature (?))


