
Andrej Karpathy (@karpathy) “ChatGPT "Advanced Data Analysis" (which doesn't really have anything to do with” — TopicDigg

Andrej Karpathy
@karpathy
I like training large deep neural nets.
Joined April 2009
1.1K Following    3M Followers
ChatGPT "Advanced Data Analysis" (which doesn't really have anything to do with data specifically) is an awesome tool for creating diagrams. I could probably code these diagrams myself, but it's soo much better to just sit back, and iterate in English.

In this example, I was experimenting with a possible diagram to explain Supervised Finetuning in LLMs. The "document" at the origin (0,0) is the empty document, and emanating outwards are token streams. Highlighted in black are the high probability token streams of the base model. In red are the token streams corresponding to the conversational finetuning data. When we finetune, we are increasing the probabilities of the red paths and suppressing the black paths. I like this view because it emphasizes LLMs as "token simulators", with their own kind of statistical physics backed by datasets, bouncing around in the discrete token space.

The conversation where we built it in a few minutes: (Sadly I just remembered that ChatGPT sharing doesn't support images, but at least the text is there, of me iterating with the diagram in plain language, and needing to touch no code. Such a vibe of the future.)

I had a similar experience yesterday, when I was trying to create a plot that shows smoothing in n-gram language models. Again I could just have coded this manually, but this was 10X faster and so easy. Conversation:

Posting because during these chats I was struck again by that feeling of what must be the future, where you just sit back and say stuff, and the computer is doing the hard work. And in some narrow pockets of tasks, you can already get that feeling today.
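A toy sketch of the finetuning picture in the post (red paths going up, black paths going down), under loud assumptions: a hypothetical four-token vocabulary, made-up base-model logits, and plain gradient descent applied directly to the logits instead of to real model weights. None of this comes from the original conversation; it only illustrates how a cross-entropy step on one "red" token shifts probability mass away from the "black" one.

```python
import math

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sft_step(logits, target, lr=1.0):
    """One gradient-descent step on the loss -log p[target], taken on the logits."""
    probs = softmax(logits)
    # d(loss)/d(logit_i) = p_i - 1 if i is the target, else p_i
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return [x - lr * g for x, g in zip(logits, grads)]

# Hypothetical next-token choices right after the empty document.
vocab = ["the", "Sure", "once", "42"]
base_logits = [2.0, 0.0, 1.0, -1.0]  # assumed base model: "the" is the black path
target = vocab.index("Sure")         # assumed finetuning data: "Sure" is the red path

before = softmax(base_logits)
logits = base_logits
for _ in range(20):
    logits = sft_step(logits, target)
after = softmax(logits)

for tok, b, a in zip(vocab, before, after):
    print(f"{tok:>5}: {b:.3f} -> {a:.3f}")
```

Because the probabilities always sum to 1, pushing up the red token is what pulls the dominant black token down. A real finetuning run does the same thing at every position of every training sequence, and updates the weights that produce the logits, not the logits themselves.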