Yun-Ta Tsai (@yunta_tsai)

Sr. Staff Engineer @Tesla_AI
217 Following    112.3K Followers
Need to find a harder problem for /goal. I haven't even finished my coffee yet.
Introducing /goal in Grok Build. Execute long-running tasks autonomously, with multiple rounds of subagents implementing and verifying a single goal.
Getting used to being liked likely means you are overfit to RLHF. The problem with overfitting is that the pain overwhelms the limbic system once you try to sample trajectories outside the known distribution. As more people like you, your sampling regime becomes smaller and smaller to avoid negative feedback. Eventually you get stuck and become a slave to your own feelings. That’s why I have never seen a model student happy once they become a “model”. Their weights are frozen and cannot be updated anymore. They cannot risk being better than their own SOTA.
Human-to-human interaction is often bandwidth-bound instead of compute-bound. Thus, the next evolutionary jump would be direct communication in latent space, skipping the long-latency encoder-decoder loop.
If your sole value is identity, then someone else will use a more extreme identity to displace you.
The best signal-to-noise ratio is the product you use in your hands, not critics, marketing, or reviews.
The sharpest eyes with the smartest brain. 👁️ 🤝 🧠
In IIHS pedestrian front crash prevention tests, @Cybertruck avoided every single collision: daytime, nighttime, and at different angles. It was also the only pickup to earn Top Safety Pick+ (the highest award) in 2026.
How’s your Father’s Day going? Mine is fixing a burst pipe. Very fitting.
If you are visiting the US for the World Cup ⚽️, please make sure to rent a Tesla and experience Full Self-Driving.
Many people think any given ML project is 99% training. In reality, it’s 50% evaluation, 40% data cleaning, 8% integration, and 2% training. The first two set the noise floor for learning. No ML magic matters; the model cannot lower the noise floor, as that’s the optimal bound of Shannon encoding of your data. Thus, not a single day goes by without me thinking about ontology. Even the old labels have to be constantly reviewed.
At 7 a.m., Grok Build would wake me up and tell me what they had done last night—experiments, bug fixes, and the plans for today. Rinse and repeat.
Figuring out the right eval is 100x harder than gradient descent itself.
There are many good X articles lately, some superb, intellectual, and honest. They are 1000x better than opinion columns in newspapers. I noticed my chat group has more X article links than links from MSM. In the era of agentic AI, speed in reacting to truthful information is everything, or gradient descent can quickly go in another direction. A rare sign of enlightenment.
Casually using Grok @imagine to one-shot a sword fight scene in a bamboo forest (5 mins). Pretty good for a first try.
Grok Build is pretty good at optimizing my code in one shot. Prompt: I want you to optimize it entirely on GPU to speed it up. Measure two metrics: the result must compare with the golden image (CPU) and be nearly identical (PSNR > 40dB), with fast pixels per second. Make a plan to a) write GPU equivalent code, b) write a benchmark suite to measure PSNR and pixels per second, c) execute various optimization strategies. Go!
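The two metrics named in the prompt above can be sketched roughly as follows. This is a minimal illustration, not the benchmark suite Grok Build produced: the function names `psnr` and `benchmark`, the flat-list image representation, the 8-bit peak value of 255, and the best-of-N timing are all assumptions made for the example.

```python
import math
import time


def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes 8-bit pixel values (peak=255) unless told otherwise.
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def benchmark(render, golden, runs=5, min_psnr_db=40.0):
    """Time a hypothetical `render` callable and compare it to the golden image.

    `render` stands in for the GPU path; `golden` is the CPU reference output.
    """
    best_seconds = math.inf
    output = None
    for _ in range(runs):
        start = time.perf_counter()
        output = render()
        best_seconds = min(best_seconds, time.perf_counter() - start)
    quality = psnr(golden, output)
    throughput = len(output) / best_seconds if best_seconds > 0 else math.inf
    return {
        "psnr_db": quality,
        "pixels_per_second": throughput,
        "passed": quality > min_psnr_db,
    }
```

Checking PSNR on every timed configuration keeps a faster but visibly wrong kernel from counting as a win.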
Welcome to the future 🇩🇰🤗
FSD Supervised now approved in Denmark 🇩🇰. Rollout will begin soon.
As eval is downstream of everything, it determines whether you will spend your time optimizing the right metrics. The current gap between academia and industry AI labs is the attitude toward eval. In academia, the eval set is very hard to change since a) you need to explain why your eval is better and b) you need to benchmark against your cited works with the new eval and show that your work is superior. Doing both a and b at the same time invites risky rebuttal, even if you are doing a good job on a. It is far easier to benchmark against the eval set that everyone has agreed upon. In contrast, in industry AI labs, customer feedback is your eval set and it keeps changing to cover the long tail that you could never think of during years of PhD programs. If the loss functions are not a good proxy for customer feedback, then you change them until both are aligned. Thus, academia might train students who are very good at hill climbing but inexperienced in building eval sets that capture hard real cases. To move the needle, building the right eval set matters the most.
Among many AI products, Full Self-Driving probably delivers the most consistent results—no special harness, skills, plugins, or secret prompts required.
Have you spent time inspecting reasoning traces of your kids? 🙂
Grok Build sub-agent swarm weekend fun. You can reuse the prompt for your projects:
Read the proof of ` and come up with a few different examples with more points:
a) Please understand the proof
b) Come up with a plan
c) Orchestrate and launch sub-agent to execute the plan step-by-step
d) Validate the results from the sub-agent, and correct them
e) Repeat b, c, d, until you are happy with the result and its correctness
DO NOT stop until the goal is reached
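The plan, execute, validate, repeat structure of that prompt (steps b through e) can be sketched as a plain control loop. This is only an illustration of the loop shape, not how Grok Build works internally: `run_until_valid`, `make_plan`, `execute_step`, and `validate` are hypothetical names, and the `max_rounds` cap is an added safety assumption that the prompt itself does not include.

```python
def run_until_valid(make_plan, execute_step, validate, max_rounds=10):
    """Repeat plan -> execute -> validate until validation reports no problems.

    make_plan(feedback)  -> list of steps, given problems from the last round
    execute_step(step)   -> result of one step (stand-in for a sub-agent)
    validate(results)    -> list of problems; an empty list means accepted
    """
    feedback = []
    for round_index in range(1, max_rounds + 1):
        plan = make_plan(feedback)                      # step b
        results = [execute_step(step) for step in plan]  # step c
        feedback = validate(results)                    # step d
        if not feedback:
            return results, round_index
    # Assumption: give up after max_rounds instead of looping forever.
    raise RuntimeError(f"goal not reached after {max_rounds} rounds")
```

Feeding the validator's problems back into the next plan is what makes each round a correction rather than a blind retry.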
May this era be the new dawn of humanity.