Yun-Ta Tsai (@yunta_tsai)

2026.06.25 04:06

Need to find a harder problem for /goal. I haven't even finished my coffee yet.

2026.06.22 16:28

Introducing /goal in Grok Build. Execute long-running tasks autonomously, with multiple rounds of subagents implementing and verifying a single goal.

340


Yun-Ta Tsai@yunta_tsai

2026.06.25 03:58

Getting used to being liked likely means you are overfit to RLHF. The problem with overfitting is that the pain overwhelms the limbic system once you try to sample trajectories outside the known distribution. As more people like you, your sampling regime becomes smaller and smaller to avoid negative feedback. Eventually you get stuck and become a slave to your own feelings. That’s why I have never seen a model student happy once they become a “model”. Their weights are frozen and cannot be updated anymore. They cannot risk being better than their own SOTA.

513

Yun-Ta Tsai@yunta_tsai

2026.06.24 06:36

Human-to-human interaction is often bandwidth-bound instead of compute-bound. Thus, the next evolutionary jump would be direct communication in latent space, skipping the long-latency encoder-decoder loop.

1.3K

Yun-Ta Tsai@yunta_tsai

2026.06.24 06:09

If your sole value is identity, then someone else will use a more extreme identity to displace you.

225


Yun-Ta Tsai@yunta_tsai

2026.06.24 02:03

The best signal-to-noise ratio is the product you use in your hands, not critics, marketing, or reviews.

708


Yun-Ta Tsai@yunta_tsai

2026.06.24 01:19

The sharpest eyes with the smartest brain. 👁️ 🤝 🧠

Tesla@Tesla

2026.06.24 01:09

In IIHS pedestrian front crash prevention tests, @Cybertruck avoided every single collision – daytime, nighttime & different angles. It was also the only pickup to earn Top Safety Pick+ (highest award) in 2026.

337

Yun-Ta Tsai@yunta_tsai

2026.06.21 18:05

How’s your Father’s Day going? Mine is fixing a burst pipe. Very fitting.

922


Yun-Ta Tsai@yunta_tsai

2026.06.21 03:56

If you are visiting the US for the World Cup ⚽️, please make sure to rent a Tesla and experience Full Self-Driving.

2.2K

148


Yun-Ta Tsai@yunta_tsai

2026.06.20 16:05

Many people think any given ML project is 99% training. In reality, it’s 50% evaluation, 40% data cleaning, 8% integration, and 2% training. The first two set the noise floor for learning. No ML magic matters; the model cannot lower the noise floor, as that’s the optimal bound of Shannon encoding of your data. Thus, not a single day goes by without me thinking about ontology. Even the old labels have to be constantly reviewed.
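One narrow way to see the noise-floor point: if the eval labels themselves are wrong some fraction of the time, even a perfect model cannot score above that ceiling. The sketch below is an illustration only, assuming binary labels and a uniform label-flip rate; `measured_accuracy` and `flip_rate` are hypothetical names, not anything from the post.

```python
import random

def measured_accuracy(n: int, flip_rate: float, seed: int = 0) -> float:
    """Accuracy a perfect model appears to reach on a noisily labeled eval set."""
    rng = random.Random(seed)
    # Ground truth that nobody observes directly.
    truth = [rng.randint(0, 1) for _ in range(n)]
    # Eval labels: each one is flipped with probability flip_rate (assumed noise model).
    noisy = [1 - y if rng.random() < flip_rate else y for y in truth]
    # A hypothetical perfect model predicts the ground truth exactly.
    predictions = truth
    return sum(p == y for p, y in zip(predictions, noisy)) / n

if __name__ == "__main__":
    for rate in (0.0, 0.05, 0.10):
        print(f"flip_rate={rate:.2f} -> measured accuracy {measured_accuracy(20000, rate):.3f}")
```

With a 10% flip rate the measured score lands near 0.90 no matter how good the model is, so cleaning labels moves the ceiling in a way that more training cannot.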

516

10.2K

1.2K

Yun-Ta Tsai@yunta_tsai

2026.06.17 14:54

At 7 a.m., Grok Build would wake me up and tell me what they had done last night—experiments, bug fixes, and the plans for today. Rinse and repeat.

378


Yun-Ta Tsai@yunta_tsai

2026.06.15 06:08

Figuring out the right eval is 100x harder than gradient descent itself.

506


Yun-Ta Tsai@yunta_tsai

2026.06.14 20:54

There are many good X articles lately, some superb, intellectual, and honest. They are 1000x better than opinion columns in newspapers. I noticed my chat group has more X article links than links from MSM. In the era of agentic AI, speed in reacting to truthful information is everything, or gradient descent can quickly go in another direction. A rare sign of enlightenment.

254

3.7K

344

Yun-Ta Tsai@yunta_tsai

2026.06.14 05:11

Casually using Grok @imagine to one-shot a sword fight scene in the bamboo forest (5 mins). Pretty good for the first try.

120

530


Yun-Ta Tsai@yunta_tsai

2026.06.10 05:26

Grok Build is pretty good at optimizing my code in one shot. Prompt: I want you to optimize it entirely on GPU to speed it up. Measure two metrics: the result must compare with the golden image (CPU) and be nearly identical (PSNR > 40dB), with fast pixels per second. Make a plan to a) write GPU equivalent code, b) write a benchmark suite to measure PSNR and pixels per second, c) execute various optimization strategies. Go!
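The two metrics named in the prompt can be sketched as follows. This is a minimal pure-Python illustration, not the benchmark suite Grok Build produced: it assumes images are flat sequences of 8-bit values (peak 255), and `psnr`, `pixels_per_second`, and `passes_gate` are hypothetical helper names.

```python
import math
import time

def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat pixel sequences."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)

def pixels_per_second(fn, pixels, repeats=5):
    """Best-of-N throughput of fn(pixels), in pixels per second."""
    best = math.inf
    for _ in range(repeats):
        start = time.perf_counter()
        fn(pixels)
        best = min(best, time.perf_counter() - start)
    return len(pixels) / best if best > 0 else math.inf

def passes_gate(reference, candidate, threshold_db=40.0):
    """True when the candidate is 'nearly identical' to the golden image (PSNR > 40 dB)."""
    return psnr(reference, candidate) > threshold_db

if __name__ == "__main__":
    golden = [10, 20, 30, 40]       # stands in for the CPU golden image
    gpu_result = [10, 21, 30, 40]   # stands in for the GPU output
    print(f"PSNR: {psnr(golden, gpu_result):.2f} dB, pass: {passes_gate(golden, gpu_result)}")
    print(f"throughput: {pixels_per_second(lambda p: [v * 2 for v in p], golden):.0f} px/s")
```

When timing real GPU code, the device usually has to be synchronized before the clock is read, otherwise the measurement only covers the kernel launch.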

212

904

167

Yun-Ta Tsai@yunta_tsai

2026.06.09 15:03

Welcome to the future 🇩🇰🤗

Tesla Europe, Middle East & Africa@teslaeurope

2026.06.09 14:14

FSD Supervised now approved in Denmark 🇩🇰. Rollout will begin soon.

902


Yun-Ta Tsai@yunta_tsai

2026.06.08 05:06

As eval is downstream of everything, it determines whether you will spend your time optimizing the right metrics. The current gap between academia and industry AI labs is the attitude toward eval. In academia, the eval set is very hard to change since a) you need to explain why your eval is better and b) you need to benchmark against your cited works with the new eval and show that your work is superior. Doing both a and b at the same time invites risky rebuttal, even if you are doing a good job on a. It is far easier to benchmark against the eval set that everyone has agreed upon. In contrast, in industry AI labs, customer feedback is your eval set and it keeps changing to cover the long tail that you could never think of during years of PhD programs. If the loss functions are not a good proxy for customer feedback, then you change them until both are aligned. Thus, academia might train students who are very good at hill climbing but inexperienced in building eval sets that capture hard real cases. To move the needle, building the right eval set matters the most.

262

Yun-Ta Tsai@yunta_tsai

2026.06.07 21:05

Among many AI products, Full Self-Driving probably delivers the most consistent results—no special harness, skills, plugins, or secret prompts required.

464


Yun-Ta Tsai@yunta_tsai

2026.06.07 17:02

Have you spent time inspecting reasoning traces of your kids? 🙂

342


Yun-Ta Tsai@yunta_tsai

2026.05.24 15:02

Grok Build sub-agent swarm weekend fun. You can reuse the prompt for your projects:

Read the proof of ` and come up with a few different examples with more points:
a) Please understand the proof
b) Come up with a plan
c) Orchestrate and launch sub-agents to execute the plan step by step
d) Validate the results from the sub-agents, and correct them
e) Repeat b, c, d until you are happy with the result and its correctness
DO NOT stop until the goal is reached
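Steps b) through e) of the prompt describe a plan, execute, validate, repeat loop. A rough sketch of that control flow follows; it is not how Grok Build works internally. `run_goal`, `plan`, `execute`, and `validate` are hypothetical callables, and `max_rounds` is an added safety bound that the prompt itself does not have.

```python
from typing import Callable, List, Tuple

def run_goal(
    plan: Callable[[List[str]], List[str]],
    execute: Callable[[str], str],
    validate: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 10,
) -> List[Tuple[str, str]]:
    """Plan, run every step, validate, and re-plan from feedback until all steps pass."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        steps = plan(feedback)                                  # b) come up with a plan
        results = [(step, execute(step)) for step in steps]     # c) run each step (sub-agent stand-in)
        feedback = []
        for step, result in results:                            # d) validate every result
            ok, note = validate(step, result)
            if not ok:
                feedback.append(note)
        if not feedback:                                        # every step passed: goal reached
            return results
    raise RuntimeError("goal not reached within max_rounds")   # e) gave up after the bound

if __name__ == "__main__":
    def toy_plan(fb: List[str]) -> List[str]:
        return ["draft"] if not fb else ["draft, fixed"]

    def toy_validate(step: str, result: str) -> Tuple[bool, str]:
        return ("fixed" in result, "needs fix")

    print(run_goal(toy_plan, lambda step: step, toy_validate))
```

Feeding the validator's notes back into the planner is what makes each round different from the last; without it the loop would simply retry the same plan.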
