Yann LeCun (@ylecun) “I started using the concept in 2016 (e.g. in my NIPS 2016 keynote, in which I cal”

2026.04.30 17:00

I started using the concept in 2016 (e.g. in my NIPS 2016 keynote, in which I called it a "world simulator"). I published papers on video prediction in 2016. This was meant to be a key step toward training world models. Ha & Schmidhuber appeared in 2018. The slide below is from a talk I gave at Brown in Nov 2017. Full deck here:

We were hoping to train world models through video prediction. At the time, we were using generative architectures. We tried latent-variable models and GAN-style training. But this never quite worked on natural video.

Around 2021, I realized that predicting at the pixel level was not a good idea. That's when the JEPA concept emerged: find an abstract representation within which predictions are performed.
