@aaronburnett Truly massive gains will come in ~3 months, when the entire training and inference stack is written in C/C++ and massively simplified (most software layers will be deleted completely), and we exact-map Grok to work incredibly well on a GB300.