Very exciting to see the cool result! Same pattern, different physics: Codex-grown heuristics matching or beating DRL agents in fluid dynamics, while staying readable, maintainable, and transferable.
Heuristics were not dead. They were under-maintained.
Codex grew programmatic policies with no neural nets: max score on Breakout, and SOTA-level scores on MuJoCo.
Maybe heuristics were not too weak. Maybe they were just too expensive to maintain. Maybe it's the next paradigm.