How to build *TruthGPT*? I listened to a talk by the legendary
@johnschulman2. It's densely packed with deep insights. Key takeaways:
- Supervised finetuning (or behavior cloning) makes the model prone to hallucination, while RL mitigates it.
- NLP is far from done!
1/🧵