Meet Vision Banana 🍌 from
@GoogleDeepMind! We provide strong evidence that image generators are generalist vision learners. Traditional computer vision tasks (segmentation, depth estimation, normal prediction) can now be performed at/near SOTA with a single generalist model derived from an image generation model.
🖼️ Explore the results:
📄 See details at: