🌟Introducing 🎻Violin: an open-source video translation skill.
📹Video is the dominant medium on the internet, yet most high-quality content (lectures, talks, podcasts) is locked into a single language, leaving global audiences behind.
So we built Violin: a video skill that combines speech recognition, LLM translation, and speech synthesis into one seamless pipeline.
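For readers curious what "one pipeline" means in practice, here is a minimal sketch of the three stages chained together. It is not Violin's actual code: `Segment`, `DubbedClip`, `translate_video`, and the `fake_*` stand-ins are hypothetical names used only for illustration, and a real setup would plug in actual ASR, LLM, and TTS backends.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """A timed chunk of recognized speech (hypothetical structure)."""
    start: float
    end: float
    text: str


@dataclass
class DubbedClip:
    """Synthesized audio for one translated segment (hypothetical structure)."""
    start: float
    end: float
    audio: bytes


def translate_video(
    audio: bytes,
    transcribe: Callable[[bytes], List[Segment]],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str], bytes],
    target_lang: str,
) -> List[DubbedClip]:
    """Chain ASR -> translation -> TTS, keeping each segment's timing."""
    clips = []
    for seg in transcribe(audio):                    # 1. speech recognition
        translated = translate(seg.text, target_lang)  # 2. LLM translation
        clips.append(DubbedClip(seg.start, seg.end, synthesize(translated)))  # 3. TTS
    return clips


# Stand-in backends so the sketch runs without any external service.
def fake_asr(audio: bytes) -> List[Segment]:
    return [Segment(0.0, 1.5, "hello"), Segment(1.5, 3.0, "world")]


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


dubbed = translate_video(b"", fake_asr, fake_translate, fake_tts, "es")
```

Passing the three stages in as plain callables keeps the sketch backend-agnostic: swapping a model only means swapping one function.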
🌐 Demo:
📝 Blog:
🔗 GitHub:
✨Key Features:
🎙️High-quality multilingual ASR & Translation & TTS.
🗣️Personalized translation & voice (e.g., turn an academic talk into something children can follow).
💬Chat with the video: ask any question and get answers grounded in the video.
🧩Supports web app, CLI, and agent skill.
🍃Fully open-source under MIT.
❤️Built with the wonderful @ShangZhu18 and advised by @james_y_zou!
All features powered by @togethercompute.
Try it and let us know what you think! 🎻