Simon Willison (@simonw) “Microsoft's MIT licensed VibeVoice speech-to-text model (think Whisper with spea” — TopicDigg

Simon Willison
@simonw
Creator @datasetteproj, co-creator Django. PSF board. Hangs out with @natbat. He/Him. Mastodon: Bsky:
Joined November 2006
5.6K Following    179K Followers
Microsoft's MIT licensed VibeVoice speech-to-text model (think Whisper with speaker diarization) is really good - my notes on running the 5.71GB 4bit MLX conversion on an M5 MacBook, using about 60GB of RAM at peak and transcribing 1hr of audio in ~9 mins