Search for TTS-related posts and users

2 hours ago

Grok's realtime voice is now on AI Gateway. Build with AI SDK 7:
• xai/grok-voice-think-fast-1.0 (useRealtime)
• xai/grok-tts (generateSpeech)
• xai/grok-stt (transcribe)

0

6

114

13

QVAC@qvac

8 hours ago

QVAC SDK 0.14.0 is live. This release makes the on-device stack faster on mobile, ships the developer-agent path, and takes local text-to-speech to 31 languages.

Main highlights:
- OpenCode and OpenClaw. The first official OpenCode plugin, plus a maintained OpenClaw compatibility path, both built on managed mode and qvac serve. Point a coding agent at a local model with far less setup and far fewer surprises.
- Brain-computer interface transcription, on the SDK. Take recorded neural signal data and decode it into text, fully on-device, no cloud. Stream it in chunks through a simple API. In 0.14 it runs GPU-accelerated on iOS.
- Text to Speech in 31 languages with our Supertonic3 upgrade.

VOICE AND SPEECH
- Supertonic3 multilingual TTS, 5 languages to 31.
- Chatterbox and Supertonic now run on the Android GPU, with lower memory use (especially on iOS), quantized s3gen Chatterbox support, and a fix for Chatterbox occasionally emitting random speech.
- Whisper transcription now runs on the iOS GPU. Parakeet runs on the Android GPU, with steadier real-time streaming.

VISION AND OCR
- VLM multi-tile batching: high-resolution Pan and Scan images are encoded in one pass instead of tile by tile, for faster vision throughput.
- OCR on ggml (EasyOCR and DocTR) reaches full speed parity with the onnx path, across Metal, OpenCL, and Vulkan.

PLATFORM AND RELIABILITY
- Dynamic compute backends on Linux: one build picks the right backend at runtime, and opens the door to ROCm and CUDA support without per-backend builds.
- Thinking tokens are kept out of the model context, so reasoning no longer fills the KV cache.

SDK 0.14.0 is now leaner and faster to start. Let’s build.

0

1

15

1

Berryxia.AI@berryxia

2026.06.29 08:20

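I saw a product on Product Hunt's daily leaderboard, and it reminded me of a tool made by @xiaoerzhan (Xiaoer); the lifetime version of this software is $5. So for everyone's small vibe-coding products, the real key is still good marketing and finding customers; otherwise, as the saying goes, even good wine fears a deep alley. PS: This post is only a demo of my own product. I built a skill that takes product copy or a link and produces a marketing explainer video. The TTS audio uses Xiaomi's model, and it sounds pretty decent.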

0

1

3

0

Hairo (✱,✱)@haiwed3

2026.06.29 02:09

I want a $200 payout, but I'm only getting 20k to 30k views like this. I see everyone else posting consistently and their numbers go up; I post consistently too, so why do mine keep dropping?

0

40

38

1

Su@Sukiea1008

2026.06.26 13:03

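🔥 No Jianying (剪映)! No need to appear on camera! You can make AI videos too! I put together a Skill for producing "AI image-card videos".

All you need to provide:
1️⃣ A piece of copy
2️⃣ An audio track; you can record it yourself or use any TTS
3️⃣ An AI agent, such as Codex

It will then turn these into a video through this chain:
→ HTML/CSS image cards
→ HyperFrames / GSAP motion
→ FFmpeg audio merge
→ MP4 video

Just send the whole GitHub repository link to the agent, have it read the README and skill/SKILL.md, and it will follow the workflow there to edit the cards, run the checks, and generate the video.

What does it include?
✅ Help getting a 16:9 image-card video MVP running end to end
✅ A demo video without voice-over
✅ A Skill file readable by Codex/agents
✅ A local check script
✅ A local render script

What does it not include?
❌ No voice-cloning API
❌ No automatic script writing
❌ No automatic publishing

Who is it for? People who want to first get the "copy → image cards → motion → video" chain working. 🚀 Especially people who have a point of view but don't want to produce videos themselves.

Once you have the skill, don't chase complexity at first; get the smallest closed loop running. Only once it can generate something is there anything to optimize. From there you can keep iterating on it, and even generate longer videos.

I put the GitHub link in the comments.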

0

6

22

4
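
A rough illustration of the final "FFmpeg audio merge → MP4" step from the post above, as a minimal sketch only. The post does not show the repository's actual render script, so everything here is an assumption: the function name `build_ffmpeg_command`, the file names, and the simplification of one still card image plus one audio track (the real workflow renders animated cards). Only standard ffmpeg options are used.

```python
import shlex


def build_ffmpeg_command(card_image: str, audio: str, output: str) -> list[str]:
    """Build an ffmpeg argument list: one still card + one audio track -> MP4.

    Hypothetical helper for illustration; not taken from the repository in the post.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-loop", "1",            # loop the single image so it becomes a video stream
        "-i", card_image,        # input 0: the rendered card (assumed PNG export)
        "-i", audio,             # input 1: recorded or TTS narration
        "-c:v", "libx264",       # encode video as H.264
        "-tune", "stillimage",   # x264 tuning for static content
        "-pix_fmt", "yuv420p",   # pixel format most players accept
        "-c:a", "aac",           # encode audio as AAC
        "-shortest",             # stop when the shorter input (the audio) ends
        output,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_command("card.png", "voice.mp3", "out.mp4")
    # Print the shell-quoted command; pass `cmd` to subprocess.run(cmd, check=True)
    # to actually render, assuming ffmpeg is installed and the inputs exist.
    print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids shell-quoting problems when file names contain spaces, and keeps the step easy for an agent to inspect before running it.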

🐹🐰@fanclubmillHAON

2026.06.24 13:29

Hahaha, so adorable. She finished singing, went to change outfits, and then got an award before she could even brace for it. The stairs were really high too 🤟🏻 นวย is so talented #milli#

0

62

19.1K

6.7K

Orange AI@oran_ge

2026.06.23 05:05

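The Seedance moment for audio models has finally arrived. Today I tried a brand-new audio model that is unlike every audio model before it. Earlier audio models were generally called TTS (text-to-speech): they could only synthesize speech from the text you gave them, which made them more like reading machines than intelligent audio models. This new model, however, can generate any sound you need from what you imagine, including voices, music, sound effects, and ambient sound, along with the subtle, hard-to-put-into-words details those sounds carry. Its name is the Doubao audio generation model, Seed Audio 1.0. In my view, this is the Seedance moment for audio models. Just as Banana was the first time humans gave intelligence to images, Seed Audio is the first time humans have given intelligence to sound. Next, let's listen to what makes it special. Twitter can't post audio, so head over to the WeChat official account to listen.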

0

71

424

53

X Freeze@XFreeze

2026.06.18 17:19

Grok TTS is already sounding insanely human. In Vapi’s blind-voting Humaneness Index, Grok TTS ranked as the top AI voice model in the chart with a humaneness score of 96, just 4 points below the real human benchmark.
• Top AI voice model shown
• 96/100 humaneness score
• Only 4 points behind the human benchmark
What makes this even more impressive is that Grok TTS combines natural-sounding speech with low latency and aggressive pricing. The gap between AI-generated speech and real human voices is disappearing faster than most people realize. Grok is starting to speak like a real person.

0

16

60

16

Kitaro_綺太郎✨💞@kitaro_cos

2026.06.18 13:01

#NIKKE# #TTStar# #ANIS# #アニス# ☀️ I tried dancing as Anis ☀️ Anis, bursting with energy, is just too cute…! I'd be happy to get your comments, likes, and RTs 💛

0

17

0

AIGCLINK@aigclink

2026.06.18 10:23

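NetEase Youdao's latest TTS: Confucius4-TTS. Its core capability is zero-shot cross-lingual voice cloning: given a reference audio clip, it can make the same speaker talk in 14 languages while keeping the timbre stable and reproducing emotion with fairly high fidelity. Judging by how it sounds, the cross-lingual output has no obvious residual accent. It covers 14 languages: Chinese, English, Japanese, Korean, German, French, Spanish, Indonesian, Italian, Thai, Portuguese, Russian, Malay, and Vietnamese. It is good enough for scenarios such as news broadcasting, customer service, and audiobooks. #TTS# #Confucius4TTS#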

0

1

6

3
