Search posts and users related to Orchestra

2026.06.11 12:14

$AMD | The FOMO to buy @AMD chips is NOW 🧵

Not Financial Advice! DYOR! Research Purpose Only!

The Inference Queen is the biggest winner in agentic AI: all other CPUs are struggling to compete with a two-year-old EPYC Turin, and EPYC Venice is in its mass-production phase. AMD stresses deployability today on standard x86 platforms (no proprietary architectures required), full software compatibility, and open standards. This positions Venice + Helios as a practical, high-density alternative to competing solutions, while underscoring that agentic AI shifts the balance toward CPU-rich racks alongside GPUs and, most importantly, lowers the cost per token to accelerate adoption and innovation.

Context: @WSJ published an article yesterday reporting that @OpenAI is considering drastically lowering token prices to win more customers from Anthropic. The narrative with which "they" are trying to exacerbate the current AI selloff won't last long. It rests on a fundamental misunderstanding of what is going on, which I have been discussing for months and years. Followers and subscribers have known for years that this day would come, when token cost becomes the central discussion among enterprises, because there is no such thing as an unlimited budget or Tokenmaxxing when they use $NVDA chips or hyperscalers' in-house chips. I will link various threads if you are interested in the full picture, from the supply chain to TSMC's recent rapid 2nm expansion to as many as 12 fabs in total by 2027/2028.

Hyperscalers and AI natives effectively have no choice but to buy more AMD systems for agentic AI, given AMD's leadership in economical, power-aware, high-volume internal and agentic use. However, with supply far behind demand, a multi-vendor reality alongside in-house chips will drive faster industry progress, lower overall costs, and better sustainability. NVIDIA’s Vera Rubin cannot compete with a two-year-old EPYC Turin, but AMD under Dr. Lisa Su has engineered the lowest cost per million tokens, highly competitive energy-efficient solutions, and superior CPU orchestration for agentic AI at scale with Helios. Dr. Su has championed this shift since at least 2023, foreseeing the rise of agentic workflows that demand far more orchestration, parallel agents, and balanced compute well before the industry fully embraced it. Her long-term vision of AI moving from simple prompts to always-on, multi-agent systems has driven AMD’s investments in high-core EPYC CPUs and integrated rack-scale solutions, perfectly positioning the company for today’s realities.

The OpenAI-AMD 1GW Helios deployment (starting H2 2026) represents a pivotal vertical integration move that directly supercharges the inference economics. This isn't incremental; it's a structural shift toward ownership of massive, optimized rack-scale capacity, enabling the lowest token costs and triggering the enterprise adoption flywheel.

We need to be honest: $AMD is the only company that has bet big on inference since the day ChatGPT became a sensation, while $NVDA and others were betting big on training. At the end of the day, the token bill from @AnthropicAI has to obey economics: as bills rise, companies have to get more out of them to justify the cost. There cannot be an unlimited inference budget; the spend has to show up in efficiency, profitability, and operating leverage.

1. Tokenomics

Once you understand this, you will understand why Citi said @AnthropicAI is likely to sign a deal with $AMD, along with hyperscalers, AI labs, and sovereign AI projects such as Softbank's 5GW in France and those in many other countries. However, OpenAI and $META now want faster deployment, and because they are now AMD shareholders, they have prioritized allocation. Anthropic and the hyperscalers just cannot compete when Helios racks lower token cost to $0.0003–$0.0005 per million tokens at GW scale.
Cost to build a 1GW data center:
~1GW Helios Rack full build: estimated $30-$35B
~1GW Rubin Rack full build: estimated $45-$55B

Inference (cost per million tokens):
~$NVDA B200 / HGX: ~$0.02–$0.08 on optimized workloads (FP4/MXFP4, speculative decoding). Significant improvement over Hopper but still premium-priced. GB200 NVL72 rack-scale: $0.05–$0.25+
~$AMD Helios Racks: $0.0003-$0.0005 per M tokens, dramatically lower than NVIDIA equivalents in owned infra. MI355X node-level: up to 40% more tokens per dollar vs. competing solutions (B200), driven by higher memory capacity (up to 288GB+ HBM), strong bandwidth, and lower acquisition costs.

Training:
~$NVDA Rubin Rack: estimated $0.7-$1.2/M tokens
~$AMD Helios Rack: estimated $0.65-$1.0/M tokens

Now OpenAI, META, and the hyperscalers can lower inference cost even further with the $AMD EPYC Venice "dense rack", or Agentic AI Rack. AMD published a detailed technical blog emphasizing that the future of agentic AI (autonomous, multi-step AI systems requiring heavy orchestration, databases, caching, APIs, and control planes) demands massive CPU-dense rack-scale infrastructure, not just GPUs. It prominently positions AMD's upcoming 6th Gen EPYC "Venice" processors as the key enabler for next-generation dense racks, delivering leadership throughput under real-world power, cooling, and density constraints.

~EPYC Venice (Zen 6 architecture, up to 256 cores / 512 threads per socket) is projected to deliver exceptional rack-level performance. In AMD’s modeled 100 kW rack comparisons, Venice-powered systems are expected to achieve ~3.30x the throughput of NVIDIA’s Vera (88-core Olympus) baseline across a broad mix of agentic-supporting workloads.
~This builds on current-generation 5th Gen EPYC "Turin" (up to 192 cores), which already delivers ~2.37x rack throughput vs. Vera and ~1.6x vs. Intel’s Xeon 6980P (128 cores).
~Liquid-cooled Turin deployments already support >27,000 CPU cores per rack today.
Venice is architected to push this beyond 36,000 cores in the same rack class, dramatically increasing concurrent agent capacity and overall infrastructure efficiency.

2. Ownership vs. renting compute from hyperscalers matters to OpenAI, and only owning $AMD chips can meaningfully lower token cost for enterprises.

~Eliminates cloud overhead: no provider margins, utilization buffers, or egress fees. Direct control over power contracts, cooling, scheduling, and orchestration at dedicated facilities.
~Helios optimizations at GW scale: rack-level density (1.4+ exaFLOPS FP8 per rack), high HBM4 bandwidth, EPYC orchestration for agentic workloads, and superior TCO/TDP. AMD's long-standing focus on tokens per dollar/watt shines here, with 20-40%+ efficiency edges in inference-heavy scenarios.
~At 1GW+ optimized deployment, inference hits $0.0003–$0.0005 per million tokens (community/analyst models tied to Helios metrics). This is dramatically lower than typical rented/cloud equivalents, especially for high-volume output tokens in agentic flows.

High token bills today: enterprises running heavy agentic/coding/analysis workloads can face $50-100M+/month at current API rates (flagship models at $5-30+/M output, scaled to massive volumes). After Helios compression, the same volume will drop to $10-15M/month (or better) as lower underlying costs are passed through as pricing flexibility, volume tiers, caching, or batch discounts.

ROI thresholds collapse. More companies greenlight pilots → production → massive scaling. Agentic AI (autonomous workflows) multiplies token demand exponentially, but affordability removes the friction.

OpenAI gains flexibility: unlike more cloud-dependent rivals (Anthropic), it can lower effective pricing, offer aggressive enterprise bundles, or absorb volume without margin destruction, directly tackling "high token bill" complaints while maintaining profitability as usage explodes.

3. Agentic AI Models Shifted the CPU:GPU Ratio to 1:1, Toward 3-5:1, with Explosively Token-Hungry Workloads

Agentic AI (autonomous, multi-step agents with planning, tool use, iteration, and self-correction) is fundamentally more compute- and token-intensive than conversational or single-turn generative AI. Agentic AI (autonomous, multi-step workflows with orchestration, tool use, parallel agents, data movement, and enterprise integration) has dramatically increased the importance of strong host CPUs alongside GPUs. This shifts the CPU-to-GPU ratio higher, toward 1:1 to 5:1, and makes balanced systems critical as enterprises test more than 5-10 agents.

AMD EPYC Venice excels:
~Leadership core density (up to 256 Zen 6 cores per socket) for running many agents in parallel, orchestration layers, and high-throughput control-plane tasks.
~Superior performance-per-core and power efficiency (up to 2.1x higher perf/core and 2.26x better SPECpower vs. NVIDIA Grace in benchmarks).
~Tight integration in Helios: one Venice CPU + multiple MI450 GPUs per node, enabling efficient data feeding to GPUs ("zero-copy"), parallel execution, and full rack utilization for complex agentic loops.

Hyperscalers (Meta, Microsoft, Amazon, Google, Softbank) and AI natives (OpenAI, Anthropic...) are adopting high-core EPYC at scale specifically for these agentic demands, as CPUs now handle a larger share of non-model work (orchestration, policy enforcement, tool calls). This complements AMD’s lower-cost GPUs for overall TCO wins.

~Agents often generate 10–100x+ more tokens per task due to iterative reasoning chains, multiple tool calls, verification loops, and long-context orchestration.
~Goldman Sachs forecasts token consumption multiplying 24x by 2030 (to 120 quadrillion tokens/month), largely driven by agentic adoption in consumer and enterprise.
~Enterprise data shows agent-pattern workloads growing at 680% annualized rates, projected to surpass conversational AI in token volume by Q3 2026.
~Daily enterprise agent token consumption is already in the billions, with complex workflows (coding, workflows, analysis) amplifying this dramatically.

4. Competitive Edge: Winning Customers from Anthropic

Anthropic’s Claude models (especially Opus/Sonnet) excel in complex reasoning and agentic coding, commanding premium positioning. However, their higher underlying costs (heavier reliance on third-party cloud with margins) limit pricing flexibility compared to OpenAI’s owned Helios capacity.

Anthropic is on track to generate $10.9 billion in Q2 revenue. The company expects to achieve its first-ever quarterly adjusted operating profit of $559 million. However, sustaining full-year profitability remains challenging due to immense computing and model training costs.

The truth is, Anthropic has no choice but to buy as many $AMD chips as possible if it wants to compete with OpenAI or get investors' attention. This 5% ratio of adjusted operating profit to revenue is just pathetic.

Current pricing dynamics (2026): OpenAI already undercuts on many tiers (flagship output tokens significantly cheaper than equivalent Claude Opus). Nano/mini models offer 5–10x advantages for volume work. Anthropic holds edges in long-context flat pricing and certain reasoning quality.

OpenAI after Helios rack ownership: at $0.0003–$0.0005/M effective costs, OpenAI gains massive headroom to:
~Aggressively discount high-volume agentic tiers or bundles.
~Offer “unlimited” enterprise plans or usage-based models that Anthropic struggles to match without margin erosion.
~Target cost-sensitive, high-throughput agent deployments (dev tools, automation platforms) where token bills explode.

Enterprises facing millions of dollars in monthly agentic bills will migrate to the provider delivering better economics at scale. OpenAI’s combination of strong models (o-series reasoning) + lowest TCO positions it to erode Anthropic’s enterprise share, especially as agentic becomes the dominant token consumer.
Cheaper tokens expand the total addressable market dramatically. This feeds the data/model improvement loop, justifying further capex. AMD benefits from proven scale pulling in more customers (Meta, Oracle, Microsoft, Amazon, Softbank, TensorWave, LumaAI ... already aligned on Helios).

Conclusion: Dr. Lisa Su has been laser-focused on inference economics since at least 2022–2023, repeatedly emphasizing that the real battleground for AI scalability would be TCO, power efficiency (TDP), and ultimately tokens per dollar and per watt, not just raw training FLOPS. While many viewed inference as a secondary, commoditized workload, Dr. Su architected AMD’s roadmap around rack-scale systems optimized for high-volume, sustained inference that would dominate as models matured and usage exploded. Helios represents the culmination of that multi-year bet: a fully integrated, open platform designed precisely for the economics of massive token throughput.

This deep, strategic partnership with OpenAI, starting with the 1GW Helios deployment in H2 2026 and scaling to 6GW, is the embodiment of that shared vision. Both companies foresaw a future where agentic AI models evolve to become extraordinarily token-hungry: autonomous agents executing complex, iterative workflows with planning, tool use, verification loops, and long-context reasoning. These workloads can consume 100x+ more tokens per task than traditional chat or single-turn generation, driving exponential demand as capabilities improve and enterprises deploy them at scale.

By owning and optimizing this massive Helios capacity at GW scale, OpenAI achieves inference costs as low as $0.0003–$0.0005 per million tokens. This structural cost advantage allows OpenAI to absorb the coming token explosion profitably, dramatically lower effective pricing for enterprises, and win high-volume agentic workloads from higher-cost competitors like Anthropic. What was once a prohibitive monthly token bill becomes an affordable accelerator for productivity and innovation.

The OpenAI-AMD alliance validates Dr. Su’s prescient strategy and turns the agentic flywheel into reality: collapsing inference costs → explosive token consumption → richer data and better models → even greater demand. This partnership doesn’t just address today’s economics; it positions both leaders at the center of the infrastructure buildout that will power AI’s next decade. By delivering the lowest inference economics at scale, OpenAI not only solves enterprise bill pain but gains a decisive weapon to win share from higher-cost rivals like Anthropic.

And that is why @OpenAI and $META will deploy the EPYC Dense Rack.

Not Financial Advice! DYOR! Research Purpose Only!
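
The bill arithmetic in the thread above can be reproduced with a short sketch. It takes the thread's own claimed figures at face value (a $50M monthly bill at $5 per million output tokens, dropping to $10M for the same volume); none of these numbers are verified here, and the function names are illustrative.

```python
# Back-of-envelope token-bill arithmetic using the thread's claimed figures.
# All inputs are the thread's claims, not verified data; names are illustrative.

def monthly_bill(tokens_per_month: float, price_per_million: float) -> float:
    """Dollar cost of a month's tokens at a given price per million tokens."""
    return tokens_per_month * price_per_million / 1_000_000


def implied_volume(bill: float, price_per_million: float) -> float:
    """Tokens per month implied by a bill at a given price per million tokens."""
    return bill * 1_000_000 / price_per_million


def implied_price(bill: float, tokens_per_month: float) -> float:
    """Price per million tokens implied by a bill and a monthly volume."""
    return bill * 1_000_000 / tokens_per_month


# A $50M/month bill at $5 per million output tokens implies this many tokens.
volume = implied_volume(50_000_000, 5.0)

# If the same volume cost $10M/month, this is the effective price per million.
new_price = implied_price(10_000_000, volume)

print(f"implied volume: {volume:.3e} tokens/month")
print(f"effective price after compression: ${new_price:.2f} per million tokens")
```

Under these assumptions, the claimed drop from $50M to $10M a month corresponds to a 5x lower effective price at constant volume.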

Polo1.4 贱🕊️@xiaojianjian567

2026.06.10 09:48

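POLO has a question for everyone: do you think Fable5 at $50 is expensive? Anthropic launched Claude Fable5 on June 9. It shares its underlying base with Mythos5 and was only released publicly once safety guardrails were in place. The community keeps saying it is too expensive. At $50 per million output tokens, a quick scroll through X shows a feed full of complaints. I would bet that 9 out of 10 of the people complaining about the price have never actually run Fable5. Whether it is expensive should be judged by how much work it can actually do, not by the $50-per-million figure. Fable5's main selling point is that the longer and more complex the task, the bigger its lead; it is not about being smarter on short answers. That positioning means whether it is worth the money depends heavily on how long your tasks run ⭐

I had Hermes Agent run a few rounds of real tasks, and the one that impressed me most was code review. I gave a 2,000-line monolithic Python file to Sonnet 4.5 and Fable5 with the same prompt to compare them. By round 12 of feedback, Sonnet started losing context and its suggestions began repeating earlier points. At round 18, Fable5 could still precisely cite an import on line 487 of the file and tell me where that import should move in the new split-up plan. That is not a "5% smarter" gap; it is the gap between usable and unusable.

The second test was a serial multi-agent chain. Three subagents work in relay: the first writes API docs, the second writes an SDK based on the docs, and the third writes sample code based on the SDK. The Sonnet chain showed instruction drift by the second round, and the third subagent had no idea why the second had designed things the way it did. The Fable5 chain stayed stable throughout, the references between the three subagents were clear, and the final output needed no rework.

The third was a document survey: give the AI 30 technical blog posts, have it distill them into a 5,000-character survey, and cross-check the cited sources. Fable5's citation accuracy was clearly higher than Sonnet's, with noticeably fewer hallucinations.

All three cases are long tasks in real-world scenarios, which is exactly Fable5's home turf. For short Q&A and single-step tool calls, Fable5 is not much better than Sonnet and is occasionally a bit slower. So whether $50 is expensive is not black and white: for short tasks it really is outrageously expensive; for long tasks, saving hassle is saving money 🎨

Not many people can afford Fable5 directly through Anthropic's official API. ZenMux has opened a limited-time top-up bonus channel; one account can call 200+ models, with no RPM limit and no throttling. Before topping up I was worried about getting fleeced; I only decided to try it after reading the docs and seeing that it is a genuine aggregator rather than a thin wrapper.

What really changes with this generation of Fable5 is the stability boundary of AI agents, not any single-point capability. Use it like Sonnet and $50 is expensive enough to make you curse. Use it as an orchestrator, let it run for 4 hours without dropping the ball, and $50 is like hiring the team a senior engineer with no social insurance to pay.

Link here. If you want to run evaluations hands-on, take a look: ZenMux currently has a limited-time PAYG bonus (top up $20 and get $10, top up $50 and get $30), valid for one week. One account runs 200+ models across the platform with no RPM limit and no throttling, which suits long Fable5 tasks well. I personally subscribe to their Builder plan: a fixed $20 monthly budget that also works with coding agents such as ClaudeCode / Codex / OpenClaw. The AI insurance payout is a bonus, so if something really goes wrong it is not a total loss. Link to the top-up promotion #ClaudeFable5# #HermesAgent# #AIAgent#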
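
The three-subagent relay described in the post above (API docs, then SDK, then sample code) can be sketched as a serial pipeline in which each stage receives the full upstream history, including the reasoning behind each artifact. This is a minimal illustration with stub functions standing in for model calls; `Handoff`, `run_chain`, and the stage functions are hypothetical names, not part of Hermes Agent or any Anthropic API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Handoff:
    """One stage's output plus the reasoning the next stage needs to see."""
    stage: str
    artifact: str
    rationale: str


# A stage takes every upstream handoff and returns its own.
Stage = Callable[[List[Handoff]], Handoff]


def write_api_docs(history: List[Handoff]) -> Handoff:
    # Stub for a model call; a real subagent would generate the docs.
    return Handoff("api_docs", "GET /items returns a list", "REST, read-only")


def write_sdk(history: List[Handoff]) -> Handoff:
    docs = history[-1]  # the API-docs handoff
    return Handoff(
        "sdk",
        f"def list_items(): ...  # from: {docs.artifact}",
        f"mirrors docs because: {docs.rationale}",
    )


def write_examples(history: List[Handoff]) -> Handoff:
    sdk = history[-1]  # the SDK handoff, including its design rationale
    return Handoff(
        "examples",
        "items = list_items()",
        f"follows SDK design ({sdk.rationale})",
    )


def run_chain(stages: List[Stage]) -> List[Handoff]:
    """Run stages in order, passing the accumulated history to each one."""
    history: List[Handoff] = []
    for stage in stages:
        history.append(stage(history))
    return history


history = run_chain([write_api_docs, write_sdk, write_examples])
for h in history:
    print(f"{h.stage}: {h.artifact}  [{h.rationale}]")
```

Carrying the rationale forward is one way to approach the drift the post describes, where the third subagent did not know why the second had made its design choices; whether it helps in practice depends on the model.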

Angelo Giuliano 🇨🇭🇮🇹@angeloinchina

2026.06.10 08:58

US Puppet Marcos Stages Constitutional Coup to Crush Duterte Nationalists

Three weeks after staged gunfire in the Philippine Senate, a fake constitutional crisis is exploding. Its goal: lock in US puppet Ferdinand “Bongbong” Marcos Jr. as unchallenged strongman while destroying Sara Duterte. The Senate is split under two rival leaderships who call each other fake. Marcos, son of the old dictator and controlled by Washington via his family’s frozen billions, is grabbing all the power. He rules by emergency orders, forces changes, and plans special sessions for his agenda.

This is no random mess. It is open war between Marcos, America’s eager servant building bases against China, and the Duterte camp’s fight for true independence and friendly ties with Beijing. The Dutertes refuse to make the Philippines America’s unsinkable aircraft carrier. So Marcos unleashes raw lawfare: fake plunder charges and repeated impeachment attacks on Sara Duterte, plus the sham International Criminal Court “trial” of Rodrigo Duterte, a US-controlled political weapon against leaders who defy Washington.

May 11 revealed the script. The House impeached Sara Duterte again. Pro-Duterte forces pushed back. Then came the staged shooting for chaos and cover. Pure theater. In June, the Marcos camp seized control, changed rules to weaken Duterte allies before Sara’s July impeachment trial, and called it legal. It is a clear coup.

The real goal is total control for Washington: emergency powers, war budgets, and constitutional changes to allow more foreign bases, military sites, missiles, and troops, turning the Philippines into the Ukraine of Southeast Asia. Peace and Chinese development are sacrificed for war risk. Classic imperial playbook: install the weak dynasty, destroy nationalists with lawfare and crisis, then rewrite the Constitution for permanent foreign control.

Marcos calls it “rule of law.” Real patriots see it as a slow US-orchestrated coup to kill the Duterte legacy and turn the country into a launchpad against China. The Supreme Court is the last defense. Not sure it will hold.

熊沢世莉奈@kuma_Seri

2026.06.06 11:31

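#シャインポスト# Orchestra LIVE🎻🎼 Thank you for coming! Everyone who came to see us, the live orchestral performance, the beautiful costumes: it was all a wonderful time❤️ I've gained yet another treasure🥰 It went by so fast that I still want to do more live shows! I'm filled with the desire to sing more🌹 Role of 風祭朝陽 / 熊沢世莉奈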

熊沢世莉奈@kuma_Seri

2026.06.06 02:47

#シャインポスト# Orchestra LIVE🎻🎼 To everyone who is coming, let's have fun together! Same-day tickets are also available! I'm all set❤️🖤

KONAMI コナミ公式@KONAMI573ch

2026.06.05 00:04

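Good morning! It's Friday, June 5. The "シャインポスト Orchestra LIVE" will be held on Saturday, June 6, in the Large Hall of Ota Kumin Hall Aprico (大田区民ホール・アプリコ) in Kamata, Tokyo. You rarely get the chance to hear the cast's singing with orchestral arrangements. As of now, tickets appear to be available for both the afternoon and evening performances. Check them out on this site ➡ Have a great day today!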

Amazon Web Services@awscloud

2026.06.04 18:50

1/4 OpenAI. Anthropic. Uber. Lyft. They're all running their AI & ML pipelines on the same orchestration tool: Apache Airflow.

NVIDIA AI Infrastructure@NVIDIAAIInfra

2026.06.04 18:07

"Agentic AI changes the role of the CPU. The CPU is now the conductor and the GPU is the orchestra". 🎼 NVIDIA Vera is the first CPU built for AI agents — purpose-built from the ground up for how AI works today. Faster. More efficient. Ready for what's next. 🔲 #NVIDIAGTC#

Milk Road AI@MilkRoadAI

2026.06.04 15:20

The narrative that AI will wipe out enterprise SaaS overnight is one of the most misunderstood ideas circulating in markets right now, and the evidence does not support it (Save this). @DavidSacks made this case directly and the logic is worth working through carefully.

Salesforce is a system of record debugged by millions of customer support tickets over twenty-five years, stress-tested across thousands of enterprise deployments, and deeply embedded into revenue operations at the largest companies on earth. The idea that a CFO will replace that with probabilistically generated code from an AI assistant, without compliance guarantees, integration depth, audit trails, and enterprise support infrastructure, is not how these decisions actually get made.

The market has been pricing in the existential version of this risk anyway, and the results have been extreme. Over $1 trillion in SaaS market cap was erased in the first week of February 2026 alone. Global SaaS spending is still projected to grow from $318 billion in 2025 to $512 billion in 2028, which is not the trajectory of a category being killed. The operating reality is entirely disconnected from the stock price narrative. ServiceNow beat earnings for nine consecutive quarters and its stock crashed 11% on the same day. Salesforce raised its full-year forecast to $41.5 billion on record results and the stock still fell.

Sacks makes an important distinction between survivability risk and value capture risk. The survivability risk, enterprises ripping out Salesforce for AI-generated software, is largely overstated. The SaaS products genuinely at risk are narrow ones charging high prices for underused features with no proprietary data and low switching costs.

The value capture risk is real, and it is the more sophisticated threat. AI orchestration layers like Claude CoWork are being designed to sit above all of these tools, pulling data from Salesforce, ServiceNow, and Snowflake simultaneously and owning the user's primary workspace in the process. If enterprise users move from living inside Salesforce to living inside an AI agent that calls into those systems on their behalf, the SaaS platforms do not disappear but rather become infrastructure. The expansion revenue, the premium pricing power, and the next decade of value creation all migrate to whoever owns that orchestration layer.
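
The growth rate implied by the spending projection quoted in the post above is easy to check. A minimal sketch, taking the post's $318 billion (2025) and $512 billion (2028) figures as given; the `cagr` helper is an illustrative name.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the post, in billions of dollars (not independently verified).
rate = cagr(318, 512, 2028 - 2025)
print(f"implied CAGR: {rate:.1%}")
```

On those figures the projection works out to roughly 17% growth a year.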

NVIDIA@nvidia

2026.06.03 16:29

Agents need more than a model. Jensen Huang breaks down the enterprise agent stack: models, orchestration, tools with skills, and a secure runtime to hold it all together. This is the NVIDIA toolkit for agents:

