Search posts and users related to Memory

2026.06.13 02:15

Mr. President, since the U.S. has imposed export controls on Fable 5 exports to Korea, Korea should impose memory export controls on the U.S. @Jaemyung_Lee

0

61

578

43

Swift Language@SwiftLang

2026.06.12 20:20

Memory-safe. 13% faster on average. ⚡️ The TrueType hinting interpreter in macOS and iOS has been rewritten in Swift, replacing the original C implementation. Pixel-perfect accuracy was validated across 27 million glyphs. And the results:

0

7

510

73

Mike@MikeLongTerm

2026.06.11 12:14

$AMD | The FOMO to buy @AMD Chips is NOW 🧵

Not Financial Advice! DYOR! Research Purpose Only!

The Inference Queen is the biggest winner in Agentic AI, where all other CPUs are struggling to compete with a 2-year-old EPYC Turin, and EPYC Venice is in the mass production phase. AMD stresses deployability today on standard x86 platforms (no proprietary architectures required), full software compatibility, and open standards. This positions Venice + Helios as a practical, high-density alternative to competing solutions while underscoring that agentic AI shifts the balance toward CPU-rich racks alongside GPUs and, most importantly, lowers the cost of tokens to accelerate adoption and innovation.

Context: @WSJ came out with an article yesterday saying that @OpenAI is considering drastically lowering token prices to win more customers from Anthropic. The narrative "they" are pushing to exacerbate the current AI selloff won't last long. It rests on a fundamental misunderstanding of what is going on, which I have already discussed for months and years. Followers and Subscribers have known for years that this day would come, when token cost would become the central discussion among enterprises, as there is no such thing as an unlimited budget or Tokenmaxxing when they use $NVDA chips or in-house Hyperscaler chips. I will link various threads if you are interested in understanding the full picture, from the supply chain to TSMC's recent rapid 2nm expansion to 12 fabs total by 2027/2028.

Hyperscalers and AI natives effectively have no choice but to buy more AMD systems for Agentic AI, given AMD's leadership in economical, power-aware, high-volume internal + agentic use. However, with Supply far behind Demand, a multi-vendor reality along with in-house chips drives faster industry progress, lower overall costs, and better sustainability.

NVIDIA’s Vera Rubin cannot compete with a 2-year-old EPYC Turin, but AMD under Dr. Lisa Su has engineered the lowest cost-per-million-tokens, highly competitive energy-efficient solutions, and superior CPU orchestration for agentic AI at scale with Helios. Dr. Su has championed this shift since at least 2023, foreseeing the rise of agentic workflows that demand far more orchestration, parallel agents, and balanced compute well before the industry fully embraced it. Her long-term vision of AI moving from simple prompts to always-on, multi-agent systems has driven AMD’s investments in high-core EPYC CPUs and integrated rack-scale solutions, perfectly positioning the company for today’s realities.

The OpenAI-AMD 1GW Helios deployment (starting H2 2026) represents a pivotal vertical integration move that directly supercharges the inference economics. This isn't incremental; it's a structural shift toward ownership of massive, optimized rack-scale capacity, enabling the lowest token costs and triggering the enterprise adoption flywheel. We need to be honest: $AMD is the only company that made a big bet on Inference since the day ChatGPT became a sensation, while $NVDA and others were betting big on Training. At the end of the day, the Token bill from @AnthropicAI has to obey economics. Meaning that as the bills rise, companies have to get more out of them to justify the cost. It cannot be an unlimited inference budget, and it has to show up in efficiency, profitability and operating leverage.

1. Tokenomics

After you understand this, you will understand why Citi said @AnthropicAI is likely to sign a deal with $AMD, along with Hyperscalers, AI Labs, and Sovereign AI like Softbank's 5GW in France and many other countries. However, OpenAI and $META now want faster deployment, and since they are AMD shareholders now, they have prioritized allocation. Anthropic and Hyperscalers just cannot compete when the Helios Rack lowers token cost to $0.0003–$0.0005 per million tokens at GW scale.
Cost to build a 1GW data center
1GW Helios Rack full build: estimated $30-$35B
1GW Rubin Rack full build: estimated $45-$55B

Inference (Cost per Million Tokens)
~$NVDA B200 / HGX: ~$0.02–$0.08 on optimized workloads (FP4/MXFP4, speculative decoding). Significant improvement over Hopper but still premium-priced. GB200 NVL72 rack-scale: $0.05–$0.25+
~$AMD Helios Racks: $0.0003-$0.0005 per M tokens, dramatically lower than NVIDIA equivalents in owned infra. MI355X node-level: up to 40% more tokens per dollar vs. competing solutions (B200), driven by higher memory capacity (up to 288GB+ HBM), strong bandwidth, and lower acquisition costs.

Training
~$NVDA Rubin Rack is estimated at $0.7-$1.2/M Tokens
~$AMD Helios Rack is estimated at $0.65-$1.0/M Tokens

Now, OpenAI, META and Hyperscalers can lower Inference cost even further with the $AMD EPYC Venice "dense rack" or Agentic AI Rack. AMD published a detailed technical blog emphasizing that the future of agentic AI (autonomous, multi-step AI systems requiring heavy orchestration, databases, caching, APIs, and control planes) demands massive CPU-dense rack-scale infrastructure, not just GPUs. The blog prominently positions AMD's upcoming 6th Gen EPYC "Venice" processors as the key enabler for next-generation dense racks, delivering leadership throughput under real-world power, cooling, and density constraints.
~EPYC Venice (Zen 6 architecture, up to 256 cores / 512 threads per socket) is projected to deliver exceptional rack-level performance. In AMD’s modeled 100 kW rack comparisons, Venice-powered systems are expected to achieve ~3.30x the throughput of NVIDIA’s Vera (88-core Olympus) baseline across a broad mix of agentic-supporting workloads.
~This builds on current-generation 5th Gen EPYC "Turin" (up to 192 cores), which already delivers ~2.37x rack throughput vs. Vera and ~1.6x vs. Intel’s Xeon 6980P (128 cores).
~Liquid-cooled Turin deployments already support >27,000 CPU cores per rack today. Venice is architected to push this beyond 36,000 cores in the same rack class, dramatically increasing concurrent agent capacity and overall infrastructure efficiency.

2. Ownership vs renting compute from Hyperscalers matters to OpenAI, and only owning $AMD chips can meaningfully lower token cost for enterprises.
~Eliminates cloud overhead: No provider margins, utilization buffers, or egress fees. Direct control over power contracts, cooling, scheduling, and orchestration at dedicated facilities.
~Helios optimizations at GW scale: Rack-level density (1.4+ exaFLOPS FP8 per rack), high HBM4 bandwidth, EPYC orchestration for agentic workloads, and superior TCO/TDP. AMD's long-standing focus on tokens per dollar/watt shines here, with 20-40%+ efficiency edges in inference-heavy scenarios.
~At 1GW+ optimized deployment, inference hits $0.0003–$0.0005 per million tokens (community/analyst models tied to Helios metrics). This is dramatically lower than typical rented/cloud equivalents, especially for high-volume output tokens in agentic flows.

High token bills today: enterprises running heavy agentic/coding/analysis workloads can face $50-100M+/month at current API rates (flagship models $5-30+/M output, scaled to massive volumes). After Helios compression, the same volume will drop to $10-15M/month (or better) via lower underlying costs passed through as pricing flexibility, volume tiers, caching, or batch discounts. ROI thresholds collapse. More companies greenlight pilots → production → massive scaling. Agentic AI (autonomous workflows) multiplies token demand exponentially, but affordability removes the friction. OpenAI gains flexibility: unlike more cloud-dependent rivals (Anthropic), it can lower effective pricing, offer aggressive enterprise bundles, or absorb volume without margin destruction, directly tackling "high token bill" complaints while maintaining profitability as usage explodes.

3. Agentic AI Models Shifted the CPU:GPU Ratio to 1:1 and toward 3-5:1 with Explosively Token-Hungry Workloads

Agentic AI (autonomous, multi-step agents with planning, tool use, iteration, and self-correction) is fundamentally more compute and token intensive than conversational or single-turn generative AI. With its orchestration, tool use, parallel agents, data movement, and enterprise integration, agentic AI has dramatically increased the importance of strong host CPUs alongside GPUs. This shifts the CPU-to-GPU ratio higher and makes balanced systems critical, moving toward 1:1 to 5:1 as enterprises test more than 5-10 agents.

AMD EPYC Venice excels:
~Leadership core density (up to 256 Zen 6 cores per socket) for running many agents in parallel, orchestration layers, and high-throughput control-plane tasks.
~Superior performance-per-core and power efficiency (up to 2.1x higher perf/core and 2.26x better SPECpower vs. NVIDIA Grace in benchmarks).
~Tight integration in Helios: One Venice CPU + multiple MI450 GPUs per node, enabling efficient data feeding to GPUs ("zero-copy"), parallel execution, and full rack utilization for complex agentic loops.

Hyperscalers (Meta, Microsoft, Amazon, Google, Softbank) and AI natives (OpenAI, Anthropic...) are adopting high-core EPYC at scale specifically for these agentic demands, as CPUs now handle a larger share of non-model work (orchestration, policy enforcement, tool calls). This complements AMD’s lower-cost GPUs for overall TCO wins.
~Agents often generate 10–100x+ more tokens per task due to iterative reasoning chains, multiple tool calls, verification loops, and long-context orchestration.
~Goldman Sachs forecasts token consumption multiplying 24x by 2030 (to 120 quadrillion tokens/month), largely driven by agentic adoption in consumer and enterprise.
~Enterprise data shows agent-pattern workloads growing at 680% annualized rates, projected to surpass conversational AI in token volume by Q3 2026.
~Daily enterprise agent token consumption is already in the billions, with complex workflows (coding, workflows, analysis) amplifying this dramatically.

4. Competitive Edge: Winning Customers from Anthropic

Anthropic’s Claude models (especially Opus/Sonnet) excel in complex reasoning and agentic coding, commanding premium positioning. However, their higher underlying costs (heavier reliance on third-party cloud with margins) limit pricing flexibility compared to OpenAI’s owned Helios capacity. Anthropic is on track to generate $10.9 billion in Q2 revenue. The company expects to achieve its first-ever quarterly adjusted operating profit of $559 million. However, sustaining full-year profitability remains challenging due to immense computing and model training costs. The truth is, Anthropic has no choice but to buy as many $AMD chips as possible if they want to compete with OpenAI or get investors' attention. This 5% adjusted operating profit to revenue ratio is just pathetic.

Current pricing dynamics (2026): OpenAI already undercuts on many tiers (flagship output tokens significantly cheaper than equivalent Claude Opus). Nano/mini models offer 5–10x advantages for volume work. Anthropic holds edges in long-context flat pricing and certain reasoning quality.

OpenAI after Helios Rack ownership: at $0.0003–$0.0005/M effective costs, OpenAI gains massive headroom to:
~Aggressively discount high-volume agentic tiers or bundles.
~Offer “unlimited” enterprise plans or usage-based models that Anthropic struggles to match without margin erosion.
~Target cost-sensitive, high-throughput agent deployments (dev tools, automation platforms) where token bills explode.

Enterprises facing millions of dollars in monthly agentic bills will migrate to the provider delivering better economics at scale. OpenAI’s combination of strong models (o-series reasoning) + lowest TCO positions it to erode Anthropic’s enterprise share, especially as agentic becomes the dominant token consumer. Cheaper tokens expand the total addressable market dramatically. This feeds the data/model improvement loop, justifying further capex. AMD benefits from proven scale pulling in more customers (Meta, Oracle, Microsoft, Amazon, Softbank, TensorWave, LumaAI ... already aligned on Helios).

Conclusion:

Dr. Lisa Su has been laser-focused on inference economics since at least 2022–2023, repeatedly emphasizing that the real battleground for AI scalability would be TCO, power efficiency (TDP), and ultimately tokens per dollar and per watt, not just raw training FLOPS. While many viewed inference as a secondary, commoditized workload, Dr. Su architected AMD’s roadmap around rack-scale systems optimized for high-volume, sustained inference that would dominate as models matured and usage exploded. Helios represents the culmination of that multi-year bet: a fully integrated, open platform designed precisely for the economics of massive token throughput.

This deep, strategic partnership with OpenAI, starting with the 1GW Helios deployment in H2 2026 and scaling to 6GW, is the embodiment of that shared vision. Both companies foresaw a future where agentic AI models evolve to become extraordinarily token-hungry: autonomous agents executing complex, iterative workflows with planning, tool use, verification loops, and long-context reasoning. These workloads can consume 100x+ more tokens per task than traditional chat or single-turn generation, driving exponential demand as capabilities improve and enterprises deploy them at scale.

By owning and optimizing this massive Helios capacity at GW scale, OpenAI achieves inference costs as low as $0.0003–$0.0005 per million tokens. This structural cost advantage allows OpenAI to absorb the coming token explosion profitably, dramatically lower effective pricing for enterprises, and win high-volume agentic workloads from higher-cost competitors like Anthropic. What was once a prohibitive monthly token bill becomes an affordable accelerator for productivity and innovation.

The OpenAI-AMD alliance validates Dr. Su’s prescient strategy and turns the Agentic flywheel into reality: collapsing inference costs → explosive token consumption → richer data and better models → accelerating demand. This partnership doesn’t just address today’s economics; it positions both leaders at the center of the infrastructure buildout that will power AI’s next decade. By delivering the lowest inference economics at scale, OpenAI not only solves enterprise bill pain but gains a decisive weapon to win share from higher-cost rivals like Anthropic. And that is why @OpenAI and $META will deploy EPYC Dense Rack.

Not Financial Advice! DYOR! Research Purpose Only!
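As a rough check on two figures quoted in the post above, the short Python sketch below recomputes the adjusted operating margin and the implied monthly-bill reduction. The inputs are the post's own numbers, not verified data, and the function names are hypothetical helpers introduced only for illustration.

```python
# Sanity-check two figures quoted in the post. All inputs are the post's own
# claims, not independently verified data.

def operating_margin(profit_usd: float, revenue_usd: float) -> float:
    """Return operating profit as a fraction of revenue."""
    return profit_usd / revenue_usd


def fractional_reduction(before: float, after: float) -> float:
    """Return the fraction by which `after` is lower than `before`."""
    return 1.0 - after / before


# Anthropic Q2 figures as quoted: $10.9B revenue, $559M adjusted operating profit.
margin = operating_margin(559e6, 10.9e9)

# Monthly token bills as quoted, in USD millions: $50-100M today, $10-15M after.
bill_before = (50.0, 100.0)
bill_after = (10.0, 15.0)
smallest_cut = fractional_reduction(bill_before[0], bill_after[1])  # 50 -> 15
largest_cut = fractional_reduction(bill_before[1], bill_after[0])   # 100 -> 10

print(f"adjusted operating margin: {margin:.1%}")
print(f"implied bill reduction: {smallest_cut:.0%} to {largest_cut:.0%}")
```

On the quoted inputs the margin works out to roughly 5%, matching the ratio the post cites, and the claimed drop in monthly bills corresponds to a cut of about 70% to 90%.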

0

1

39

9

X Freeze@XFreeze

2026.06.11 11:19

Grok now lets you view and manage your Memory. This is a powerful feature: Grok can remember useful details from your previous conversations to provide better context and more personalized assistance over time.

And the best part:
• View what Grok remembers
• Edit memories anytime
• Remove memories you don’t want stored
• Improve context for future conversations

The result is a much more personalized AI experience that gets better the more you use it. Instead of starting from scratch every conversation, Grok can build a deeper understanding of your goals, interests, projects, and preferences and help you better.

0

342

1.5K

185

Nancy Iskander@isknan

2026.06.10 20:16

THE HAUNTING NIGHTMARE THAT TORMENTS ME STILL! The most painful memory from the night my boys were murdered is the cold stare of Rebecca Grossman right outside the ER. I stepped out to see my friend Maggy, who’d been denied entry because of COVID, and there she was. The woman who had just killed my sons. She KNEW exactly what she’d done, since they told her at the time of her arrest. She KNEW who I was. I saw it. A female officer quickly spun her around so she couldn’t keep staring at me. I went back inside and watched my 8-year-old Jacob take his last breaths. That moment HAUNTS me daily. But Rebecca doesn’t even remember any of it. How can that be?? Someone tell me! She clearly remembered which doctors she knew at Los Robles to call on them to get her out. She and her husband even accuse me of making it up on this prison call. Take a listen.

0

6

26

5

Polo1.4 贱🕊️@xiaojianjian567

2026.06.10 03:16

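POLO shows you how to keep Hermes running long-term: install Hermes Agent on VoyraCloud and turn a single VPS into an AI workstation that stays online 24/7 ⭐

You spend half a day setting up Claude Code, and two hours into a run the computer goes to sleep and the session drops. Running Hermes Agent locally has the same problem: shut down the computer and every task dies, an occasional IP fluctuation crashes the automation, and one restart wipes all progress. Many people doing AI development have hit this: they want an AI workstation that really runs tasks 24/7, and a local machine can't keep up. An ordinary datacenter VPS is another route, but its IP range belongs to a data center, which platforms such as Claude, Google, and PayPal recognize at a glance, and in some scenarios you get login risk checks or abnormal-environment warnings. A residential-IP VPS gets its IP from address ranges that ISPs assign to real household users, so the environment is closer to a real user's network and platforms accept it more readily. That suits scenarios that need a stable environment over the long run, such as AI agents, cross-border e-commerce, and overseas social media 🚀

VoyraCloud has released an official Hermes image: one-click deployment on a residential-IP VPS, with skills, memories, and sessions all preserved. Restarts don't lose progress, and upgrades don't lose data.

Up and running in 30 minutes

Step 1: choose a VPS. Open the Residential IP VPS page. The recommended node is Washington, which is the most stable for Claude Code and Hermes Agent. For overseas social media operations, pick the node in your target region. The Hong Kong and Tokyo nodes have low latency and suit Asia-Pacific business. London and Frankfurt suit the European market.

Step 2: choose a plan. For personal testing, pick the Lite plan (1c1g) at $9/month. For running Hermes Agent and MCP long-term, pick the Standard plan (2c4g) at $15/month. If you need a Windows desktop, pick the Windows VPS (2c4g) at $12/month. Mid-Year Sale: up to 30% OFF across the board, valid for new purchases and renewals, until June 30 💻

Step 3: deploy Hermes. In the VoyraCloud console, select the Hermes Image; once the VPS boots, the preinstalled environment is fully configured. SSH in, run hermes setup to configure the LLM provider, then run hermes to start the TUI. This step takes 5 minutes, less than a manual install, and Hermes then breaks tasks into steps, calls tools, and returns results on its own.

How the Hermes Image differs from a manual install

Persistent skills and memory. While running tasks, Hermes automatically generates skills, accumulates memory, and records session history. The VoyraCloud image stores these in Docker volumes, so they survive VPS restarts, upgrades, and migrations. A manual install uses a local venv plus sqlite, which gets wiped on restart and occasionally crashes on version upgrades.

The image comes with the Telegram, Discord, Slack, WhatsApp, Signal, and Email channels preinstalled; run hermes gateway setup to connect them. A manual install means wrestling with OAuth channel by channel. The image installs the OpenRouter client by default, so you can switch directly between OpenAI, Claude, DeepSeek, Kimi, GLM, and MiniMax. Just run hermes model to pick one, which saves time over configuring each provider by hand. The self-evolving learning loop works out of the box. All 12 of Hermes's core features are preinstalled, including sub-agents running tasks in parallel, browser automation, built-in cron scheduling, canvas interaction, and plugin extensions. The VoyraCloud image puts these capabilities on a stable residential IP: the VPS stays online 24 hours a day, and you can watch tasks finish and check results remotely over Telegram.

Who is VoyraCloud best suited for?

People running long-term AI automation tasks: RSS summaries, Telegram bots, OpenClaw intelligence systems, scheduled research reports. Hermes has built-in cron scheduling and the VPS is online 24/7. On a local computer, a shutdown or network drop wastes all of that work; on a VPS, one command gets it running.

People maintaining overseas accounts for cross-border operations: a residential IP is less likely to trigger risk controls than a datacenter IP, and platforms such as the Claude API, Google, PayPal, TikTok, and YouTube generally accept residential IPs more readily. A fixed IP region plus a dedicated server environment is far more reliable than a proxy pool.

The Mid-Year Sale is on now, with up to 30% OFF across the board. It applies to new purchases and renewals until June 30. The discount applies on signup, with no promo code needed. New users get an extra discount on their first order. Link here.

POLO's plan recommendations: the Lite plan at $9/month suits personal testing. The Standard plan at $15/month suits running Hermes Agent long-term. The Windows desktop plan at $12/month suits people who need GUI operations. Open one during the sale for long-term use; the 30% discount saves enough to cover several months. The official account @VoyraCloud has more promotions and tutorials 🤖 #VoyraCloud# #HermesAgent# #住宅IP# #ClaudeCode# #AIAgent#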

0

76

47

0

Milk Road AI@MilkRoadAI

2026.06.07 15:28

Micron will be a $3,000 stock within a few years and Jensen Huang just spent a week in Korea telling the world exactly why (Save this).

Jensen announced four new products at the Korea event and every single one of them has memory at the center of its architecture. Vera Rubin, the next generation AI supercomputer, needs massive quantities of HBM. The new Vera CPU needs large amounts of LPDDR5. RTX Spark, the first major PC reinvention in 40 years according to Jensen, needs a lot of LPDDR5. And Nvidia's new robotics and autonomous driving platforms are being built in deep partnership with the Korean memory and electronics ecosystem. Every single growth vector for Nvidia in 2026 and 2027 runs directly through memory, and Micron is the only US-based company that manufactures all of it.

Here is what the numbers look like right now. Fiscal Q2 2026 revenue came in at $23.86 billion, up 196% year over year, with 75% gross margins and $6.9 billion in free cash flow, a quarterly record. Management guided Q3 revenue to $33.5 billion at roughly 81% gross margins, with EPS of $19.15. These are not the numbers of a cyclical memory company but rather the numbers of a company that has been structurally repriced by the largest demand supercycle in the history of the semiconductor industry.

The reason the bull case reaches $3,000 comes down to three things that have never been true at the same time in Micron's history.

First, the entire 2026 HBM supply is already sold out under multi-year contracts. CEO Sanjay Mehrotra told analysts that Micron can currently only fulfill 50% to two thirds of key customers' HBM demand at any price.

Second, Micron has begun volume shipment of HBM4 12-Hi specifically for Nvidia's Vera Rubin platform, the exact product Jensen was talking about in Korea, and has signed its first five-year strategic customer agreement, converting what was historically a quarterly negotiation business into something closer to a long-term recurring revenue model.

Third, Wolfe Research's bull case model points to $160 billion in calendar year 2027 revenue and $80 in EPS. At even a 20x earnings multiple, modest for a company with this growth profile, that is a $1,600 stock. UBS has already tripled its price target to $1,625.

The path to $3,000 requires HBM4 to ramp smoothly, supply constraints to persist into 2027 as Mehrotra says they will, and hyperscaler AI capex to continue growing at its current trajectory, all three of which Jensen Huang just confirmed in Seoul. The HBM total addressable market alone is projected to reach $100 billion by 2028, a forecast Micron itself already pulled forward two years ahead of schedule because demand arrived faster than anyone modeled.

Micron trades at roughly 9x forward earnings today. That is cheaper than a grocery chain, for a company growing revenue at 196% year over year, with its entire production sold out, supplying the infrastructure for the most important technology buildout in history.

Come join Milk Road Pro for our full breakdown of the Micron bull case: how we think about the HBM4 transition timeline, what multi-year customer contracts mean for Micron's valuation multiple expansion, and our entire AI thesis. Link below!
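The valuation arithmetic in the post above can be written out in a few lines of Python. This is only a sketch of the post's own multiple-times-EPS reasoning using the figures it quotes; the variable names are hypothetical and nothing here is a forecast.

```python
# Price = earnings multiple x EPS, using the figures quoted in the post.
bull_case_eps = 80.0      # Wolfe Research bull-case EPS, as quoted
quoted_multiple = 20.0    # the "even a 20x" multiple used in the post
headline_target = 3000.0  # the post's headline price

# Price implied by the quoted multiple and EPS.
implied_price = quoted_multiple * bull_case_eps

# What the headline target would require if only one input changed.
required_multiple = headline_target / bull_case_eps  # at $80 EPS
required_eps = headline_target / quoted_multiple     # at a 20x multiple

print(f"implied price at 20x: ${implied_price:,.0f}")
print(f"multiple needed for $3,000 at $80 EPS: {required_multiple}x")
print(f"EPS needed for $3,000 at 20x: ${required_eps:,.0f}")
```

On these inputs the 20x case reproduces the post's $1,600 figure, and the $3,000 headline would need either a 37.5x multiple on $80 of EPS or $150 of EPS at 20x.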

0

24

925

136

芥川Aku@fujita_aku

2026.06.07 13:20

Every memory tells a story Every dream hides a secret✨ #HonkaiStarRail# #BlackSwan# #崩壊スターレイル# #ブラックスワン#

0

11

2.2K

145

HankAI@hank_aibtc

2026.06.07 03:42

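Stunning! A developer used Mac Minis + EXO (an open-source AI cluster framework) to turn AI into a real business of his own. How does he make real money with it? He just stacked 4 Mac Minis on his desk, and now steadily earns $14,000 a month. How did he pull it off?
- He uses an open-source framework called EXO to link the 4 Mac Minis into a local AI cluster.
- The cost comparison is striking: cloud GPUs used to burn $1,900 a month, while the hardware is a one-time $2,400 and electricity afterward is only about $12 a month!
- A single Mac Mini does not have enough memory to run 70B+ (70 billion parameters and up) models. EXO pools the Unified Memory (Apple's unified memory) of multiple machines into one large pool and automatically shards the model to run distributed, effectively turning the machines into one large AI server.
- He started with 1 machine, added another when the first customer came in, and gradually built up to 4. What customers like: the data never leaves the room, security is maximized, and they are willing to pay a premium for private, unlimited use.

AI is no longer just a game of burning money on cloud rentals and subscription fees; it is now an infrastructure business you can build on your own desk! This is the real way to start an AI business in 2026: low cost, controllable, a privacy moat, and earnings that keep snowballing. Save this post; anyone who wants to get into local AI, privacy solutions, or a side business will find it useful later.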
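The cost comparison in the post above reduces to a simple break-even calculation. The Python sketch below uses only the three figures the post quotes ($1,900/month for cloud GPUs, $2,400 one-time hardware, about $12/month electricity); the function names are hypothetical, and the sketch ignores anything the post does not mention, such as maintenance or hardware added later.

```python
import math

# Figures as quoted in the post (USD); not independently verified.
CLOUD_MONTHLY = 1900.0
HARDWARE_ONE_TIME = 2400.0
POWER_MONTHLY = 12.0


def cumulative_cloud_cost(months: int) -> float:
    """Total spend after `months` of renting cloud GPUs."""
    return CLOUD_MONTHLY * months


def cumulative_local_cost(months: int) -> float:
    """Total spend after `months` of running the local cluster."""
    return HARDWARE_ONE_TIME + POWER_MONTHLY * months


# First whole month at which the local cluster is no more expensive than cloud.
monthly_saving = CLOUD_MONTHLY - POWER_MONTHLY
breakeven_months = math.ceil(HARDWARE_ONE_TIME / monthly_saving)

# Difference in total spend over the first year.
first_year_saving = cumulative_cloud_cost(12) - cumulative_local_cost(12)

print(f"break-even after {breakeven_months} months")
print(f"first-year saving: ${first_year_saving:,.0f}")
```

Under these assumptions the hardware pays for itself within the second month, which is the substance of the post's cost claim.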

0

36

123

27

yhking Gz@YhkingGz

2026.06.05 09:32

The dialect film "Love Letter to Grandma", which is based on the culture of "Qiao Pi", a Memory of the World documentary heritage, will be released in Hong Kong, Macao, Singapore, Malaysia, and Brunei.

0

1

0

Search results related to "Memory"