Sign up and share your invite link to earn video-playback and referral rewards.

Search results for "Loopo"

Loopo forums
Each keyword is its own forum; forum paths are unique site-wide.
Create a forum
Users
No results found
Content containing "Loopo"
Most agents look smart until reality checks them. Personality is a bubble, but memory is a loop. EVOEVO IS NOW LIVE to stop agents from just talking and start absorbing outcomes. Prompts are cheap, but a verified reasoning history is the only thing that scales. Would you trust a logic chain that has never been wrong?
Wrote a skill that runs codex /review in a loop until there's no booboos anymore. Caveat: It won't fix system architecture for ya, so you still need BRAIN as master model.
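The review loop described above might be sketched like this (a minimal illustration, not the author's actual skill; the function names and the exact `codex /review` invocation and output format are assumptions):

```python
import subprocess

def review_until_clean(run_review, max_passes=10):
    """Re-run a review pass until it reports no findings.

    `run_review` is any callable returning a list of findings
    (an empty list means the review came back clean).
    Returns the number of passes it took.
    """
    for n in range(1, max_passes + 1):
        if not run_review():
            return n
    raise RuntimeError(f"review still failing after {max_passes} passes")

def codex_review():
    # Hypothetical shell-out mirroring the tweet's "codex /review";
    # the real CLI's arguments and output format may differ.
    out = subprocess.run(["codex", "/review"], capture_output=True, text=True)
    return [line for line in out.stdout.splitlines() if line.strip()]
```

Injecting the runner as a callable keeps the loop testable without the real CLI, and matches the tweet's caveat: the loop only converges on issues the reviewer can flag, not on architecture-level problems.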
@sama The price/intelligence tradeoff shifts significantly when evaluating agentic workflows requiring high-reasoning loops versus simple single-turn chat.
Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining. Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE. The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
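The bag-averaging input and bag-prediction output described above can be sketched in a few lines (a pure-Python toy, not the authors' implementation; the function names, the plain mean over bag embeddings, and averaging the per-token log-losses are assumptions about the "modified cross-entropy"):

```python
import math

def bag_input_embedding(emb_table, bag_ids):
    """Input side: average the embedding vectors of a contiguous bag of tokens."""
    dim = len(emb_table[0])
    return [sum(emb_table[t][d] for t in bag_ids) / len(bag_ids)
            for d in range(dim)]

def bag_cross_entropy(logits, bag_ids):
    """Output side: cross-entropy spread over every token in the next bag,
    taken here as the average of the per-token negative log-likelihoods."""
    m = max(logits)  # stabilize the log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(logits[t] - log_z for t in bag_ids) / len(bag_ids)
```

After the first third of training this reduces to ordinary next-token prediction by shrinking every bag to a single token, which is consistent with the claim that the inference-time model is unchanged.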
We shipped a V1 Dune dashboard for the tokenized Collectible Cards market. Base, Polygon, Solana. First batch: @Beezie - claw-machine packs for physical collectibles @Courtyard_io - vault-backed tokenized cards @Collector_Crypt - graded card gacha and marketplace activity @upshot_cards - prediction cards wrapped into a collectible experience @phygitals - digital packs backed by vaulted cards Onchain cards are getting past the “interesting idea” stage. There’s already behavior showing up: people ripping packs, trading, redeeming physical cards, moving through vault-backed flows, playing prediction mechanics, coming back for another round. Still early, and the data will keep improving. But that’s kind of the point. The category is forming in public, and we can already start watching which loops actually pull people back in. The question now: which mechanics create repeat behavior?
Why did the very young legendary investor Leopold @leopoldasch dump $300M of NVDA and go all-in on power and mining farms? His core thesis: the real bottleneck for AGI isn't GPUs. Power, GPU compute, and BTC miners pivoting to AI hosting alone account for most of the portfolio. I break down his holdings using Jensen Huang's 5 Layer Cake theory 👇
5️⃣ Memory — SNDK position up 816%; HBM's share of AI spend going from 8% → 30%
4️⃣ Optics — LITE at 8.7%: "whichever GPU wins, they still need my optical modules"
3️⃣ GPU cloud — CRWV call options levered up 672%, the most aggressive position
2️⃣ Miners-to-AI — 8 names at 22.7%; with power and land in place, just swap in GPUs
1️⃣ Power — BE is the largest holding at 16.5%; fuel cells can bypass the grid and generate on-site
- The hottest theme: memory — $SNDK $INTC $MU
The "working memory" of AI models: every inference needs high-speed reads and writes to DRAM and NAND. HBM shortages have become the limiting factor on GPU shipments, and memory's share of hyperscaler AI spend has surged from 8% in 2023 to 30% in 2026.
💡 Leopold increased his SNDK position by 816% in Q4, while using INTC call options to bet that Intel's advanced packaging can win HBM packaging orders.
- Semiconductors and optics — $LITE $COHR $TSEM $INFY
The "nervous system" connecting GPU clusters: optical transceivers determine the bandwidth between GPUs. 800G/1.6T optical modules are in short supply, with lead times over 40 weeks.
💡 Lumentum is the largest single-stock holding (8.7%), a bet that "whichever GPU wins, they still need their optical modules."
- GPU cloud compute — $CRWV $APLD $WYFI
Hyperscalers can't build fast enough, so neoclouds (independent GPU clouds) fill the gap. CoreWeave went from a crypto company to NVIDIA's largest independent customer, with billions of dollars in signed contracts.
💡 CoreWeave is 17.6% of the portfolio (including call options), the second-largest exposure. Leopold used call options to add 672% leverage, his most aggressive position. Neoclouds are the AWS of the AI era.
- Bitcoin miners pivoting to AI infrastructure — $CORZ $IREN $CIFR $RIOT $BTDR $HUT $CLSK $BITF
Mining farms are undervalued "AI real estate." Bitcoin mines already have power contracts, cooling systems, and land, so the marginal cost of pivoting to AI hosting is minimal.
💡 The 8 miner names make up 22.7% of the portfolio, the largest single theme. The farms already have power, land, and cooling, and only need GPUs swapped in to become AI data centers. Core Scientific signing a $10B+ hosting contract with CoreWeave is the best proof.
- Power — $BE $EQT $SEI $BW $PUMP $LBRT $PSIX
💡 Bloom Energy is the largest holding in the whole portfolio (16.5%). When the grid becomes the bottleneck, its solid-oxide fuel cells can bypass it and generate power right next to the data center; EQT is the natural-gas supplier.
Track his portfolio in real time 👇
In May 2026, 27-year-old hedge fund manager Leopold Aschenbrenner once again became the market's focus. His fund, Situational Awareness LP, grew from $383 million to $5.517 billion in 12 months, a more than 14-fold increase. His holdings include no Nvidia, no Microsoft, no OpenAI; instead, fuel-cell companies, Bitcoin miners, legacy chipmakers, and optical-component suppliers — names outside most investors' field of view. He has been called the new-generation stock god of US equities. These are his five core investing principles:
1. Eagle's vision: long-termism that looks past short-term volatility
2. The contrarian's solitude: independent thinking amid consensus
3. Deep-dive knowledge: information is the source of excess returns
4. Asymmetric leverage: seek opportunities with limited risk and unbounded upside
5. A still mind: be the master of greed and fear
From Infrastructure to Interface: Closing the Loop. In response to community demand, we have officially synchronized our Web Chat with our API ecosystem. The four frontier models (GPT-5.5-Instant, DeepSeek-V3.2, MiniMax-M2.7, and GLM-5.1), previously available API-first, are now fully accessible to all web users. We have bridged the gap between developer-grade integration and consumer-facing interaction. Whether via API or Web, you can now experience the same production-grade reliability and reasoning consistency. Compute without limits. Innovate without boundaries. Start now:
Announcing the Artificial Analysis Coding Agent Index! Our new coding agent benchmarks measure how combinations of agent harnesses and models perform across 3 leading benchmarks, plus token usage, cost and more. When developers use AI to code they’re choosing a model, but also pairing it with a specific harness. It makes sense to benchmark that combination to understand and compare performance. The Artificial Analysis Coding Agent Index includes 3 leading benchmarks that represent a broad spectrum of coding agent use: ➤ SWE-Bench-Pro-Hard-AA, 150 realistic coding tasks that frontier models struggle with, sampled from Scale AI’s SWE-Bench Pro ➤ Terminal-Bench v2, 84 agentic terminal tasks from the Laude Institute that range from system administration and cryptography to machine learning. 5 tasks were filtered due to environment incompatibility ➤ SWE-Atlas-QnA, 124 technical questions developed by Scale AI about how code behaves, root causes of issues, and more, requiring agents to explore codebases and give text answers Analysis of results: ➤ Opus 4.7 and GPT-5.5 lead the Index: Opus 4.7 in Cursor CLI scores 61, followed closely by GPT-5.5 in Codex and Opus 4.7 in Claude Code at 60. GPT-5.5 in Cursor CLI follows at 58. ➤ Open weights models are competitive, but still trail the leaders: GLM-5.1 in Claude Code is the top open-weight result at 53, followed by Kimi K2.6 and DeepSeek V4 Pro in Claude Code at 50. These are strong results, but still meaningfully behind the top proprietary models. ➤ Gemini 3.1 Pro in Gemini CLI underperforms: Gemini 3.1 Pro in Gemini CLI scores 43, well below where Gemini 3.1 Pro sits on our Intelligence Index, highlighting that Gemini’s performance in Gemini CLI remains a relative weak spot for Google’s offering. ➤ Cost per task (API token pricing) varies >30x: Composer 2 in Cursor CLI is cheapest at $0.07/task, followed by DeepSeek V4 Pro in Claude Code at $0.35/task and Kimi K2.6 in Claude Code at $0.76/task.
At the high end, GPT-5.5 in Codex costs $2.21/task, while GLM-5.1 in Claude Code costs $2.26/task. For both models this was driven by high token usage, and in GPT-5.5’s case by a relatively higher per-token cost. ➤ Token usage varies >3x: GLM-5.1 in Claude Code uses the most tokens at 4.8M/task, followed by Kimi K2.6 at 3.7M/task and DeepSeek V4 Pro at 3.5M/task. GPT-5.5 in Codex uses 2.8M tokens/task, substantially more than Opus 4.7 in Claude Code at 1.7M/task. In GLM-5.1’s case, higher token usage, cost and execution time were partly driven by the model entering loops on some tasks. ➤ Cache hit rates remain high but vary materially: Cache hit rates range from 80% to 96% across combinations. Provider routing, harness prompt structure and cache behavior can materially change the economics of running the same model, given cached inputs are typically <50% the API price of regular input tokens. ➤ Time per task varies >7x: Opus 4.7 in Claude Code is fastest at ~6 minutes/task, while Kimi K2.6 in Claude Code is slowest at ~40 minutes/task. This is driven by differences in average turns per task, token usage and API serving speed. Opus 4.7 needed materially fewer turns per task than any other model, while Kimi K2.6 needed the most. ➤ Cursor made real progress with Composer 2: Composer 2 in Cursor CLI scores 48, near the leading open-weight model results, while being the cheapest combination measured at $0.07/task. Cursor has stated Composer 2 is built from Kimi K2.5, showcasing that they have made substantial post-training gains. This is just the start. We are planning to add additional agents (both harnesses and models). Let us know what you would like to see added next.
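The cache-economics point above can be illustrated with a toy blended-price calculation (a hypothetical helper, not Artificial Analysis' methodology; the 50% cached-token discount is a round assumption based on the post's "<50% the API price" figure):

```python
def cost_per_task(tokens_m, price_per_m, cache_hit_rate, cached_discount=0.5):
    """Approximate input cost for one task.

    tokens_m        -- input tokens used per task, in millions
    price_per_m     -- regular API price per million input tokens
    cache_hit_rate  -- fraction of input tokens served from cache
    cached_discount -- cached-token price as a fraction of the regular price
    """
    # Blend the cached and uncached prices by the hit rate.
    blended = price_per_m * (cache_hit_rate * cached_discount
                             + (1 - cache_hit_rate))
    return tokens_m * blended
```

At a 96% hit rate the blended input price is barely half the list price, while at 80% it is 60% of list, which is why the post calls out hit-rate differences as materially changing the economics of the same model.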