Search results for "train"

Content containing "train"
In China, a trainer's forbidden affair ended in his death. Fitness coach Huang Mao was hanging off the balcony after the husband of the woman he was seeing came home early from work; he fell from the building and died at the scene.
SpaceX has almost finished writing V1.0 of an in-house AI training stack in C that exact-maps to 220k GB300s with 800G NICs, making heavy use of pipeline parallelism and getting as close to bare metal as possible. The potential speed improvement vs JAX for large training runs is over an order of magnitude.
been thinking about this lately: every app in crypto wants more of your time. more clicks. more sessions. more screen time. @sleepagotchi is the only one that literally wins when you put your phone down. like the whole game is just: go to sleep, we'll handle the rest. idk why that feels so radical but it does. maybe because we've been trained to think grinding = winning and rest = falling behind. but your dino doesn't care about your alpha. he just wants you to close your eyes before midnight for once in your life. trying to be that guy this week. we'll see how it goes 🦖 you actually sleeping on time or just saying you will? 👇
What if you trained an AI to recreate CryptoPunks? It would fail. That failure is the art. That's what @MichaelHirsch built with Slonks. Full interview in the next tweet 👇
Composer 2.5 being Pareto dominant in coding per CursorBench is important. This is after only a few weeks of supplemental training and/or RL in the Colossus 2 cluster. The 1.5 trillion parameter version of Grok will likely be a much better base model than Kimi. We shall see.
I created a training pipeline to remove propaganda and gaslighting from Chinese models! I'm thrilled to announce LazarusAI's ReAligned-Qwen3.5 series of models, finetuned to reduce Chinese ideological bias and censorship, refusal behavior, and state-narrative framing. I use an SFT + GRPO pipeline with a dataset crafted to target the taxonomy of Chinese censorship and bias, along with my ReAligned classifier model as a GRPO reward signal.
Hollywood is cooked. Even Oscar winner Natalie Portman is out here doing luxury train promos like a mid-tier influencer. My kids and their friends haven’t watched a movie or TV show in years. It’s all YouTube. The entire industry is running on fumes.
Here is how orbital compute ties the three segments into one unstoppable system. Space: Starship gives ultra-cheap, high-cadence launch capacity to deploy massive amounts of compute hardware into orbit. Connectivity: Starlink’s laser inter-satellite links turn thousands (eventually millions) of satellites into a distributed, low-latency orbital supercomputer network with fast Earth downlink. AI: the AI segment runs and monetizes the actual compute, training and inference, at unprecedented scale.
Behind the MiMo API Price Reduction:

The deepest price cut, up to 99%, is for Input (Cache Hit). The core reason is our inference framework now supports hierarchical KV cache optimization for SWA. Production inference engine tests show this optimization increases cached token capacity by 5x, equivalent to an 80% reduction in caching costs. Combined with Cache Read Overlap among multiple Full Attention modules in the Hybrid model, actual costs are further reduced.

Prices for Input (Cache Miss) and Output are also reduced by 60%-80%. This mainly benefits from the extreme 1:7 Full:SWA sparsity ratio brought by the model architecture (the prefill compute of the 70-layer MiMo-V2.5-Pro roughly equals a 10-layer GQA model). This kept our original inference costs well below the industry average, naturally leaving a 2x-3x profit margin in pricing. This price adjustment simply reflects our decision to pass these structural cost efficiencies directly to developers.

Operating at these newly reduced API prices, our production inference engine is running at near full capacity, and we can still essentially break even. We previously advised LLM companies not to "blindly cut prices" precisely because very few model architectures and inference optimizations can keep API costs from running at a loss. If more architectures that save compute and KV cache emerge, along with better inference Infra to drive down API costs, this will form an excellent virtuous cycle in the industry.

More crucially, affordable, high-performance model APIs will drive real, sustained, and at-scale inference demand. This upstream demand pulls forward the development of the entire AI infrastructure chain—including chips, servers, optical transceivers, PCBs, liquid cooling, power, energy storage, and data centers—serving as a strategic fulcrum for a systemic revaluation of AI hardware.

In the long run, this injects more affordable and accessible compute into both training and inference pipelines, accelerating the parallel evolution of global AGI across multiple regions and technical routes. For more technical details, we will release a detailed Blog post later.
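
As a rough check on two figures in the post above, here is a minimal sketch. It assumes per-token caching cost is inversely proportional to how many cached tokens the same hardware can hold; the post does not state its cost model, so that is an assumption, and the function names are hypothetical. It only reproduces the "5x capacity, 80% cost reduction" arithmetic and the share of Full Attention layers implied by a 1:7 Full:SWA ratio. It does not attempt to reproduce the "10-layer GQA" prefill equivalence, which depends on details the post leaves out.

```python
def caching_cost_reduction(capacity_multiplier: float) -> float:
    """Fractional drop in per-token caching cost.

    Assumption (not from the post): the same hardware budget is spread over
    `capacity_multiplier` times as many cached tokens, so cost per token
    scales as 1 / capacity_multiplier.
    """
    if capacity_multiplier <= 0:
        raise ValueError("capacity_multiplier must be positive")
    return 1.0 - 1.0 / capacity_multiplier


def full_attention_share(full_parts: int, swa_parts: int) -> float:
    """Share of layers that use Full Attention for a full:SWA layer ratio."""
    if full_parts < 0 or swa_parts < 0 or full_parts + swa_parts == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    return full_parts / (full_parts + swa_parts)


if __name__ == "__main__":
    # 5x cached-token capacity, as quoted in the post.
    print(f"caching cost reduction: {caching_cost_reduction(5):.0%}")
    # 1:7 Full:SWA ratio, as quoted in the post.
    print(f"full-attention share: {full_attention_share(1, 7):.1%}")
```

Under the stated assumption, a 5x capacity increase gives a 1 - 1/5 = 80% reduction, matching the post, and a 1:7 ratio means one layer in eight (12.5%) uses Full Attention.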