stevibe (@stevibe)

LLM. Local AI addict. Building @BenchLocalApp. Builds things nobody asked for. Benchmarks things for fun.
1.3K Following    21.9K Followers
BenchLocal v0.2.5 is out!
> The big one: repeated test runs with majority voting (1, 3, 5, 7, or 9 runs per test).
> Plus error classification, retry actions, per-scenario timings & more.
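A rough sketch of what majority voting over repeated runs could look like. This is not BenchLocal's code: the function name `majority_verdict` and the use of pass/fail booleans are assumptions made for illustration only.

```python
from collections import Counter


def majority_verdict(results):
    """Return (most common outcome, its vote count) for repeated runs of one test.

    `results` is a hypothetical list of per-run outcomes, e.g. True for pass.
    """
    if not results:
        raise ValueError("need at least one run")
    counts = Counter(results)
    verdict, votes = counts.most_common(1)[0]
    return verdict, votes


# Three hypothetical runs of a single test: two passes, one failure.
verdict, votes = majority_verdict([True, False, True])
print(verdict, votes)
```

With only two possible outcomes, an odd number of runs (1, 3, 5, 7, or 9) can never end in a tie, which is presumably why those run counts are offered.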
2.3x faster. Ran @UnslothAI Qwen3.6 MTP variants on a DGX Spark (UD-Q6_K_XL):
> 27B → 27B MTP: 8.1 → 18.65 t/s (2.3x faster)
> 35B A3B → 35B A3B MTP: 56.91 → 66.52 t/s (+17%)
The 27B dense model more than doubled throughput from MTP alone. Free speed is free speed.
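The speedup figures follow directly from the tokens-per-second numbers quoted in the post. A minimal check, where the helper name `speedup` is an assumption introduced here:

```python
def speedup(baseline_tps, new_tps):
    """Ratio of new throughput to baseline throughput (tokens per second)."""
    return new_tps / baseline_tps


# Throughput numbers as quoted in the post.
dense = speedup(8.1, 18.65)    # 27B -> 27B MTP
moe = speedup(56.91, 66.52)    # 35B A3B -> 35B A3B MTP

print(f"27B dense: {dense:.1f}x")
print(f"35B A3B:   {(moe - 1) * 100:+.0f}%")
```

Both results round to the values stated in the post: about 2.3x for the dense model and about +17% for the A3B model.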
We're early in the AI boom
Local AI is having its moment! Below is the number of new GGUF models created each month over the past 8 months & insights from our HF internal agent (May is partial):
- 176,000 total public GGUF models on HF
- Two distinct regimes: Oct–Feb averaged ~5.1K new GGUF models/month. Then March–April jumped to ~9.2K/month, nearly double the previous rate.
- March was the inflection point (+55% MoM), likely driven by a wave of new open-weight model releases being quantized to GGUF.
- April sustained the momentum at 9.7K, suggesting this isn't a one-off spike but a new baseline.
- The GGUF ecosystem is accelerating: the community is quantizing models faster than ever, likely thanks to better tooling (llama.cpp improvements, automated quantization pipelines, and more models supporting GGUF natively).
Let's go!