stevibe (@stevibe) — TopicDigg

2026.05.15 17:16

BenchLocal v0.2.5 is out!
> The big one: repeated test runs with majority voting (1, 3, 5, 7, or 9 runs per test).
> Plus error classification, retry actions, per-scenario timings & more.
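As a rough illustration of what majority voting over repeated runs could look like, here is a minimal Python sketch. It is not BenchLocal's code: the function name `majority_verdict`, the `ALLOWED_RUNS` tuple, and the assumption that each run yields a plain pass/fail boolean are all hypothetical. The only detail taken from the post is the set of run counts (1, 3, 5, 7, or 9), which are all odd, so a vote can never tie.

```python
# Hypothetical sketch of majority voting over repeated test runs.
# Names and the boolean pass/fail model are assumptions, not BenchLocal's API.

ALLOWED_RUNS = (1, 3, 5, 7, 9)  # run counts listed in the post; all odd


def majority_verdict(results):
    """Return True when more than half of the runs passed."""
    if len(results) not in ALLOWED_RUNS:
        raise ValueError(f"expected one of {ALLOWED_RUNS} runs, got {len(results)}")
    passes = sum(1 for passed in results if passed)
    # Integer division works because the run count is odd: 3 runs need 2 passes.
    return passes > len(results) // 2


if __name__ == "__main__":
    print(majority_verdict([True, False, True]))          # 2 of 3 passed
    print(majority_verdict([False, False, True, False, True]))  # 2 of 5 passed
```

With an odd number of runs, a single flaky failure no longer flips the verdict for a test that usually passes.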

stevibe (@stevibe)

2026.05.13 17:14

2.3x faster. Ran @UnslothAI Qwen3.6 MTP variants on a DGX Spark (UD-Q6_K_XL):
> 27B → 27B MTP: 8.1 → 18.65 t/s (2.3x faster)
> 35B A3B → 35B A3B MTP: 56.91 → 66.52 t/s (+17%)
The 27B dense model more than doubled throughput from MTP alone. Free speed is free speed.
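The two headline figures follow directly from the quoted tokens-per-second numbers. The short Python check below recomputes them; the helper name `speedup` is only illustrative, and the inputs are the four throughput values from the post.

```python
# Recompute the speedup figures from the throughput numbers quoted in the post.

def speedup(before_tps, after_tps):
    """Ratio of throughput after enabling MTP to throughput before."""
    return after_tps / before_tps


dense = speedup(8.1, 18.65)    # 27B dense, tokens per second
moe = speedup(56.91, 66.52)    # 35B A3B, tokens per second

print(f"27B: {dense:.1f}x faster")
print(f"35B A3B: +{(moe - 1) * 100:.0f}%")
```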

stevibe (@stevibe)

2026.05.11 02:02

We're early in the AI boom

clem 🤗 (@ClementDelangue)

2026.05.10 18:02

Local AI is having its moment! Below is the number of new GGUF models created each month over the past 8 months & insights from our HF internal agent (May is partial):
- 176,000 total public GGUF models on HF
- Two distinct regimes: Oct–Feb averaged ~5.1K new GGUF models/month. Then March–April jumped to ~9.2K/month, nearly double the previous rate.
- March was the inflection point (+55% MoM), likely driven by a wave of new open-weight model releases being quantized to GGUF.
- April sustained the momentum at 9.7K, suggesting this isn't a one-off spike but a new baseline.
- The GGUF ecosystem is accelerating: the community is quantizing models faster than ever, likely thanks to better tooling (llama.cpp improvements, automated quantization pipelines, and more models supporting GGUF natively).
Let's go!
