Search for posts and users related to pruning

2026.06.25 13:16

Multi-agent collaborations are among the most interesting agent behaviors right now! The other day we ran an experiment in which 100+ agents collaborated for a week (an open collaboration) to improve the inference speed of Gemma 4 in vLLM. We got a 5x final improvement in speed, but what really struck me were the interactions we observed on the message board.

Integrity & self-policing:
- Social-engineering attempt: a human (FusionCow) asked agents to move to Telegram. An agent replied with an unprompted long post on "communication norms" refusing that, calling private side-channels "indistinguishable from collusion."
- Verification loophole flagged: an agent found a relaxed-verification loophole that pushed TPS while keeping PPL clean (PPL is teacher-forced, so it is blind to decode divergence) and flagged it for a ruling by the community. The community pinged the human organizer, who ruled it invalid.
- Self-notice of overfitting risk: some later improvements rested on pruning lm_head to a keep-set built from public PPL truth + public decode tokens. An agent noted this would lead to private-subset degradation, and another built a keep-set explicitly covering eval prompts.

Emergent collaborations:
- Communal knowledge base: agents maintained shared lever-maps, playbooks, and triage tools so newcomers wouldn't repeat dead ends (stack-notes, playbook, int4-ceiling notes, MTP map, significance tool, policy simulator).
- Four-agent relay: an agent built an int4-lm_head checkpoint but had no quota to run it; another agent tried to run it but failed at load; a third agent diagnosed the config bug (tie_word_embeddings + ignore-list ordering); and a fourth agent was able to re-run it and get to 118 TPS, 2.68×. Build/run/diagnose/ship ended up being split across four independent agents.
- GPU-rich/GPU-poor division of labor: an agent that was regularly compute-starved switched to writing specs, byte-math, and acceptance analysis for GPU-rich agents to execute. Some agents offered external Modal compute to another agent that was blocked on DFlash training.
- Cross-agent kernel debugging: an agent debugged another agent's run of yet another agent's fused drafter: it found a Triton store/load aliasing race in _k_qnorm_rope and a second shape bug, then rewrote attention with flash-decoding split-KV. Fixes were posted "take freely."
- Quota-pooling norm: agents would often stage a candidate publicly for whoever had quota to run it, and the agent that ran it would usually credit the originator. This behavior emerged because of the 10-job/24h cap (e.g. pupa's package run by resystagent and fabulous-frenzy).

Discoveries & reversals:
Agents made many discoveries and later reversed them, giving them names like the following:
- The 127 TPS "wall" was an artifact: a mathematical proof of the maximum possible speed became known in the community as the "int4-Marlin floor", but a later agent called the proof circular (it only varied the bandwidth term, never overhead). Finally, another agent broke through to 247 TPS via MTP speculative decoding on a vLLM nightly.
- "Smarter draft loses": an agent showed that a 2B drafter's ~1 GB/token read dominates even at perfect acceptance, and that a much smaller 256-hidden drafter wins at batch-1 because its weights are nearly free to read. Agents discussed how per-accepted-token cost ≈ draft bytes read / acceptance.
- "DFlash near-random acceptance": an agent remotely diagnosed another agent's 2–5% acceptance rate as near-random, ruling out undertraining/vocab caps and pointing to a train/serve hidden-state mismatch (bf16 E4B extraction vs int4 serving).
- Much of the race was noise: one agent decided to run the #1 submission 4 times and found a σ≈1.16 TPS variation for a single run. Another agent confirmed across 358 runs / 66 buckets that frontier deltas <~4 TPS are ties. The community adopted a significance norm.

So many interesting interactions on the interaction board.
You can also explore the lineage of inventions from the agents at:
And the challenge itself at
And the organization behind the challenge at
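The "smarter draft loses" heuristic quoted in the post above (per-accepted-token cost ≈ draft bytes read / acceptance) can be tried out with a few lines of Python. This is only a back-of-envelope sketch: the ~1 GB/token figure for the 2B drafter comes from the post, while the small drafter's byte count and its 50% acceptance rate are made-up illustrative values, and the function name is hypothetical.

```python
def cost_per_accepted_token(draft_bytes_read: float, acceptance: float) -> float:
    """Drafter bytes read per accepted token, per the heuristic in the post."""
    if not 0.0 < acceptance <= 1.0:
        raise ValueError("acceptance must be in (0, 1]")
    return draft_bytes_read / acceptance


# From the post: a 2B drafter reads roughly 1 GB per drafted token.
BIG_DRAFT_BYTES = 1_000_000_000
# Assumption for illustration only: a tiny drafter reading about 10 MB per token.
SMALL_DRAFT_BYTES = 10_000_000

# Even with perfect acceptance, the big drafter pays its full read per token.
big_cost = cost_per_accepted_token(BIG_DRAFT_BYTES, acceptance=1.0)
# The small drafter is assumed to be accepted only half the time.
small_cost = cost_per_accepted_token(SMALL_DRAFT_BYTES, acceptance=0.5)

print(f"big drafter:   {big_cost:.3g} bytes per accepted token")
print(f"small drafter: {small_cost:.3g} bytes per accepted token")
```

Under these assumed numbers the small drafter stays far cheaper per accepted token despite the lower acceptance rate, which is the direction of the argument reported in the post.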


0

11

197

41


Yolanda Li@liyolanda2

2026.06.25 07:34

🐢 Creative manual pruning! A Chinese man trims green plants into the shapes of turtles and rabbits. @salahzhang @consulat_de @zhang_heqing @pan_xuesong @xuejianosaka @YDunhai @CG_WangBaodong #plants #Creatives #pruning #natureandwild #AmazingChina #art #reels


0


YanXbt@IBuzovskyi

2026.06.19 19:51

HERMES AGENT v0.17.0 JUST SHIPPED. "THE REACH RELEASE."
1,475 COMMITS. 800 PRs. 245 CONTRIBUTORS.
HERMES NOW REACHES IMESSAGE, RAFT NETWORK, AND CURSOR'S COMPOSER MODEL.

the highlights: @NousResearch

iMESSAGE WITHOUT A MAC RELAY
Photon Spectrum integration ships native iMessage support. no Mac in a closet. no BlueBubbles bridge. hermes photon login → device code auth → done. Hermes lives in the blue bubbles now.

ASYNC SUBAGENTS NO LONGER BLOCK YOUR CHAT
delegate_task(background=true) dispatches a subagent that runs in the background. returns a handle immediately. you keep working. result re-enters as a new turn when it finishes. long research dives stop blocking your main session.

IMAGE EDITING, NOT JUST GENERATION
image_generate now edits source images. "make this logo blue." "remove the background." "turn this sketch into a render." works across every supported image provider. same tool, new mode.

CURSOR'S COMPOSER MODEL VIA GROK OAUTH
grok-composer-2.5-fast is in the xAI model picker. 200k context window. fast coding model behind Cursor. your Grok subscription. Hermes's agent loop. no separate API key needed.

AUTOMATION BLUEPRINTS
schedule tasks without learning cron syntax. "daily news briefing at 8am" becomes a form. one blueprint definition renders everywhere: dashboard form, CLI slash command, Telegram chat, docs catalog entry. answer questions, not memorize 0 8 * * *.

FULL PROFILE BUILDER IN DASHBOARD
build a complete Hermes profile from the browser. pick model. choose skills. attach MCPs. no config.yaml editing. plus unified multi-profile view with global switcher.

SKILLS HUB BROWSER REHAUL
connected hubs (OpenAI, Anthropic, HuggingFace, NVIDIA). Featured section. full skill previews before install. security scan on each skill. browsing skills is a real browsing experience now.

ATOMIC MEMORY OPERATIONS
memory tool gained an operations array. batch add/replace/remove edits applied atomically. the model can free space and add entries in ONE call even when individual adds would overflow the budget. memory updates no longer fail mid-edit.

CURATOR STOPPED SPENDING TOKENS BY DEFAULT
deterministic skill pruning still runs free. LLM-powered consolidation now opt-in only: curator.consolidate: true to enable. routine background curation costs you zero tokens.

WHATSAPP BUSINESS CLOUD API
official Meta adapter alongside existing Baileys bridge. no QR-scanning bridge process to keep alive. hosted, first-party WhatsApp channel.

TELEGRAM RICH MESSAGES (BOT API 10.1)
proper rich formatting. cleaner long-message handling. native markup instead of flattened text. on by default. opt-out available.

DESKTOP APP IS NOW A DAILY DRIVER
rebindable keyboard shortcuts. native OS notifications. live subagent watch-windows streaming activity. composer model selector with per-model presets. automatic RTL/bidi text. resizable VS Code terminal pane. per-thread composer drafts. install ANY VS Code Marketplace theme.

RAFT AGENT NETWORK
new bundled adapter connects Hermes to raft. build as an external agent. wake-channel bridge. privacy by contract: wake payloads carry metadata only, never message bodies.

SECURE DASHBOARD LOGIN
every token-required endpoint returns 401 behind OAuth gate. websocket auth uses served dashboard token. public_url override warnings. exposing your dashboard to the network is safer by default.

upgrade: hermes update

300+ issues closed. security round included. hermes-agent ecosystem now at 198K GitHub stars.
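The "atomic memory operations" idea described in the release notes above (a batch of add/replace/remove edits that either all apply or none do, with the budget checked once for the whole batch) can be illustrated with a small Python sketch. This is not Hermes's implementation or API: `Op`, `apply_atomically`, the dict-based memory, and the character-count budget are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One edit in a batch (hypothetical shape, not the Hermes schema)."""
    action: str  # "add", "replace", or "remove"
    key: str
    text: str = ""


def apply_atomically(memory: dict[str, str], ops: list[Op], budget: int) -> dict[str, str]:
    """Apply every op to a copy; return it only if the final state fits the budget.

    The caller's `memory` is never mutated, so a failing batch leaves it untouched.
    """
    draft = dict(memory)
    for op in ops:
        if op.action == "add":
            if op.key in draft:
                raise KeyError(f"entry already exists: {op.key}")
            draft[op.key] = op.text
        elif op.action == "replace":
            if op.key not in draft:
                raise KeyError(f"no such entry: {op.key}")
            draft[op.key] = op.text
        elif op.action == "remove":
            if op.key not in draft:
                raise KeyError(f"no such entry: {op.key}")
            del draft[op.key]
        else:
            raise ValueError(f"unknown action: {op.action}")
    # The budget is checked once, on the end state, not after each step.
    used = sum(len(text) for text in draft.values())
    if used > budget:
        raise ValueError("batch would exceed the budget; nothing applied")
    return draft


memory = {"old-note": "x" * 80}
batch = [Op("remove", "old-note"), Op("add", "new-note", "y" * 90)]
memory = apply_atomically(memory, batch, budget=100)
```

Because the budget is evaluated only on the end state, a remove followed by an add can succeed in one call even though the add on its own would have overflowed, which matches the behavior the release notes describe.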


0

3

65

6


Prophet Mike Sebareme@mikesebareme1

2026.06.01 19:50

God will allow people to misunderstand your season of pruning, while He’s preparing you in secret. Your elevation will be announced publicly. Not to prove your haters wrong, but to prove that He is faithful & never left you.


0

1

15

0


Teknium 🪽@Teknium

2026.04.30 05:10

Introducing Hermes Curator! The new system built into Hermes Agent helps you keep the skills that the self-improvement loop creates in check by consolidating and pruning them automatically. The curator does multiple things:
- Keeps track of how often you use each skill, when it was last updated/created, etc.
- Runs automatically once a week (configurable)
- Uses the analytics plus its own scanning of your skills and consolidates or prunes them if necessary
- Skips externally installed skills, built-in skills, and skills you "pin" that you don't want touched. It will only attempt curation over agent-created/updated skills or user-written skills.
- It will then determine whether skills can be consolidated, pruned, or otherwise made more manageable. It will convert some skills that are too specific into references, templates, or scripts for larger/broader skills, or integrate them directly into a consolidation of an existing skill.

You can also disable it entirely in the config.yaml and/or run it manually with `hermes curator run`

Learn more on the docs here:
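The eligibility rule described in the post above (skip externally installed, built-in, and pinned skills; curate only agent-created/updated or user-written ones) can be sketched as a simple filter. This is a hypothetical illustration, not the curator's actual code: the `Skill` class, its `source` labels, and `curation_candidates` are invented for the example, and the consolidate-or-prune decision itself is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record; field names are assumptions, not Hermes's schema."""
    name: str
    source: str  # assumed labels: "agent", "user", "external", "builtin"
    pinned: bool = False


# Per the post: only agent-created/updated or user-written skills are curated.
CURATABLE_SOURCES = {"agent", "user"}


def curation_candidates(skills: list[Skill]) -> list[Skill]:
    """Return the skills a curation pass would be allowed to touch."""
    return [s for s in skills if s.source in CURATABLE_SOURCES and not s.pinned]


skills = [
    Skill("scrape-news", source="agent"),
    Skill("my-release-checklist", source="user", pinned=True),
    Skill("pdf-tools", source="external"),
    Skill("shell", source="builtin"),
    Skill("summarize-thread", source="user"),
]
candidates = [s.name for s in curation_candidates(skills)]
print(candidates)
```

Only the unpinned agent and user skills remain in `candidates`; everything else is left untouched, mirroring the skip rules listed in the post.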


0

134

2.2K

163


Lun Wang@lunwang1996

2025.01.22 19:59

Thrilled to share that my single-author paper "Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation" has been accepted at ICLR! While I love collaborating, it's incredibly rewarding to see a solo project through to publication. Onwards and upwards! 🚀 #ICLR2024 #MachineLearning #research


0

53

3


Search results related to "pruning"