@ThiccWithaQ Jean is worthy of Master Diluc's admiration, bro 🍃
Diluc Tartaglia
Albedo Kaeya
@Diluckmd45 sorry I don’t know the artist, but this is in my art collection >:3
Claude Cowork just got 10x more powerful!
Glean benchmarked centralized vs federated MCP in Claude Cowork. Same harness, same model, same queries, different context layer.
The federated approach: Each data source (Gmail, Slack, Drive, Salesforce) has its own MCP server. Claude calls each one separately. That's 5-10 tool calls per query. Each source returns results with different quality and ranking. Claude over-fetches to compensate for weak search. Then it filters and synthesizes everything with LLM reasoning. Often needs retry loops when results miss. Burns 50-80k tokens per query.
The centralized approach: All data from every source gets indexed into one unified layer. Knowledge graph connects entities across sources. Claude makes one MCP call. Gets back the top ranked results. No over-fetching, minimal filtering needed. Uses 42-44k tokens consistently.
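To make the contrast concrete, here is a minimal toy sketch of the two retrieval shapes. Everything in it is an assumption for illustration: the `Hit` type, the tiny in-memory `CORPUS`, the scores, and both function names are hypothetical and are not Glean's or Claude's actual APIs. It only shows the difference in call count and in how many items land in context.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Hit:
    source: str
    text: str
    score: float  # made-up relevance score; higher is better


# Hypothetical toy data standing in for four separate sources.
CORPUS = {
    "gmail": [Hit("gmail", "Q3 renewal thread with Acme", 0.90),
              Hit("gmail", "Lunch plans", 0.10)],
    "slack": [Hit("slack", "Acme escalation in #support", 0.80),
              Hit("slack", "Standup notes", 0.20)],
    "drive": [Hit("drive", "Acme contract v2", 0.95),
              Hit("drive", "Offsite agenda", 0.05)],
    "salesforce": [Hit("salesforce", "Acme opportunity record", 0.85),
                   Hit("salesforce", "Old lead", 0.15)],
}


def federated_search(over_fetch: int) -> tuple[list[Hit], int]:
    """One call per source; results are concatenated with no global ranking."""
    hits: list[Hit] = []
    calls = 0
    for source_hits in CORPUS.values():
        calls += 1
        hits.extend(source_hits[:over_fetch])
    return hits, calls


def centralized_search(top_k: int) -> tuple[list[Hit], int]:
    """One call against a unified index that ranks across every source."""
    all_hits = [h for source_hits in CORPUS.values() for h in source_hits]
    ranked = sorted(all_hits, key=lambda h: h.score, reverse=True)
    return ranked[:top_k], 1


fed_hits, fed_calls = federated_search(over_fetch=2)
cen_hits, cen_calls = centralized_search(top_k=4)
print(fed_calls, len(fed_hits))  # 4 calls, 8 items for the model to filter
print(cen_calls, len(cen_hits))  # 1 call, 4 items already ranked
```

In the federated path the model still has to sort the useful items from the noise itself, which is where the extra tokens go.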
The results: Centralized indexing was preferred 2.5x more often. Federated consumed 30% more tokens on average. When federated did reach a correct answer, it burned 83k tokens vs 43k for centralized.
The gap widened as tasks got more complex. Simple tasks: centralized won 66% of the time. Complex tasks: 73%.
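A quick sanity check on the quoted figures, as a rough sketch. It reads "preferred 2.5x more often" as "2.5 times as often" and assumes every comparison had exactly one winner with no ties; both readings are assumptions, not something the post states.

```python
# Figures quoted in the post.
federated_tokens_when_correct = 83_000
centralized_tokens_when_correct = 43_000
preference_ratio = 2.5  # assumed meaning: centralized wins 2.5 times as often

# Extra tokens federated spent on the queries it eventually got right.
extra = federated_tokens_when_correct / centralized_tokens_when_correct - 1

# Assumption: no ties, so the win share is ratio / (ratio + 1).
centralized_share = preference_ratio / (preference_ratio + 1)

print(f"{extra:.0%}")              # 93%
print(f"{centralized_share:.1%}")  # 71.4%
```

Under those assumptions the 2.5x figure works out to roughly a 71% win share, which sits between the 66% (simple) and 73% (complex) numbers above.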
Why centralized wins: Over-fetching doesn't just cost tokens. It dilutes the context window with noise and contradictory information. Models have finite attention. Cramming 50-100 items hoping the right ones are in there doesn't work as well as getting the right 5-10 upfront.
Federated search also loses cross-application signals. Things like document relationships, who authored what, and how content is used across the enterprise. These signals improve ranking but they only exist when data is indexed together in one layer.
The compounding problem: In multi-step tasks, each missed or incorrect retrieval compounds. By the time you reach the final output, you're working with flawed data. More tool calls and reasoning loops don't fix this. They just burn more tokens trying to recover.
You can't brute-force around bad search. Extra data fetching doesn't make up for poor context quality.
Why this matters: Token costs are surging. Reasoning models cost more. Companies are burning through AI budgets faster. Federated search compounds the problem.
Better search architecture beats more compute.
I've shared the link in the replies!
10K followers on GitHub. Thank you all. It's been 13 years since I first touched open source as a college student learning to code. What started as curiosity became one of the most rewarding parts of my life. I never treated it as work. It was always something fun, something worth showing up for, day after day, commit after commit.
Over time, my six pinned projects grew into a little family. Kaku, Waza, Kami, Mole, Pake, MiaoYan. A terminal, a skill set, a typesetter, a cleaner, a wrapper, and a writing app. Different purposes, same philosophy: anything added dilutes everything else. Keep it simple, keep it useful, cut everything that doesn't matter.
A special thank you to every sponsor who put real support behind these projects. Open source runs on time and energy, and your generosity keeps it going. It means more than you know.
Open source is a culture worth keeping alive. Stay curious, keep sharing, keep talking to each other. Here's to the next 13.
Decoupled DiLoCo is also self-healing.
We introduced artificial hardware failures during training runs. The system isolated the disruptions and continued operating, while reintegrating offline units when they came back online.
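As a rough illustration of the idea (not the actual Decoupled DiLoCo implementation, whose details the thread does not give), here is a toy loop in which workers randomly drop out, the round proceeds with whoever is still online, and returning workers simply restart from the current shared value. The loss, the failure model, and every name here are assumptions made for the sketch.

```python
from __future__ import annotations

import random


def local_steps(theta: float, target: float, steps: int, lr: float) -> float:
    """Run a few gradient-descent steps on the toy loss (theta - target)^2."""
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - target)
    return theta


def train(rounds: int, num_workers: int, fail_prob: float,
          seed: int = 0) -> tuple[float, int]:
    rng = random.Random(seed)
    target = 3.0        # toy optimum shared by all workers
    global_theta = 0.0
    skipped = 0
    for _ in range(rounds):
        results = []
        for _worker in range(num_workers):
            if rng.random() < fail_prob:
                skipped += 1  # injected failure: no contribution this round
                continue
            # An online worker (including one that was offline last round)
            # starts from the current shared value, so it rejoins cleanly.
            results.append(local_steps(global_theta, target, steps=5, lr=0.1))
        if results:  # if every worker failed, keep the previous shared value
            global_theta = sum(results) / len(results)
    return global_theta, skipped


theta, skipped = train(rounds=20, num_workers=4, fail_prob=0.3)
print(round(theta, 3), skipped)
```

Because every worker here optimizes the same toy target, the average is trivial; the point is only that a failed worker costs its own contribution for that round and never stalls the others.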
It builds on 2️⃣ earlier advances:
Pathways: an AI system that connects different computer chips, allowing them to share data and work at their own pace.
DiLoCo: an approach to minimize the bandwidth needed across distributed data centers.
Together, as Decoupled DiLoCo, they can tackle the key challenge of training at scale.
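A back-of-the-envelope sketch of where the bandwidth saving comes from, assuming the general local-update pattern of syncing once every H steps instead of every step. The step counts below are hypothetical, and the sketch assumes each sync moves the same amount of data.

```python
def sync_rounds(total_steps: int, steps_between_syncs: int) -> int:
    """Number of cross-site synchronizations over a training run."""
    return total_steps // steps_between_syncs


total_steps = 10_000                       # hypothetical run length
every_step = sync_rounds(total_steps, 1)   # sync after every step
every_500 = sync_rounds(total_steps, 500)  # sync after every 500 local steps

print(every_step, every_500, every_step // every_500)  # 10000 20 500
```

Fewer sync points also means sites spend less time waiting on each other, which is what lets them "work at their own pace".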
Training frontier AI models relies on identical chips staying in near-perfect synchronization. If a single chip fails, the entire training run can stall.
Decoupled DiLoCo explores how to continuously train AI models without ever stopping due to failures.