LandingAI (@LandingAI) “The hard part of RAG isn't finding the right chunk! When you chunk a document fo” — TopicDigg

LandingAI
@LandingAI
API-first Agentic Document Intelligence platform built for accuracy, reliability, and governance at scale.
Joined December 2017
860 Following    10K Followers
The hard part of RAG isn't finding the right chunk!

When you chunk a document for RAG, each chunk lands in the index on its own, disconnected from the section it came from. So two chunks with identical text, but from completely different parts of the document, look exactly the same to a flat index.

That becomes a problem the moment a question needs context from more than one section. The chunks come back, but how they relate to each other gets lost.

ADE Section fixes this. It reads the parsed document, builds the actual hierarchy, and figures out where every chunk falls within it. That gets attached to the chunk before it's embedded.

Once that's in place, a broad question can stay at the section level. A specific one can drop into a sub-chunk. You can scope a search to one part of a document instead of the whole thing.

Citations get more accurate too. The model knows exactly which section a fact came from.

Run ADE Parse, then run ADE Section.
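To make the idea concrete, here is a rough sketch of what "attach the hierarchy to the chunk before it's embedded" could look like. This is not the ADE Parse or ADE Section API: the `Chunk` class, the `section_path` field, and the helper functions are all hypothetical names made up for illustration, and the sample document is invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    text: str
    # Hypothetical field: the chunk's position in the document hierarchy,
    # outermost section first.
    section_path: Tuple[str, ...]


def embedding_input(chunk: Chunk) -> str:
    """Text to hand to an embedder: section path as a prefix, then the chunk text."""
    return " > ".join(chunk.section_path) + "\n" + chunk.text


def scoped(chunks: List[Chunk], scope: Tuple[str, ...]) -> List[Chunk]:
    """Keep only chunks that sit at or below the given section."""
    return [c for c in chunks if c.section_path[: len(scope)] == scope]


def citation(chunk: Chunk) -> str:
    """Human-readable pointer to the section a fact came from."""
    return " > ".join(chunk.section_path)


# Invented sample data: the first two chunks share the same text but live
# in different parts of the document.
chunks = [
    Chunk("Fees are due within 30 days.", ("Agreement", "Payment terms")),
    Chunk("Fees are due within 30 days.", ("Agreement", "Appendix B", "Legacy contracts")),
    Chunk("Either party may terminate with notice.", ("Agreement", "Termination")),
]

# Identical text, different sections: the strings handed to the embedder differ.
print(embedding_input(chunks[0]) != embedding_input(chunks[1]))

# Scope a search to one part of the document and cite where each hit came from.
for c in scoped(chunks, ("Agreement", "Appendix B")):
    print(citation(c), "|", c.text)
```

A flat index would see the first two chunks as the same string. With the path attached, they are distinguishable, a search can be limited to one subtree by passing a shorter or longer `scope`, and the same path doubles as the citation.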