Vectorless Reasoning-Based Retrieval
Intent & Description
🎯 Intent
Retrieve by having the model reason its way down a document’s own table-of-contents tree to the relevant sections, instead of embedding chunks and ranking them by vector similarity.
📋 Context
A team answers questions over long, structured professional documents — financial filings, contracts, regulatory manuals, technical specifications — where the source already carries a clear hierarchy of parts, sections, and subsections. The standard retrieval-augmented pipeline splits each document into fixed-size chunks, embeds them, and at query time returns the chunks whose embeddings sit closest to the query in vector space. On these documents that pipeline keeps surfacing passages that look similar to the question but are not the ones that answer it, and chunk boundaries cut tables, clauses, and definitions in half.
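The boundary problem described above can be seen in a small illustration. This is a sketch only: the clause text is invented, the `fixed_size_chunks` helper is hypothetical, and real pipelines usually count tokens and add overlap rather than cutting on raw characters. The point is that a fixed window ignores where a definition starts and ends.

```python
from typing import List


def fixed_size_chunks(text: str, size: int) -> List[str]:
    """Split text into consecutive windows of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical contract definition, used only for illustration.
clause = (
    '"Material Adverse Effect" means any event that materially impairs '
    "the Borrower's ability to perform its payment obligations."
)

# A 48-character window is an arbitrary choice for the demo.
chunks = fixed_size_chunks(clause, 48)
for c in chunks:
    print(repr(c))
```

No single chunk holds the defined term together with the end of its definition, so a retriever that returns one chunk returns an incomplete clause.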
💡 Solution
At index time, parse the document into a tree that mirrors its natural structure (parts, sections, subsections) and write a short summary at each node. Keep the leaf text intact rather than splitting it into fixed-size chunks. No embeddings are computed and no vector store is built. At query time, present the model with the tree as a table of contents, have it judge which branch is most likely to hold the answer, descend into that node, and repeat. This is a tree search in which the model, not a similarity score, decides each step. The walk ends at the leaf sections the model judges relevant, and retrieval returns those sections together with their page and section identifiers, so every result is traceable to a named location in the source. Compose this pattern with a generator that reads the returned sections, and with citation attribution, since the page and section references are already in hand.
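The index and the walk can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `Node` fields, the prompt wording, and the names `llm_chooser` and `retrieve` are all hypothetical, and the model is represented by any prompt-in, text-out callable so the sketch runs without a real LLM. The example tree and its page numbers are invented.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One entry in the document tree: a part, section, or subsection."""
    section_id: str
    title: str
    summary: str          # short summary written at index time
    page: int             # page on which the section starts
    text: str = ""        # full, unsplit section text (leaves only)
    children: List["Node"] = field(default_factory=list)


# A chooser receives the query and a node's children and returns the indices
# of the children to descend into.
Chooser = Callable[[str, List[Node]], List[int]]


def llm_chooser(llm: Callable[[str], str]) -> Chooser:
    """Wrap any prompt-in, text-out callable as a branch chooser (assumed interface)."""
    def choose(query: str, children: List[Node]) -> List[int]:
        toc = "\n".join(
            f"[{i}] {c.section_id} {c.title} (p. {c.page}): {c.summary}"
            for i, c in enumerate(children)
        )
        prompt = (
            f"Question: {query}\n\nTable of contents:\n{toc}\n\n"
            "Reply with the number of the entry most likely to hold the answer."
        )
        match = re.search(r"\d+", llm(prompt))
        index = int(match.group()) if match else 0   # fall back to the first entry
        return [min(index, len(children) - 1)]       # clamp out-of-range replies
    return choose


def retrieve(node: Node, query: str, choose: Chooser) -> List[Node]:
    """Descend from `node`, one chooser decision per level, and return the leaves reached."""
    if not node.children:
        return [node]
    hits: List[Node] = []
    for i in choose(query, node.children):
        hits.extend(retrieve(node.children[i], query, choose))
    return hits


# Hypothetical two-level tree for a financial filing.
doc = Node("0", "Annual Report", "Full filing", 1, children=[
    Node("1", "Business Overview", "Products, markets, strategy", 3,
         text="(section text)"),
    Node("2", "Financial Statements", "Audited statements and notes", 38, children=[
        Node("2.1", "Goodwill and Intangibles", "Impairment testing policy", 41,
             text="(section text)"),
        Node("2.2", "Revenue Recognition", "Timing of revenue", 44,
             text="(section text)"),
    ]),
])

# Scripted stand-in for the model: picks entry 1 at the top level, then entry 0.
replies = iter(["1", "0"])
hits = retrieve(doc, "How is goodwill tested for impairment?",
                llm_chooser(lambda _: next(replies)))
for h in hits:
    print(h.section_id, h.title, "p.", h.page)
# prints: 2.1 Goodwill and Intangibles p. 41
```

Each returned node carries its `section_id` and `page`, so a downstream generator can cite the location without a separate lookup. Letting the chooser return more than one index would allow the walk to follow several branches, at the price of more model calls.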
Real-world Use Case
- Documents are long and carry a clear, reliable hierarchy of parts, sections, and subsections worth navigating.
- The domain is one where vocabulary overlap misleads similarity search — finance, law, regulatory, technical manuals.
- Retrieval must be auditable, with each result pointing to a named page and section.
- Keeping spans intact — tables, clauses, definitions — matters more than embedding-window economy.
Advantages
- Retrieval follows the document’s own structure, so spans stay whole and a result is a named section rather than an arbitrary window.
- Every retrieval is traceable to a page and section, which makes the step auditable and feeds citations directly.
- There is no embedding model, vector store, or chunking pipeline to build, tune, or keep in sync as the corpus changes.
- Relevance is a reasoning judgement, so a section that answers the query in words different from the query’s own is still reachable.
Disadvantages
- Each navigation step is an LLM call, so retrieval latency and cost grow with tree depth, whereas vector retrieval needs only a single nearest-neighbour lookup.
- A wrong branch choice high in the tree is unrecoverable for that walk — the same failure mode as any top-down routing.
- The approach assumes the document has a usable hierarchy; flat or poorly structured sources give the model little to navigate.
- It targets retrieval within structured documents and does not address corpus-wide retrieval across many unstructured sources, where similarity search still earns its place.
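The depth-related cost can be estimated with some back-of-the-envelope arithmetic. The sketch below assumes a balanced tree, a walk that follows a single branch, and exactly one model call per level. The function names and the two-seconds-per-call figure are hypothetical placeholders, not measurements.

```python
from typing import Tuple


def tree_depth(leaves: int, branching: int) -> int:
    """Smallest depth d such that branching**d >= leaves (integer arithmetic only)."""
    depth, capacity = 0, 1
    while capacity < leaves:
        capacity *= branching
        depth += 1
    return depth


def walk_cost(leaves: int, branching: int, seconds_per_call: float) -> Tuple[int, float]:
    """Model calls and total latency for one single-branch walk (one call per level)."""
    calls = tree_depth(leaves, branching)
    return calls, calls * seconds_per_call


# 1,000 leaf sections, about 10 entries per table-of-contents level,
# and an assumed 2.0 seconds per model call.
print(walk_cost(1000, 10, 2.0))   # (3, 6.0)
```

Under these assumptions the number of calls grows logarithmically with the number of sections, but each call is far slower than a vector lookup, and a walk that follows several branches multiplies the count.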