DOCPACER / EVIDENCE COGNITION INFRASTRUCTURE 2026.06 / DOC.001

§01 · Overview

Read every document.
Cite every answer.

Find, compare, and explain what matters across your company's documents, with a citation behind every answer.


§02 · Problem

Some questions only get answered when someone reads every document.

"Where is the document?" is solved. The questions that take time (comparing terms across thousands of contracts, mapping policy drift across subsidiaries, finding outliers in a corpus) require reading the whole population, not finding the one right document.

Today those questions take legal, audit, finance, and compliance teams weeks of manual review. DocPacer produces a cited table in an afternoon.

An example, at population scale

"Across our 5,000 customer agreements, where do liability caps and audit rights diverge from our group standard?"

Today: Three weeks. A spreadsheet. No way to verify. Nothing audit-ready.
DocPacer: A cited comparison table. Every row links back to the exact clause. Auditor-ready by default.

§03 · Method

A pipeline, not a chatbot.

DocPacer ingests your document corpus, extracts structured claims from every clause, models them into a queryable graph, runs comparison across the whole population, and links every value back to the exact clause it came from.

  01 Ingest
     PDFs, DOCX, agreements, policies. Any format, any scale, any language.

  02 Parse
     Extract clauses, parties, dates, amounts, obligations from every document.

  03 Model
     Structure findings into a queryable document graph. Every value typed and tagged.

  04 Analyse
     Compare and detect divergence across the whole population. Side-by-side, not one-at-a-time.

  05 Cite
     Every output links back to the exact clause in the source. The receipt comes with the answer.
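
To make the five steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not DocPacer's implementation: the `Citation` and `Claim` types, the `parse` and `model` functions, the regular expression, and the two sample documents are all hypothetical, and a nested dictionary stands in for the document graph.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    document_id: str
    clause_ref: str   # e.g. "§12.3"
    quote: str        # verbatim clause text the value was read from


@dataclass(frozen=True)
class Claim:
    field: str        # typed field name, e.g. "liability_cap"
    value: str
    citation: Citation


# Hypothetical pattern for a single field; it stands in for real extraction.
CAP_PATTERN = re.compile(r"liability .*?capped at (€[\d,.]+[KM]?)", re.IGNORECASE)


def parse(document_id: str, clauses: dict[str, str]) -> list[Claim]:
    """Parse: emit one cited claim per clause that states a liability cap."""
    claims = []
    for clause_ref, text in clauses.items():
        match = CAP_PATTERN.search(text)
        if match:
            citation = Citation(document_id, clause_ref, text)
            claims.append(Claim("liability_cap", match.group(1), citation))
    return claims


def model(claims) -> dict[str, dict[str, Claim]]:
    """Model: index claims as field -> document -> claim (a stand-in for a graph)."""
    graph: dict[str, dict[str, Claim]] = {}
    for claim in claims:
        graph.setdefault(claim.field, {})[claim.citation.document_id] = claim
    return graph


# Ingest: two invented documents, already split into clauses.
corpus = {
    "acme-msa-2023": {"§12.3": "Supplier's liability is capped at €500K in aggregate."},
    "fingroup-sla": {"§9.2": "Total liability shall be capped at €50K."},
}
graph = model(c for doc_id, clauses in corpus.items() for c in parse(doc_id, clauses))

# Analyse + Cite: every compared value carries its clause reference.
for doc_id, claim in graph["liability_cap"].items():
    print(doc_id, claim.value, claim.citation.clause_ref)
```

The point of the sketch is the shape of the data: a value never travels without the clause it was read from, so any later comparison can show its source.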


§04 · Exhibit A

A specimen divergence register.

What the cited output looks like when the question is run against 5,247 customer agreements. Every cell in the citation column links back to the exact clause in the source document. This is the deliverable. Not a chat answer, not a summary.

Exhibit A · Divergence Register · 5,247 agreements · Question set: Liability & Audit Rights

> Where do liability caps and audit rights diverge from our group standard across all customer agreements?

| Agreement             | Liability cap       | Audit rights               | Status | Citation      |
|-----------------------|---------------------|----------------------------|--------|---------------|
| Acme Corp MSA 2023    | €500K (12× monthly) | Annual · 30-day notice     |        | §12.3 · §18.1 |
| FinGroup GmbH SLA     | €50K (capped)       | None specified             |        | §9.2          |
| NordicTech Enterprise | €2M (uncapped)      | Quarterly · 14-day notice  |        | §11.1 · §14.4 |
| Meridian Holdings     | €500K (12× monthly) | Annual · 30-day notice     |        | §10.2 · §17.3 |
| EastBridge Corp       | €0 (mutual waiver)  | None                       |        | §8.5          |
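
One way to read the register is as a per-field comparison against a group standard. The sketch below is illustrative only. It assumes the group standard is the pair of values shared by the Acme and Meridian rows, which the exhibit does not state explicitly, and it uses plain string equality where a real comparison would work on typed values. `Row`, `GROUP_STANDARD`, and `divergent_fields` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    agreement: str
    liability_cap: str
    audit_rights: str
    citation: str


# Assumption: the standard is the value pair that Acme and Meridian share.
GROUP_STANDARD = {
    "liability_cap": "€500K (12× monthly)",
    "audit_rights": "Annual · 30-day notice",
}

ROWS = [
    Row("Acme Corp MSA 2023", "€500K (12× monthly)", "Annual · 30-day notice", "§12.3 · §18.1"),
    Row("FinGroup GmbH SLA", "€50K (capped)", "None specified", "§9.2"),
    Row("NordicTech Enterprise", "€2M (uncapped)", "Quarterly · 14-day notice", "§11.1 · §14.4"),
    Row("Meridian Holdings", "€500K (12× monthly)", "Annual · 30-day notice", "§10.2 · §17.3"),
    Row("EastBridge Corp", "€0 (mutual waiver)", "None", "§8.5"),
]


def divergent_fields(row: Row) -> list[str]:
    """Return the names of fields whose value differs from the group standard."""
    return [name for name, standard in GROUP_STANDARD.items()
            if getattr(row, name) != standard]


# Agreement -> list of divergent fields (an empty list means the row matches).
register = {row.agreement: divergent_fields(row) for row in ROWS}

for row in ROWS:
    fields = register[row.agreement]
    label = "matches standard" if not fields else "diverges on " + ", ".join(fields)
    print(f"{row.agreement}: {label} (see {row.citation})")
```

Under that assumption, two of the five specimen rows match the standard and three diverge on both fields, each with its clause references attached.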

§05 · Principles

Three words, picked carefully.

DocPacer compares entire document populations and cites every answer back to the source clause.

01 Populations

Not a document. Not a folder.

The whole corpus. Thousands of contracts, hundreds of policies, every agreement and obligation across every business unit. Most tools give you one document at a time. A context window is not a corpus. We work at company scale.

02 Compares

Not search. Not summarisation.

Side-by-side divergence detection across the whole set. "Where do these 5,000 contracts disagree with our standard?" is a fundamentally different question from "find me the contract with X." The first is the question that takes a team weeks today.

03 Citations

Not implied. Not paraphrased.

Every cell links to the exact clause in the source document. Usable in audit, litigation, and board reporting. The answer to "how do I know it's right?" is a clickable citation.


§06 · Foundation

Not the expert.
The foundation.

DocPacer doesn't compete with the tools experts use. It is the layer underneath them — reading every document once, persisting structure, letting every expert work against the corpus instead of one file at a time.

01 Chat-with-your-PDF tools

For one document. DocPacer reads them all.

Chat-with-PDF tools sit a layer above us and work against a handful of documents at a time: one, or twenty. DocPacer reads the whole population so the same kind of question ("what does this say") is answerable at corpus scale, with a citation behind every value.

02 Enterprise search

Finds the document. DocPacer reads what's in it.

Search and DocPacer answer different halves of the question — "where is it" and "what does the corpus say." Search sits on top of the index. DocPacer builds the structure the index is pointing into.

03 CLM platforms

New contracts going forward. DocPacer makes the back catalogue legible.

CLM platforms depend on metadata tagged at signing. The 80% of contracts already in your archive are unstructured to them. DocPacer extracts structure retroactively from the corpus they never indexed.

04 Legal-AI point tools

Built by lawyers, for lawyers. DocPacer is built for every expert.

Legal-AI tools live inside one lawyer's workflow. DocPacer is the foundation underneath. The same engine reads the corpus whether the expert on top is legal, finance, audit, procurement, or engineering — each domain brings its own kernels and asks its own questions.

05 RAG wrappers

Query-time retrieval. DocPacer does the work upstream.

RAG retrieves plausible chunks at query time. DocPacer does the heavy upstream work — extraction in bounded passes, kernel authoring, citation fidelity, graph persistence — that any retrieval layer needs to land at corpus scale. RAG can live on top of DocPacer. It cannot replace what DocPacer does underneath.


§07 · Use cases

Five audiences.
One pattern.

Different roles, different vocabulary, the same kind of question: one that takes weeks today because it requires reading every document. Pick the one that sounds like your week. If yours isn't here, the pattern still applies.

Item 01 · Legal · Customer agreements at scale
“Across our 5,000 customer agreements, where do liability caps, audit rights, and renewal terms diverge from our group standard?”

Today

A junior lawyer with a spreadsheet, six weeks, and a result the General Counsel does not fully trust.

With DocPacer

A cited table by Friday. Every row links to the exact clause. The work shifts from reading to deciding.

Item 02 · Finance · Where financial exposure actually sits
“Across our customer and vendor contracts, where do we carry unlimited liability, off-pattern indemnities, or financial commitments that don't match our group risk appetite?”

Today

An audit-and-rebuild project that ends in a board memo six months later, often after a near-miss has already happened.

With DocPacer

A cited inventory of every clause that creates outsized financial exposure, refreshed whenever you re-run the question — no full re-ingestion. Risk visible before it's realised.

Item 03 · Audit & Compliance · Control evidence across the population
“Across our DPAs, vendor contracts, and subsidiary policies, where do our control claims diverge from the evidence in the documents themselves?”

Today

Sample-pulling, partial reviews, and an ISAE/SOC cycle that depends on the auditor not asking the wrong follow-up.

With DocPacer

A cited control-evidence map across the full population. Gaps between policy and practice surfaced in the same view, every finding traceable to a source clause.

Item 04 · Engineering & Product · Drift across specs and commitments
“Across our specifications, runbooks, vendor SLAs, and customer commitments, where do our promises conflict, overlap, or contradict each other?”

Today

Tribal knowledge plus a wiki nobody reads, plus a customer escalation that surfaces the conflict the wrong way.

With DocPacer

A cited consistency map across every spec and contract. Drift detected, not discovered.

Item 05 · Individual & team · "I just inherited 200 documents"
“What's standard, what's an outlier, and what should I read first?”

Today

Read for a week and hope. Or pick five at random and call it a sample.

With DocPacer

A population summary in an afternoon. Outliers flagged, citations included. You read what matters and skip what doesn't.


§08 · Pricing

Priced on the corpus, not the seat.

The value lives in the document population we analyse and the depth of analysis we run, not in how many people open a tab.

Not: Per-employee SaaS pricing.
But: Priced on the corpus you analyse and the depth of analysis you run. Annual platform access, document population packs, and pooled analysis credits. Use it daily, weekly, or for a one-off question.

01 Annual platform

Platform + Corpus

Annual platform fee, document population packs, pooled analysis credits.

  • Annual platform licence
  • Document population packs (by corpus size)
  • Pooled analysis credits
  • Small number of analyst seats (full analysis access)
  • Unlimited viewer seats (read cited results)

02 Design-partner pilot

Start here

One corpus. One question your team cannot answer in a week today. Four to six weeks. Up to 50% credited toward the annual contract if you continue.

  • Single corpus, single question
  • 4–6 week engagement
  • Full cited output, auditor-ready
  • Up to 50% credited to annual contract
  • Direct access to founding team
  • Shapes the roadmap for your use case

§09 · Roadmap · Private beta

DocPacer is real.
Shipping in pilots.

Running today against real document populations with design partners. Not GA yet. The full roadmap has its own page, covering what is being built now and what comes next.

Now

Live in design-partner corpora today

  • Ingest any corpus.

    PDF, DOCX, PPTX, XLSX. Long documents, mixed formats, multi-language. The same pipeline whether the population is a hundred contracts or ten thousand.

  • Cited extraction on every claim.

    Every value links to the exact clause it came from. Nothing implied, nothing paraphrased.

  • Cross-document comparison and divergence.

    Side-by-side analysis at corpus scale, not one document at a time. The output is a cited table, not a chat answer.

  • Queryable through MCP.

    Connect Claude, Claude Code, or any MCP-aware client to your corpus and ask questions that span every document. The analyst web UI is planned for the Next phase of the roadmap.

[ See the full roadmap ]


§10 · Pilot

Pick one corpus.
Pick one question.

Take the customer or vendor agreements from one business unit. Ask where liability, renewal, audit, and termination clauses diverge from your group standard. If we come back with a cited comparison table in an afternoon, you have your pilot.

What we need from you

  • Read access to one corpus, typically 1,000 to 50,000 documents.
  • One sharp question your team cannot answer in a week today.
  • One stakeholder who can verify outputs against ground truth.

What you get back

  • A cited comparison table or divergence register, every cell traceable.
  • Working MCP access to query the corpus from Claude or any MCP client.
  • A method document covering kernels, validation passes, and limitations.
  • A go or no-go view on rolling DocPacer across the rest of your estate.

Commercial shape

  • Four to six weeks from contract to cited output.
  • Pricing scoped per engagement.
  • Up to 50% of the pilot fee credits toward the annual contract if you continue.
  • Direct access to the founding team throughout.

4–6 weeks to a cited result

Pricing scoped per engagement. Enquire for a quote.

[ Enquire about a pilot ]

§11 · Questions

The honest answers.

The questions we get most often in pilot conversations, answered without spin.

01 How is this different from RAG?

RAG retrieves the most similar chunks at query time and asks a model to summarise. DocPacer extracts structured claims up front, persists them as a typed graph, and runs the model in bounded passes with clause-level validation. RAG is good at "find me X." DocPacer is good at "tell me everything that diverges across the whole population." Different shape of question, different architecture.

02 How do you handle hallucination?

Every output is anchored to a verbatim quote from a specific clause in a specific document. If the model cannot ground a claim in source text, it returns "not found" rather than a confident wrong answer. Kernels include validation passes that flag low-confidence extractions for human review.
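
The grounding rule described above can be sketched as a simple check: accept a value only when its supporting quote appears verbatim in a clause, and otherwise return "not found". This is a hedged illustration, not DocPacer's validation logic. `GroundedAnswer`, `ground`, and `normalise` are hypothetical names, and collapsing whitespace before matching is an assumption made here so that line breaks in the source do not defeat the match.

```python
from dataclasses import dataclass
from typing import Optional

NOT_FOUND = "not found"


@dataclass(frozen=True)
class GroundedAnswer:
    value: str
    quote: Optional[str]        # verbatim supporting text, if any
    clause_ref: Optional[str]   # clause the quote was found in, if any


def normalise(text: str) -> str:
    """Collapse runs of whitespace so layout differences do not matter."""
    return " ".join(text.split())


def ground(value: str, quote: str, clauses: dict[str, str]) -> GroundedAnswer:
    """Keep the value only if the quote occurs verbatim in some clause."""
    needle = normalise(quote)
    for clause_ref, text in clauses.items():
        if needle and needle in normalise(text):
            return GroundedAnswer(value, quote, clause_ref)
    # No clause contains the quote: refuse rather than guess.
    return GroundedAnswer(NOT_FOUND, None, None)


clauses = {"§9.2": "Total liability shall be\ncapped at €50K."}

supported = ground("€50K", "liability shall be capped at €50K", clauses)
unsupported = ground("€5M", "liability is capped at €5M", clauses)

print(supported.value, supported.clause_ref)
print(unsupported.value)
```

A check of this kind cannot tell whether the value was interpreted correctly. It only guarantees that the quoted text exists in the source, which is why it would sit alongside human review of low-confidence extractions rather than replace it.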

03 What about data security and residency?

EU-hosted by default. We do not train on customer data. Identity, access governance, and audit-trail tooling deepen through 2026 as pilots formalise their requirements — talk to us about the specifics you need.

04 What corpus sizes have you tested?

Live against design-partner corpora today. The architecture handles populations several orders of magnitude larger than what is currently in production — that scale is not a research bet, it is how the pipeline is built. We will validate at full scale during 2026 against real customer corpora.

05 What if our question changes after we onboard?

Add a new kernel. The corpus is already ingested and graphed. New kernels run against existing nodes without full re-ingestion. New questions return in hours, not weeks.

06 Multi-language?

Yes. The pipeline handles mixed-language corpora. Design partners are running Danish, English, German, Norwegian, and Swedish in the same population.

07 What does "kernel" actually mean?

A kernel is a curated extraction routine that says: for this category of document, extract these specific data points, validate them against these rules, and flag uncertainty. Kernels are how we get from "the model said something plausible" to "the model said something we can stand behind in front of an auditor."
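
As a rough sketch of that definition, a kernel can be pictured as a document category, a list of fields to extract, validation rules per field, and a threshold below which an extraction is flagged for review. Everything named here (`Kernel`, `Extraction`, `review`, the 0.8 threshold, the status labels) is a hypothetical illustration rather than DocPacer's actual kernel format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Extraction:
    field: str
    value: str
    confidence: float   # 0.0 to 1.0, as reported by a hypothetical extractor


@dataclass(frozen=True)
class Kernel:
    category: str                             # which documents this applies to
    fields: tuple[str, ...]                   # data points to extract
    rules: dict[str, Callable[[str], bool]]   # per-field validation rules
    review_threshold: float = 0.8             # assumed cut-off for human review

    def review(self, extractions: list[Extraction]) -> dict[str, str]:
        """Classify each expected field as accepted, flagged, invalid, or missing."""
        by_field = {e.field: e for e in extractions}
        report = {}
        for name in self.fields:
            found = by_field.get(name)
            if found is None:
                report[name] = "not found"
            elif name in self.rules and not self.rules[name](found.value):
                report[name] = "invalid"
            elif found.confidence < self.review_threshold:
                report[name] = "needs review"
            else:
                report[name] = "accepted"
        return report


kernel = Kernel(
    category="customer_agreement",
    fields=("liability_cap", "audit_rights"),
    rules={"liability_cap": lambda value: value.startswith("€")},
)

report = kernel.review([
    Extraction("liability_cap", "€500K", 0.95),
    Extraction("audit_rights", "Annual", 0.55),
])
print(report)
```

In this toy run the liability cap passes its rule with high confidence and is accepted, while the audit-rights value falls under the threshold and is routed to a person.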

08 Who is behind DocPacer?

A small founding team with a deep technical bet on document graphs as the right substrate for population-scale analysis. Pilots include direct access to the founders for the duration. For a longer conversation about the technology and the roadmap, write to morten@docpacer.com.