PolicyTrace system design chapter 02

The Golden Record Problem

A production-minded look at the trust layer inside PolicyTrace: when multiple insurance PDFs disagree, the system needs source authority rules, visible conflicts, evidence, and a reviewer path.

Tags: PolicyTrace, Document AI, Golden Record, Human Review

Golden Record Decision Map

The useful output is not a pile of extracted fields. It is a canonical record assembled from overlapping sources, with the authority rule and any conflict still visible to a reviewer.

Legend: Source document, Authority rule, Reviewer-visible state

Input pack: Overlapping sources

Schedule of Insurance: Policy details, vehicle facts, drivers, excesses, financial summary.

Certificate of Motor Insurance: Legal-use fields such as class of use and driving other cars.

Statement of Fact: Production extension area for risk declarations, claims, convictions, and mileage checks.

Policy Booklet: Boilerplate wording. Useful context, not a good source for field extraction.

Trust layer: Hierarchy of truth

schedule: Default source for core policy facts. Policy header, vehicle details, driver details, financial summary, NCB, cover type, and excess breakdown.

certificate: Source for legal-use fields. class_of_use and driving_other_cars can override the Schedule because those details are stronger on the certificate.

cross-check: Risk declarations need separate treatment. Claims history, convictions, security devices, and mileage belong to future cross-check or Statement of Fact rules.

exclude: Boilerplate is not a field source. Policy wording may explain a term, but it should not overwrite customer-specific facts.

Conflict ledger: Do not bury disagreement

policy_header.policy_number (winner: schedule)
Schedule value: SCHED-100
Certificate value: CERT-200

cover_and_excesses.class_of_use (winner: certificate)
Schedule value: Social Only
Certificate value: Social, Domestic and Pleasure

driver_details.name (fuzzy merge)
Schedule value: ALICE J SMITH
Certificate value: ALICE SMITH
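The ledger does not say how the "fuzzy merge" for driver names is computed. As a rough sketch only, one way to treat ALICE J SMITH and ALICE SMITH as the same person is a string-similarity check from Python's standard library. The function names, the 0.85 threshold, and the choice to keep the Schedule spelling are all assumptions made for illustration, not PolicyTrace's actual logic.

```python
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Uppercase and collapse whitespace so formatting noise does not count."""
    return " ".join(name.upper().split())


def names_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as the same person when they are similar enough.

    The 0.85 threshold is an illustrative assumption, not a tuned value.
    """
    ratio = SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()
    return ratio >= threshold


def merge_names(schedule_name: str, certificate_name: str) -> tuple[str, bool]:
    """Return (chosen_name, needs_review).

    Assumption: the Schedule spelling is kept either way, because the Schedule
    is the default source for driver details. A failed match only raises the
    review flag; it never silently swaps the value.
    """
    chosen = normalize_name(schedule_name)
    return chosen, not names_match(schedule_name, certificate_name)


same_person = merge_names("ALICE J SMITH", "ALICE SMITH")
different_person = merge_names("ALICE J SMITH", "BOB JONES")
```

A similarity ratio is crude for names (it knows nothing about initials or transposed surnames), so a production rule would likely need token-level comparison and a reviewer flag for borderline scores.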

Reviewer view: Trust is inspectable

Chosen value: The Golden Record shows the field value selected by the arbiter.

Source evidence: Field citations and PDF geometry let the reviewer jump back to the source.

Conflict context: The losing value and winning source are kept as reviewable context.

Human decision: The reviewer can verify, reject, or override the field before downstream use.

Current implementation boundary

The concrete repo merge path focuses on Schedule plus Certificate records. Wider document types are part of the project pattern, but should be treated as hardening or extension work until implemented in the arbiter.

Why this matters

A silent overwrite creates a false sense of certainty. A visible arbitration record gives reviewers a practical way to inspect, challenge, and correct the system.

Core thesis

A Golden Record is not whatever the model said most recently.

For document AI, the difficult part is rarely turning text into JSON once. The harder part is deciding which extracted value deserves to become the canonical value when several documents contain overlapping facts.

PolicyTrace treats that as an engineering problem. The model produces typed candidates. The arbiter applies source authority rules. Conflicts are preserved. The reviewer sees enough context to approve, reject, or override the result.

The model extracts candidates. The system assembles trust.

If extraction and arbitration are collapsed into the same step, the system has no clean place to explain why one source won. PolicyTrace keeps those responsibilities separate so the final record can be inspected.

1. Extract per-document partial records instead of asking one prompt to decide everything.
2. Merge fields through domain rules that know which document is authoritative for which field.
3. Expose disagreements as review signals instead of hiding them inside the final JSON.
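To make the first step concrete, here is a minimal sketch of what "per-document partial records" could look like as typed candidates. The class names `FieldCandidate` and `PartialRecord` are hypothetical and do not come from the PolicyTrace repo; the field paths and values are the ones shown in the conflict ledger.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldCandidate:
    """One extracted value plus the document it came from (hypothetical shape)."""
    path: str              # dotted schema path, e.g. "policy_header.policy_number"
    value: Optional[str]   # None when the document did not state the field
    source: str            # "schedule" or "certificate"


@dataclass
class PartialRecord:
    """Everything one document contributed, before any arbitration."""
    source: str
    candidates: dict[str, FieldCandidate] = field(default_factory=dict)

    def add(self, path: str, value: Optional[str]) -> None:
        self.candidates[path] = FieldCandidate(path, value, self.source)

    def get(self, path: str) -> Optional[str]:
        candidate = self.candidates.get(path)
        return candidate.value if candidate else None


# Each document is extracted on its own; neither record knows about the other.
schedule = PartialRecord("schedule")
schedule.add("policy_header.policy_number", "SCHED-100")
schedule.add("cover_and_excesses.class_of_use", "Social Only")

certificate = PartialRecord("certificate")
certificate.add("policy_header.policy_number", "CERT-200")
certificate.add("cover_and_excesses.class_of_use", "Social, Domestic and Pleasure")
```

Keeping the source on every candidate is what lets a later merge step explain which document a value came from, rather than receiving one already-blended JSON object.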

Why conflicts happen

Insurance packs overlap because the documents serve different jobs.

A Schedule, Certificate, Statement of Fact, and Policy Booklet can all mention related facts, but they are not equally authoritative for every field. Some documents describe the customer's policy. Some prove legal cover. Some record declarations. Some are general wording.

That is why a blind merge is fragile. If the system simply takes the first non-empty value, the newest value, or the value with the highest model confidence, it can overwrite a stronger source with a weaker one.
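A small sketch shows the failure mode. The candidate list and its confidence scores are invented for illustration; the point is only that two common blind strategies pick the Schedule's class of use, while a field-specific authority order picks the Certificate's.

```python
from typing import Optional

# Hypothetical candidates for one field; the confidence scores are made up.
candidates = [
    {"source": "schedule", "value": "Social Only", "confidence": 0.97},
    {"source": "certificate", "value": "Social, Domestic and Pleasure", "confidence": 0.91},
]


def first_non_empty(cands: list[dict]) -> Optional[str]:
    """Blind merge: whichever document happened to be processed first wins."""
    return next((c["value"] for c in cands if c["value"]), None)


def highest_confidence(cands: list[dict]) -> Optional[str]:
    """Blind merge: model confidence stands in for source authority."""
    filled = [c for c in cands if c["value"]]
    return max(filled, key=lambda c: c["confidence"])["value"] if filled else None


def by_authority(cands: list[dict], preferred: list[str]) -> Optional[str]:
    """Rule-based merge: walk the sources in authority order for this field."""
    by_source = {c["source"]: c["value"] for c in cands}
    for source in preferred:
        if by_source.get(source):
            return by_source[source]
    return None


blind_first = first_non_empty(candidates)
blind_conf = highest_confidence(candidates)
ruled = by_authority(candidates, preferred=["certificate", "schedule"])
```

Here both blind strategies return "Social Only", and only the authority-ordered merge returns the Certificate wording. A model can be very confident about a value it read correctly from the weaker document.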

Schedule

Good default source for the main policy facts: vehicle, drivers, premium, cover type, excesses, and policy header details.

Certificate

Stronger source for legal-use fields such as class of use and driving other cars, because the certificate is the document that evidences legal cover.

Risk facts

Claims, convictions, security devices, and mileage often need cross-check rules rather than simple overwrite rules.

Booklet

Useful for policy wording, but not a reliable place to pull customer-specific Golden Record facts.

Arbitration

The hierarchy of truth is domain logic, not prompt wording.

In PolicyTrace, the trust layer lives outside the model call. The PolicyArbiter takes extracted Schedule and Certificate records and merges them field by field. Schedule wins many core policy fields. Certificate wins specific legal-use fields such as class of use and driving other cars.

That split matters because it makes the decision explainable. If two documents disagree on the policy number, the system can say the Schedule won. If they disagree on class of use, it can say the Certificate won. The rule is inspectable, testable, and changeable without rewriting the extraction prompt.

Arbitration gives the system a place to be explicit.

The rule does not need to be perfect on day one. It needs to be visible enough for engineers, reviewers, and domain owners to inspect and improve.

1. Prefer source authority over model confidence for known field families.
2. Fill gaps without letting nulls or weaker documents erase stronger values.
3. Keep conflicts in a separate structure so the final record stays usable while review context remains available.
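The three rules above can be sketched as a small field-by-field merge. This is not the real PolicyArbiter: the `AUTHORITY` table, the `arbitrate` function, the path used for `driving_other_cars`, and the `vehicle.registration` field are assumptions made for illustration. Only the Schedule-by-default and Certificate-for-legal-use split comes from the text.

```python
from typing import Optional

# Hypothetical authority matrix: field path -> sources in descending authority.
# The grouping of driving_other_cars under cover_and_excesses is assumed.
AUTHORITY = {
    "cover_and_excesses.class_of_use": ["certificate", "schedule"],
    "cover_and_excesses.driving_other_cars": ["certificate", "schedule"],
}
DEFAULT_ORDER = ["schedule", "certificate"]  # Schedule wins core policy fields


def is_empty(value: Optional[str]) -> bool:
    return value is None or value.strip() == ""


def arbitrate(schedule: dict, certificate: dict) -> tuple[dict, list[dict]]:
    """Merge two flat {path: value} records into (golden_record, conflicts)."""
    sources = {"schedule": schedule, "certificate": certificate}
    golden, conflicts = {}, []
    for path in sorted(set(schedule) | set(certificate)):
        order = AUTHORITY.get(path, DEFAULT_ORDER)
        # Gap fill: skip empty values so a null never erases a real one.
        winner = next((s for s in order if not is_empty(sources[s].get(path))), None)
        golden[path] = sources[winner][path] if winner else None
        s_val, c_val = schedule.get(path), certificate.get(path)
        # A conflict needs two non-empty values that differ.
        if not is_empty(s_val) and not is_empty(c_val) and s_val != c_val:
            conflicts.append({"path": path, "schedule_value": s_val,
                              "certificate_value": c_val, "winner": winner})
    return golden, conflicts


golden, conflicts = arbitrate(
    {"policy_header.policy_number": "SCHED-100",
     "cover_and_excesses.class_of_use": "Social Only",
     "vehicle.registration": None},          # hypothetical field, missing here
    {"policy_header.policy_number": "CERT-200",
     "cover_and_excesses.class_of_use": "Social, Domestic and Pleasure",
     "vehicle.registration": "AB12 CDE"},
)
```

Because the authority table is plain data, it can be reviewed by a domain owner and covered by tests without touching any extraction prompt.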

Conflict handling

A conflict is not just an error. It is a product signal.

When two non-empty values differ, PolicyTrace records a conflict with the field path, the Schedule value, the Certificate value, and the winning source. That is a small structure, but it changes the behavior of the whole workflow.

The Golden Record can still be assembled, but the disagreement is not lost. A reviewer can see that a value was selected through a rule, not because the system never noticed the alternative.
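As an illustration of how small that structure can be, the sketch below walks two nested records, builds dotted field paths, and emits one entry per disagreement. `ConflictEntry` is the name used in this chapter, but the attribute names, the `flatten` helper, and the `find_conflicts` function are assumptions rather than the repo's actual code.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True)
class ConflictEntry:
    """Attributes mirror the prose above; the real model's fields may differ."""
    field_path: str
    schedule_value: Any
    certificate_value: Any
    winner: str


def flatten(record: dict, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted.path, value) pairs from a nested dict."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def find_conflicts(schedule: dict, certificate: dict,
                   certificate_led: set[str]) -> list[ConflictEntry]:
    """Record every path where both documents give different non-empty values."""
    cert_flat = dict(flatten(certificate))
    entries = []
    for path, s_val in flatten(schedule):
        c_val = cert_flat.get(path)
        if s_val in (None, "") or c_val in (None, "") or s_val == c_val:
            continue  # a gap or an agreement is not a conflict
        winner = "certificate" if path in certificate_led else "schedule"
        entries.append(ConflictEntry(path, s_val, c_val, winner))
    return entries


schedule = {"policy_header": {"policy_number": "SCHED-100"},
            "cover_and_excesses": {"class_of_use": "Social Only"}}
certificate = {"policy_header": {"policy_number": "CERT-200"},
               "cover_and_excesses": {"class_of_use": "Social, Domestic and Pleasure"}}
entries = find_conflicts(schedule, certificate, {"cover_and_excesses.class_of_use"})
```

The entries live next to the Golden Record rather than inside it, so downstream consumers get a clean record while the review interface gets the full disagreement.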

Reviewer-visible arbitration

The point is not to ask humans to redo every extraction. It is to give them a control surface for the fields where evidence, conflict, or business impact deserves attention.

What field changed?

The dotted field path tells the reviewer exactly where the disagreement sits in the schema.

What did each source say?

The original extracted values remain visible as Schedule and Certificate values.

Which source won?

The conflict includes the selected winner, so the reviewer can understand the system's reasoning.

Can a person override it?

The review state supports verification, rejection, and override actions for field-level correction.
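A field-level review state with those three actions might be sketched as follows. `ReviewStatus`, `FieldReview`, and the `effective_value` property are hypothetical names, and the decision that a rejected field yields no value is an assumption about policy, not something the source specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    VERIFIED = "verified"
    REJECTED = "rejected"
    OVERRIDDEN = "overridden"


@dataclass
class FieldReview:
    field_path: str
    arbiter_value: Optional[str]
    status: ReviewStatus = ReviewStatus.PENDING
    override_value: Optional[str] = None

    def verify(self) -> None:
        self.status = ReviewStatus.VERIFIED

    def reject(self) -> None:
        self.status = ReviewStatus.REJECTED

    def override(self, new_value: str) -> None:
        if not new_value.strip():
            raise ValueError("an override needs a replacement value")
        self.status = ReviewStatus.OVERRIDDEN
        self.override_value = new_value

    @property
    def effective_value(self) -> Optional[str]:
        """Value after review. Pending fields still show the arbiter's choice;
        whether pending fields may flow downstream is a separate policy."""
        if self.status is ReviewStatus.OVERRIDDEN:
            return self.override_value
        if self.status is ReviewStatus.REJECTED:
            return None
        return self.arbiter_value


review = FieldReview("cover_and_excesses.class_of_use", "Social, Domestic and Pleasure")
review.override("Social Only")
```

The arbiter's original value is never overwritten by an override, so the review record keeps both what the system chose and what the person decided.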

Evidence and visibility

The reviewer should not have to trust the arbiter blindly either.

Arbitration explains which source won. Evidence explains where the candidate value came from. PolicyTrace keeps those ideas connected by carrying field citations for provenance matching, while excluding those citation helpers from the serialized Golden Record.

That is a useful production pattern. The downstream record should stay clean, but the review workflow needs the trace: source document, matched text, page location, conflict status, and review decision.
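One way to carry citations on a record while keeping them out of the serialized output is to mark the helper field and skip it at serialization time. The sketch below uses only standard-library dataclasses; the `Citation` and `PolicyHeader` shapes, the `to_golden_json` helper, and the matched text are illustrative assumptions, not the PolicyTrace models.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Citation:
    """Where a value was found; attribute names are illustrative."""
    document: str
    page: int
    matched_text: str


@dataclass
class PolicyHeader:
    policy_number: Optional[str] = None
    # Review-only helper, marked so the serializer below skips it.
    citations: dict[str, Citation] = field(
        default_factory=dict, metadata={"exclude": True}
    )


def to_golden_json(record) -> str:
    """Serialize only the fields that are not marked as excluded."""
    payload = {
        f.name: getattr(record, f.name)
        for f in fields(record)
        if not f.metadata.get("exclude")
    }
    return json.dumps(payload, sort_keys=True)


header = PolicyHeader(policy_number="SCHED-100")
# The matched text here is made up for the example.
header.citations["policy_number"] = Citation("schedule", 1, "Policy No: SCHED-100")
serialized = to_golden_json(header)
```

The in-memory object still answers "where did this come from?" for the reviewer interface, while the JSON handed downstream contains only canonical values. Pydantic offers a comparable `Field(exclude=True)` option; whether PolicyTrace uses it is not stated in this chapter.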

The Golden Record needs a shadow record.

Not a second source of truth, but a review layer that preserves why the canonical value exists.

Canonical value

The field value that downstream systems can consume after arbitration.

Source evidence

The matched PDF phrase and page location used by the reviewer interface.

Conflict metadata

The losing value and winning rule when documents disagree.

Review state

Verify, reject, or override status created during human review.

Production hardening

Before production, the hierarchy needs ownership and auditability.

The current implementation is deliberately practical: it proves that per-document extraction, arbitration, conflicts, provenance, and review can fit together. A production insurance workflow would need stronger governance around the rules and decisions.

What PolicyTrace demonstrates

1. A concrete Schedule plus Certificate merge path through PolicyArbiter.
2. ConflictEntry records that keep disagreement visible after a winner is chosen.
3. Reviewer actions that can verify, reject, or override field-level output.

What production would add

1. Versioned authority matrices with domain-owner approval and regression tests.
2. Durable audit logs showing source values, selected winners, reviewer identity, and timestamps.
3. Escalation queues for high-impact conflicts, missing evidence, low match quality, and override-heavy fields.
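None of this hardening exists in the current implementation, so the following is only a sketch of the shape it might take: an audit event that records the listed details alongside the authority matrix version, and a simple escalation predicate. Every name, the `"v1"` version label, the set of high-impact fields, and the 0.8 match-score floor are assumptions. Tracking override-heavy fields would need historical counts and is left out.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    field_path: str
    schedule_value: str
    certificate_value: str
    winner: str
    matrix_version: str   # which authority matrix produced the winner
    reviewer_id: str
    action: str           # "verify", "reject", or "override"
    timestamp: str        # ISO 8601, UTC


# Hypothetical set of fields whose conflicts always go to an escalation queue.
HIGH_IMPACT_FIELDS = {"cover_and_excesses.class_of_use", "policy_header.policy_number"}


def needs_escalation(field_path: str, has_evidence: bool, match_score: float,
                     min_score: float = 0.8) -> bool:
    """Escalate high-impact conflicts, missing evidence, or weak text matches."""
    return (field_path in HIGH_IMPACT_FIELDS
            or not has_evidence
            or match_score < min_score)


def audit_line(event: AuditEvent) -> str:
    """One JSON object per line, suitable for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)


event = AuditEvent(
    field_path="cover_and_excesses.class_of_use",
    schedule_value="Social Only",
    certificate_value="Social, Domestic and Pleasure",
    winner="certificate",
    matrix_version="v1",
    reviewer_id="reviewer-001",
    action="verify",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
line = audit_line(event)
```

Recording the matrix version with each event means a later rule change does not make past decisions unexplainable: the log still shows which rules were in force when the winner was chosen.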

Reusable pattern

The same problem appears anywhere documents overlap.

PolicyTrace uses UK motor insurance because the domain makes the problem easy to see. But the pattern is broader: claims files, lending packs, onboarding documents, legal review bundles, compliance evidence, and healthcare intake forms all contain overlapping facts with different levels of authority.

The lesson is simple: do not ask the model to become the source of truth. Ask it to extract candidates, then build a system that can arbitrate, preserve evidence, expose conflicts, and let reviewers handle exceptions.

Next core chapter

Evidence and provenance.

The next post should zoom into source traceability: how canonical values, hidden field citations, Docling geometry, and reviewer-visible PDF highlights work together.
