PolicyTrace system design chapter 01
PolicyTrace layered architecture with evidence and review loops
A production-minded view of PolicyTrace as a reusable Document AI pattern: parse documents, protect sensitive data, bound the model call, preserve evidence, arbitrate conflicts, and keep a human review path.
Layered Architecture Map
The important story is not a straight pipeline. It is a set of responsibility boundaries around parsing, model calls, evidence artifacts, arbitration, and human review.
User / Reviewer
People inspect extracted policy facts, source evidence, conflicts, and corrections.
React Review UI
Upload flow, split-screen review, field focus, source preview, and overrides.
FastAPI / Session
Session state, extraction endpoints, PDF serving, and review updates.
Orchestration / Extraction
Docling parse, classification, specialist extraction, and schema validation.
LLM / Provider
PII-safe prompt construction, Groq and Instructor calls, and typed partial output.
Evidence / Artifacts
Uploaded PDFs, geometry, extracted values, citations, and review state.
Trust / Arbitration
Authority rules, conflict detection, Golden Record assembly, and publish gate.
Boundary legend
PII is removed before the model call.
The provider call is bounded and typed.
Arbitration happens after extraction.
Review decisions write back to state.
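As a rough illustration of the last legend item, the sketch below shows one way a reviewer override could write back to session state while keeping the prior value visible. The `ReviewSession` class, its fields, and the reviewer name are hypothetical and are not taken from the repo; it is an in-memory sketch, not durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewSession:
    """Hypothetical in-memory session state for one review."""
    golden: dict                                   # values produced by arbitration
    overrides: dict = field(default_factory=dict)  # reviewer corrections by field name
    history: list = field(default_factory=list)    # (field, old, new, reviewer) tuples

    def current(self, name):
        # A reviewer override takes precedence over the arbitrated value.
        return self.overrides.get(name, self.golden.get(name))

    def override(self, name, value, reviewer):
        previous = self.current(name)
        self.overrides[name] = value
        self.history.append((name, previous, value, reviewer))


session = ReviewSession(golden={"premium": "1350"})
session.override("premium", "1375", reviewer="alice")
```

Keeping `golden` untouched and layering overrides on top means the original arbitrated value is never lost when a reviewer corrects it.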
Current demo vs production hardening
The map shows what this repo proves without pretending it already has production controls such as auth, RBAC, durable storage, audit logging, monitoring, or deployment gates.
Architecture thesis
PolicyTrace is not a PDF-to-JSON chain. It is a reviewable decision system.
The useful architecture question is not "where does the LLM sit?" It is "where does responsibility change hands?" PolicyTrace becomes interesting when the model is treated as one bounded worker inside a larger system for privacy, evidence, arbitration, and review.
The model produces candidates. The system earns trust.
The diagram separates extraction from confidence. A model response can create a typed partial record, but the final Golden Record should be assembled after source authority, field citations, conflict handling, and reviewer approval are visible.
1. Before the model: parse documents, classify them, and remove configured sensitive values.
2. During the call: ask for typed partial records, not a magical final answer.
3. After the call: preserve artifacts, map citations, arbitrate conflicts, and expose review controls.
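The three phases above can be sketched as a small orchestration function. This is a minimal sketch under stated assumptions: `run_document`, `StageResult`, `demo_mask`, and `demo_extract` are hypothetical names, and the extractor is a stub standing in for the real provider call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StageResult:
    partial: dict                                   # candidate values from the model
    artifacts: dict = field(default_factory=dict)   # evidence kept alongside values


def run_document(text: str,
                 mask: Callable[[str], str],
                 extract: Callable[[str], dict]) -> StageResult:
    # Before the model: remove configured sensitive values.
    safe_text = mask(text)
    # During the call: the extractor only ever sees masked text.
    partial = extract(safe_text)
    # After the call: preserve what was sent so reviewers can inspect it.
    return StageResult(partial=partial, artifacts={"prompt_text": safe_text})


def demo_mask(text: str) -> str:
    # Stub masker for a single hard-coded value (illustration only).
    return text.replace("123-45-6789", "[SSN]")


def demo_extract(text: str) -> dict:
    # Stub extractor standing in for a provider call.
    return {"policy_number": "PN-001" if "PN-001" in text else None}


result = run_document("Policy PN-001, SSN 123-45-6789", demo_mask, demo_extract)
```

Passing the masker and extractor in as callables keeps the ordering testable without a live provider.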
Why the layered view matters
The architecture is strongest where it refuses to hide the uncomfortable parts.
Document AI demos often compress everything into a single arrow from PDF to JSON. That hides the failures that matter in production: private data crosses a provider boundary, source evidence gets lost, documents disagree with no rule for which one wins, and nobody knows whether a reviewer corrected the answer.
Privacy is before the provider
PII masking is shown before the Groq/Instructor lane, so privacy is a system boundary rather than a cleanup step after extraction.
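A minimal sketch of configured masking, assuming a simple regex-based approach. The source does not describe how PolicyTrace masks values, so the `PII_PATTERNS` table and `mask_pii` function are hypothetical, and real detection would need far more than two patterns.

```python
import re

# Hypothetical configuration: placeholder label -> pattern to mask.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def mask_pii(text, patterns=PII_PATTERNS):
    """Replace each configured pattern with a placeholder and count the hits."""
    counts = {}
    for label, pattern in patterns.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


masked, counts = mask_pii("Insured SSN 123-45-6789, contact jane@example.com.")
```

Returning the counts alongside the masked text gives the system something to record about what was removed before the prompt is built.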
Extraction is typed work
Docling, classification, specialist prompts, and Pydantic validation each have a role. The model does not own the whole workflow.
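To illustrate what a typed partial record might look like, here is a dependency-free sketch. The project uses Pydantic for this step; the version below substitutes standard-library dataclasses so it runs anywhere. The field names and the `parse_partial` helper are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PartialPolicyRecord:
    # Every field is optional: one document rarely contains everything.
    policy_number: Optional[str] = None
    insured_name: Optional[str] = None
    premium: Optional[float] = None


def parse_partial(raw: dict) -> PartialPolicyRecord:
    """Validate raw model output into a typed partial record."""
    allowed = {f.name for f in fields(PartialPolicyRecord)}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    premium = raw.get("premium")
    if premium is not None:
        premium = float(premium)  # raises ValueError for non-numeric text
    return PartialPolicyRecord(
        policy_number=raw.get("policy_number"),
        insured_name=raw.get("insured_name"),
        premium=premium,
    )


record = parse_partial({"policy_number": "PN-001", "premium": "1200.50"})
```

Rejecting unknown fields and coercing types at this boundary means later layers never have to guess what shape the model returned.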
Evidence has its own path
Uploaded PDFs, geometry, partial records, and field citations move alongside values, giving the reviewer something inspectable.
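One possible shape for a field citation is sketched below. The schema is an assumption for illustration: the `FieldCitation` fields, the bounding-box convention, and the `to_citations_json` helper are hypothetical and are not the project's actual artifact format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FieldCitation:
    field: str       # name of the extracted field
    value: str       # value as extracted
    document: str    # source file name
    page: int        # 1-based page number
    bbox: tuple      # (x0, y0, x1, y1) in page coordinates


def to_citations_json(citations):
    """Serialize citations so they can be stored next to the extracted values."""
    return json.dumps([asdict(c) for c in citations], indent=2)


citation = FieldCitation(
    field="premium",
    value="1350",
    document="endorsement.pdf",
    page=3,
    bbox=(72.0, 540.0, 310.0, 556.0),
)
payload = to_citations_json([citation])
```

With a page number and bounding box attached to each value, a review UI has enough to highlight the exact source region.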
Trust is outside the model
PolicyArbiter, conflict detection, Golden Record assembly, and the review gate happen after model output exists.
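The sketch below shows one way authority rules, conflict detection, and a publish gate could fit together. It does not reproduce PolicyArbiter's actual rules: the `AUTHORITY` ranking, the `Candidate` type, and both functions are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical authority ranking by document type: higher wins.
AUTHORITY = {"endorsement": 3, "declarations": 2, "application": 1}


@dataclass
class Candidate:
    field: str
    value: str
    doc_type: str


def arbitrate(candidates):
    """Pick the most authoritative value per field and record disagreements."""
    by_field = {}
    for c in candidates:
        by_field.setdefault(c.field, []).append(c)

    golden, conflicts = {}, {}
    for name, group in by_field.items():
        winner = max(group, key=lambda c: AUTHORITY.get(c.doc_type, 0))
        golden[name] = winner.value
        values = {c.value for c in group}
        if len(values) > 1:
            conflicts[name] = sorted(values)  # keep every competing value visible
    return golden, conflicts


def can_publish(conflicts, reviewed_fields):
    # Gate: every conflicting field must have been seen by a reviewer.
    return all(name in reviewed_fields for name in conflicts)


golden, conflicts = arbitrate([
    Candidate("premium", "1200", "application"),
    Candidate("premium", "1350", "endorsement"),
    Candidate("insured_name", "Jane Doe", "declarations"),
])
```

Here the winning value and the conflict list are produced together, so a field can be resolved by rule and still be flagged for review before publishing.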
How to read the map
Read each layer as a contract, not just a component group.
The diagram is intentionally layered because each row has a different ownership question. That makes the system easier to reason about, test, and harden later.
Reviewers need a surface for inspecting evidence, correcting fields, and understanding conflicts.
human review boundary
FastAPI coordinates uploads, extraction calls, document serving, and review updates.
session state, not enterprise storage
Docling parse, document classification, field extraction, and schema validation narrow the task before the model is trusted.
bounded specialist work
Source text, page geometry, partial records, citations, and decisions remain inspectable alongside values.
field_citations.json
The Golden Record is produced after authority rules and conflict handling, then passed to a reviewer-facing gate.
PolicyArbiter plus review
Implementation honesty
Show what the repo proves, and name what production would still require.
This is the difference between a credible architecture note and a sales diagram. The current project demonstrates a pattern. It does not claim to be an enterprise document platform.
Current project proves
1. Multi-PDF ingestion with Docling-based parsing and document classification.
2. Configured PII masking before structured Groq/Instructor extraction calls.
3. Pydantic partial records, PolicyArbiter rules, field citations, conflicts, and review UI.
Production hardening still needs
1. Authentication, RBAC, tenant-aware storage, and durable sessions.
2. Persistent artifact storage, audit logging, and retention controls.
3. Evaluation suites, monitoring, cost controls, deployment gates, and operational runbooks.