Proof Report
Certification Report — Work-Order Agent Ecosystem
Strictness: IRS_AUDITOR
The certification state, Evidence Grade, and Trust Score are assigned by the
Forge Proof Layer, not by this document. The authoritative records are
proof/PROOF_DECISION.json, proof/PROOF_SCORECARD.json, proof/EVIDENCE_GRADE.md,
and proof/TRUST_SCORE.json. This report only
summarizes the basis for that decision.
Proof Layer decision (authoritative)
The Proof Layer assigned the certification state, Evidence Grade, and Trust Score recorded in proof/PROOF_DECISION.json, on the basis of: 23/23 external live verification checks passing (external PostgreSQL server + Go gRPC service + HTTP API + restart/reconnect), 0 unsupported claims, all 10 IRS_AUDITOR questions answered, a complete audit trail, and a synthetic benchmark with every claim traced to evidence.
This report distinguishes four things
1. Verified live infrastructure (real, external, exercised)
- Four agents run end-to-end against a separate PostgreSQL 16 server (TCP)
and a separate Go gRPC dispatch service (the wire) — verify.mjs, 23/23.
- Live HTTP API (ingest → persist → read-back + audit).
- Live persistence: every order in work_orders, one audit_log row each,
dispatch_records matching auto-dispatches.
- gRPC idempotency over the wire; malformed requests rejected by the Go server.
- Resilience: client reconnects after the Go service restarts; data durable
across an external Postgres server restart + reconnect.
- Safety: exception recall 1.0, false-auto-action rate 0.0%. Deterministic (seed 42).
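The two safety figures in the last bullet can be read as simple ratios over labeled outcomes. The sketch below is illustrative only: the `Outcome` type, its field names, and the choice of denominator for the false-auto-action rate (automatic actions taken on orders that should have been exceptions, divided by all automatic actions) are assumptions, not definitions taken from verify.mjs or the proof artifacts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One evaluated work order (hypothetical shape, for illustration)."""
    is_exception: bool  # ground truth: the order needs human review
    flagged: bool       # the agent routed it to the exception path
    auto_acted: bool    # the agent took an automatic action on it


def exception_recall(outcomes):
    """Fraction of true exceptions that the agent flagged."""
    exceptions = [o for o in outcomes if o.is_exception]
    if not exceptions:
        return 1.0  # nothing to miss
    return sum(o.flagged for o in exceptions) / len(exceptions)


def false_auto_action_rate(outcomes):
    """Fraction of automatic actions taken on orders that were exceptions.

    Assumed definition; the report does not state the denominator.
    """
    auto = [o for o in outcomes if o.auto_acted]
    if not auto:
        return 0.0  # no automatic actions, so none can be false
    return sum(o.is_exception for o in auto) / len(auto)


if __name__ == "__main__":
    sample = [
        Outcome(is_exception=True, flagged=True, auto_acted=False),
        Outcome(is_exception=False, flagged=False, auto_acted=True),
    ]
    print(exception_recall(sample), false_auto_action_rate(sample))
```

Under these assumed definitions, a recall of 1.0 means no true exception went unflagged, and a rate of 0.0% means no automatic action touched an order that should have been held.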
2. Synthetic-data limitations
Agent accuracy (98.87% classification, etc.) is measured on a synthetic seeded corpus with a ground-truth answer key — not real Safeguard data. No claim is made about accuracy on real work orders.
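To make "synthetic seeded corpus with a ground-truth answer key" concrete, here is a toy sketch of the pattern: a fixed seed produces the same corpus on every run, each item carries its true label, and accuracy is the share of predictions that match the key. The categories, text template, and keyword classifier are hypothetical placeholders and are not the project's corpus or lexicon; the accuracy this toy produces has no relation to the 98.87% figure.

```python
import random

# Hypothetical categories, chosen only for illustration.
LABELS = ("plumbing", "electrical", "hvac")


def make_corpus(seed, n):
    """Build n synthetic work orders; the same seed yields the same corpus."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    corpus = []
    for i in range(n):
        label = rng.choice(LABELS)
        corpus.append({"id": i, "text": f"{label} issue #{i}", "label": label})
    return corpus


def classify(text):
    """Toy keyword lookup standing in for a deterministic lexicon classifier."""
    for label in LABELS:
        if label in text:
            return label
    return "unknown"


def accuracy(corpus, predict):
    """Share of items whose predicted label equals the answer-key label."""
    if not corpus:
        raise ValueError("corpus is empty")
    correct = sum(predict(item["text"]) == item["label"] for item in corpus)
    return correct / len(corpus)


if __name__ == "__main__":
    corpus = make_corpus(seed=42, n=100)
    print(f"accuracy on toy corpus: {accuracy(corpus, classify):.4f}")
```

Because the generator is seeded and the classifier has no randomness, two runs produce identical numbers, which is the property that lets a third party reproduce a reported figure exactly. It says nothing about how the classifier behaves on real work orders.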
3. Remaining production seams
| Seam | Status |
|---|---|
| Inbound work-order data | synthetic seeded corpus only |
| Oracle persistence | not implemented (Postgres verified); needs an Oracle adapter |
| LLM classifier | not implemented (deterministic lexicon stand-in) |
| Security (auth/RBAC/tenant isolation/TLS/mTLS) | not implemented or tested |
| HA / multi-node / load / soak | not tested (single host) |
Full list: proof/LIMITATIONS.md.
4. What would be required for customer deployment / PRODUCTION_VALIDATED
The IRS_AUDITOR standard reserves PRODUCTION_VALIDATED for CERTIFIED plus an official benchmark, independent reproduction, and external validation — none of which exist here (the data is synthetic). To get there: a labeled real Safeguard dataset + re-measured benchmark; **independent third-party reproduction; external validation; auth/RBAC/TLS + security review**; the Oracle adapter if required; an LLM-classifier decision; and HA/load/soak testing. See proof/LIMITATIONS.md.
Evidence index
- verification-report.json/.md: the 23 checks + benchmark.
- proof/EXECUTIVE_EVIDENCE.md: the ten IRS_AUDITOR questions answered.
- proof/CLAIM_EVIDENCE.json: every claim → source/method/artifact.
- proof/EXECUTION_TRACE.json: every command → exit code + output hash.
- proof/CHECKSUMS.json: sha256 of every shipped file.
- proof/LIMITATIONS.md / proof/AUDITOR_OBJECTIONS.md: seams + objections.
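A reader who wants to check shipped files against a sha256 manifest could do so along these lines. This is a minimal sketch that assumes the manifest is a flat JSON object mapping relative paths to hex digests; the actual layout of proof/CHECKSUMS.json is not described in this report and may differ, and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the hex sha256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(manifest_path, root):
    """Return the sorted relative paths that are missing or do not match.

    Assumes a manifest of the form {"relative/path": "<hex digest>", ...}.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    mismatches = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(rel_path)
    return sorted(mismatches)
```

An empty result means every listed file exists and hashes to the recorded value. Files that are present but not listed in the manifest are not detected by this sketch.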
Reproduction
See proof/REPRODUCE.md. A stranger can run node verify.mjs, reproduce every number, and confirm the disclosed seams.