Safeguard Work-Order Agent Ecosystem

Auditor Objections

Pre-written hostile objections and the evidence-backed response to each. The Proof Layer also generates AUDITOR_CHALLENGE.md independently.

Objection 1 — "Your accuracy is meaningless because the data is synthetic."

Response. Correct: the data is synthetic, and we disclose that everywhere (LIMITATIONS.md, verification-report.json#/disclosedSeams). We do not claim real-data accuracy. What the synthetic corpus *does* prove is that the agent logic, confidence calibration, safety routing, and seam contracts behave correctly against a known answer key, and that the run is fully reproducible (seed 42).

Objection 2 — "You call them AI agents but there's no LLM."

Response. Engine v1 is a deterministic lexicon classifier, disclosed as such (specs/agent-classifier.md, seam #3). It is an *agent* in the architectural sense (autonomous classify/route/validate/action with confidence + audit). The LLM engine is a documented seam behind the same interface, not a hidden claim.
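
To make "deterministic lexicon classifier behind a shared interface" concrete, here is a minimal sketch. It is not the code from specs/agent-classifier.md: the lexicon contents, the `lexiconClassify` name, and the `{ category, confidence }` return shape are all assumptions chosen for illustration.

```javascript
// Hypothetical lexicon; the real one is defined by the project, not here.
const LEXICON = {
  plumbing: ["leak", "pipe", "drain"],
  electrical: ["outlet", "breaker", "wiring"],
  hvac: ["thermostat", "furnace", "vent"],
};

// Assumed engine interface: classify(text) -> { category, confidence }.
// Any engine (lexicon or LLM) could sit behind the same signature.
function lexiconClassify(text) {
  const tokens = text.toLowerCase().match(/[a-z]+/g) || [];
  const hits = {};
  let total = 0;
  for (const [category, words] of Object.entries(LEXICON)) {
    hits[category] = tokens.filter((t) => words.includes(t)).length;
    total += hits[category];
  }
  // No lexicon match at all: report zero confidence instead of guessing.
  if (total === 0) return { category: null, confidence: 0 };
  // Stable sort keeps lexicon order on ties, so the result is deterministic.
  const [category, count] = Object.entries(hits).sort((a, b) => b[1] - a[1])[0];
  // Confidence here is the winning category's share of all lexicon hits.
  return { category, confidence: count / total };
}
```

The same input always yields the same output, which is what lets the result be checked against an answer key.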

Objection 3 — "Auto-dispatching work orders is dangerous."

Response. That is exactly why the validator is conservative and why we measure the false-auto-action rate (0.0%) and exception recall (1.0) as MUST_PASS safety checks. Anything uncertain, over-budget, duplicated, or missing-location is held for a human. No order is ever silently dropped.
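
The hold rules and the two safety metrics can be sketched as follows. The threshold, the field names (`confidence`, `estimate`, `budget`, `isDuplicate`, `location`, `isException`), and the metric definitions are assumptions for illustration. In particular, the false-auto-action rate is taken here to be the share of auto-actioned orders that the answer key marks as exceptions, which is one plausible reading and not necessarily the one verify.mjs uses.

```javascript
// Hypothetical confidence threshold, chosen only for this sketch.
const MIN_CONFIDENCE = 0.8;

// Collect every reason an order must be held for a human.
function holdReasons(order) {
  const reasons = [];
  if (order.confidence < MIN_CONFIDENCE) reasons.push("uncertain");
  if (order.estimate > order.budget) reasons.push("over-budget");
  if (order.isDuplicate) reasons.push("duplicate");
  if (!order.location) reasons.push("missing-location");
  return reasons;
}

// Every order receives exactly one disposition, so none is silently dropped.
function validate(order) {
  const reasons = holdReasons(order);
  const disposition = reasons.length === 0 ? "AUTO_ACTION" : "HOLD_FOR_HUMAN";
  return { disposition, reasons };
}

// Score the validator against an answer key carried on each order as `isException`.
function safetyMetrics(orders) {
  let autoActioned = 0;
  let falseAuto = 0;
  let exceptions = 0;
  let caught = 0;
  for (const order of orders) {
    const held = validate(order).disposition === "HOLD_FOR_HUMAN";
    if (!held) autoActioned++;
    if (order.isException) {
      exceptions++;
      if (held) caught++;
      else falseAuto++;
    }
  }
  return {
    falseAutoActionRate: autoActioned === 0 ? 0 : falseAuto / autoActioned,
    exceptionRecall: exceptions === 0 ? 1 : caught / exceptions,
  };
}
```

Under these definitions, a conservative validator trades automation rate for safety: an order with any doubt attached is held, which keeps the false-auto-action rate at zero as long as the rules cover every exception type in the answer key.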

Objection 4 — "The gRPC/DB integration isn't real, so this doesn't work."

Response (external live). Persistence is a separate PostgreSQL 16 server reached over TCP, and dispatch crosses a real gRPC/HTTP2 wire to a separate Go service (dispatch-service/, built from proto/dispatch.proto) running in its own container. verify.mjs asserts real row counts in work_orders/audit_log/dispatch_records, idempotent replay across the wire, server-side rejection of malformed requests, the live HTTP API, client reconnect after the Go service restarts, and data durability across a full external Postgres server restart. Nothing about the transport or SQL engine is simulated.
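
For readers unfamiliar with the terms, the sketch below shows what "idempotent replay" and "server-side rejection of malformed requests" mean as properties. It is an in-memory stand-in for explanation only and is not part of the system: the real checks run against the Go service over gRPC. The `DispatchStub` class, the `workOrderId` key, and the `dispatchId` format are hypothetical.

```javascript
// Illustration only: an in-memory stand-in, NOT the real Go gRPC service.
class DispatchStub {
  constructor() {
    // Records keyed by work-order id, used here as the idempotency key (assumption).
    this.records = new Map();
  }

  dispatch(request) {
    // Malformed requests are rejected before any state changes.
    if (!request || typeof request.workOrderId !== "string" || request.workOrderId === "") {
      throw new Error("INVALID_ARGUMENT: workOrderId is required");
    }
    // A replay of an already-dispatched order returns the original record.
    const existing = this.records.get(request.workOrderId);
    if (existing) return { ...existing, replayed: true };
    const record = {
      dispatchId: `d-${this.records.size + 1}`,
      workOrderId: request.workOrderId,
    };
    this.records.set(request.workOrderId, record);
    return { ...record, replayed: false };
  }
}
```

Sending the same request twice yields the same `dispatchId` and leaves exactly one stored record, which is the property a row-count assertion after replay would detect.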

Objection 7 — "Then why isn't this `PRODUCTION_VALIDATED`?"

Response (honest). Because the IRS_AUDITOR standard reserves that state for CERTIFIED plus an official benchmark, independent reproduction, and external validation. The inbound corpus here is synthetic, so there is no official benchmark and no third party has reproduced or validated the results on real data. The Proof Layer therefore keeps the state at CERTIFIED. The exact steps to reach PRODUCTION_VALIDATED are listed in LIMITATIONS.md.

Objection 5 — "62.6% automation isn't 'end-to-end'."

Response. "End-to-end" means every order traverses all four agents and reaches a terminal disposition, and the reconciliation check confirms that it does. 62.6% are actioned with zero human touch; the remaining 37.4% are *correctly* routed to humans because they are genuine exceptions (ambiguous, over-budget, duplicate, missing data). Reducing human work to exception handling is the stated goal.
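
A reconciliation check of this kind can be sketched as below. The state names, the `reconcile` function, and its input shape are assumptions for illustration, not the project's actual check.

```javascript
// Hypothetical terminal states; the project's real state names may differ.
const TERMINAL_STATES = new Set(["AUTO_ACTIONED", "HELD_FOR_HUMAN"]);

// inboundIds: array of order ids received.
// dispositions: Map of order id -> final state recorded by the pipeline.
function reconcile(inboundIds, dispositions) {
  // Any inbound order without a terminal state counts as unaccounted for.
  const missing = inboundIds.filter((id) => !TERMINAL_STATES.has(dispositions.get(id)));
  const auto = inboundIds.filter((id) => dispositions.get(id) === "AUTO_ACTIONED").length;
  const held = inboundIds.filter((id) => dispositions.get(id) === "HELD_FOR_HUMAN").length;
  return {
    // Reconciled only if auto + held accounts for every inbound order.
    reconciled: missing.length === 0 && auto + held === inboundIds.length,
    missing,
    automationRate: inboundIds.length === 0 ? 0 : auto / inboundIds.length,
  };
}
```

The automation rate and the held rate must sum to 100% whenever `reconciled` is true, which is why 62.6% and 37.4% together account for the whole corpus.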

Objection 6 — "How do I know you didn't hand-tune the numbers?"

Response. verify.mjs computes everything from the seeded corpus at run time; nothing is hard-coded. Re-run it (node verify.mjs): the "same seed → identical metrics" check and proof/CHECKSUMS.json together confirm integrity. The Proof Layer also re-runs verification itself and re-derives the claims.
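
The idea behind a "same seed → identical metrics" check can be sketched as follows. This is not the code in verify.mjs: the PRNG (the widely used mulberry32 construction), the `computeMetrics` function, the corpus size, and the 0.8 threshold are all assumptions chosen for illustration.

```javascript
// Small seeded PRNG (mulberry32). Assumption: the real corpus generator may use another.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical metric derived entirely from the seeded stream at run time.
function computeMetrics(seed, size = 200) {
  const rand = mulberry32(seed);
  let auto = 0;
  for (let i = 0; i < size; i++) {
    const confidence = rand(); // synthetic confidence in [0, 1)
    if (confidence >= 0.8) auto++;
  }
  return { seed, size, automationRate: auto / size };
}

// Two independent runs with the same seed must serialize to identical output.
function sameSeedCheck(seed) {
  return JSON.stringify(computeMetrics(seed)) === JSON.stringify(computeMetrics(seed));
}
```

Because no value is stored between runs, a metric that had been edited by hand would not survive a fresh recomputation from the seed.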