Safeguard Work-Order Agent Ecosystem

Verification Report

Strictness: IRS_AUDITOR | Proof status: external live infrastructure verified (external PostgreSQL server + Go gRPC service, restart/reconnect); inbound corpus is synthetic

Checks: PASS 23 / 23 (100%) | Evaluated on: synthetic corpus over external PostgreSQL + external Go gRPC | Generated: 2026-06-25T23:11:13.269Z

Infrastructure mode: external-postgres+external-grpc (database engine postgres, gRPC 127.0.0.1:50051, restart/reconnect tested: true).

Disclosed seams & limitations

  • SIMULATED INPUT: Inbound work orders are synthetic and seeded (src/synth.mjs) with ground-truth labels. Reported accuracy is against that synthetic answer key, not Safeguard production data; absolute accuracy on real text will differ. This is the blocking gap for PRODUCTION_VALIDATED (no official/real benchmark, no independent reproduction, no external validation).
  • DISCLOSED_SEAM: Persistence targets PostgreSQL. The Oracle path (named in the brief) is not implemented; an Oracle adapter behind the same repository interface would be required for an Oracle deployment.
  • DISCLOSED_SEAM: The classifier is a deterministic lexicon model, NOT a hosted LLM. The production design swaps an LLM behind the same interface (specs/agent-classifier.md); that swap is unverified here.
  • DISCLOSED_SEAM: No identity/auth, RBAC, tenant isolation, TLS, or security/compliance testing was performed on the HTTP/gRPC surfaces.
  • DISCLOSED_SEAM: The React console (public/console.html) reimplements the agent heuristics client-side for demonstration; the verified system of record is the Node pipeline under src/.
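The classifier seam above (a deterministic lexicon model that an LLM could later replace behind the same interface) can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the `Classifier` protocol, the category names, and the keyword lists are hypothetical and are not taken from src/ or specs/agent-classifier.md.

```python
from typing import Protocol


class Classifier(Protocol):
    """Hypothetical interface; a hosted-LLM adapter would implement the same method."""

    def classify(self, text: str) -> str: ...


# Hypothetical lexicon: category -> trigger keywords (illustrative only).
LEXICON = {
    "PLUMBING": ("leak", "pipe", "faucet"),
    "ELECTRICAL": ("outlet", "breaker", "wiring"),
}


class LexiconClassifier:
    """Deterministic keyword counter: same text in, same category out."""

    def classify(self, text: str) -> str:
        words = [w.strip(".,;:!?") for w in text.lower().split()]
        scores = {
            category: sum(word in keywords for word in words)
            for category, keywords in LEXICON.items()
        }
        best = max(scores, key=scores.get)  # ties resolve by lexicon order
        return best if scores[best] > 0 else "UNKNOWN"


def route(classifier: Classifier, text: str) -> str:
    # Callers depend only on the interface, so the implementation can be swapped.
    return classifier.classify(text)
```

Because callers such as `route` see only the interface, replacing the lexicon model would not change call sites. Whether a replacement preserves the reported accuracy is exactly what this report leaves unverified.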

What is verified

The four agents run end-to-end against an external PostgreSQL server (over TCP) and an external Go gRPC dispatch service (over the wire). Beyond accuracy and safety behaviour, the suite asserts the live HTTP API, live persistence, dispatch idempotency across the wire, malformed-request rejection by the Go server, and resilience: the client reconnects after the Go service restarts and data survives an external PostgreSQL server restart. Inbound volume is synthetic.
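The idempotency property asserted here (a replayed dispatch must not create a second record) can be illustrated with a small in-memory sketch. It is not the Go service's implementation: the `DispatchStore` class, the use of the order id as the idempotency key, and the `DSP-` reference format are assumptions made for the example.

```python
import uuid


class DispatchStore:
    """In-memory stand-in for a dispatch table keyed by an idempotency key."""

    def __init__(self) -> None:
        self._refs: dict[str, str] = {}

    def dispatch(self, order_id: str) -> str:
        # A replay of an already-dispatched order returns the stored reference
        # instead of creating a new record.
        if order_id not in self._refs:
            self._refs[order_id] = f"DSP-{uuid.uuid4().hex[:8]}"
        return self._refs[order_id]

    def count(self) -> int:
        return len(self._refs)


store = DispatchStore()
first_ref = store.dispatch("WO-100000")
replay_ref = store.dispatch("WO-100000")  # simulated retry after a reconnect
```

In this sketch `first_ref` and `replay_ref` are the same reference and `store.count()` stays at 1, which mirrors the "records 373->373" style of assertion in the checks. A durable version would need the key enforced by the database (for example a unique constraint) so that the guarantee survives a service restart.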

Benchmark (synthetic corpus)

Metric | Value
Orders processed | 600
Infrastructure mode | external-postgres+external-grpc
Classification accuracy | 98.9%
Priority accuracy | 96.4%
Region-routing accuracy | 100.0%
Exception precision / recall / F1 | 0.9736 / 1 / 0.9866
False-auto-action rate | 0.00%
Automatic-action rate | 62.2%
Human-in-the-loop rate | 30.8%
End-to-end processing | 54005 ms for 600 orders
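The exception-detection figures can be re-derived from the confusion counts reported in the Checks section (tp=221, fn=0, fp=6). The sketch below uses the standard definitions of precision, recall, and F1; the function name `precision_recall_f1` is introduced here for illustration and is not from the pipeline.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard definitions computed from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts as reported in the Checks section of this report.
precision, recall, f1 = precision_recall_f1(tp=221, fp=6, fn=0)
print(round(precision, 4), round(recall, 4), round(f1, 4))  # 0.9736 1.0 0.9866

# Mean per-order latency implied by the end-to-end figure.
ms_per_order = 54005 / 600
print(round(ms_per_order, 1))  # 90.0
```

The rounded values match the table: precision is 221/227, F1 is 442/448, and the end-to-end time works out to roughly 90 ms per order.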

Persistence (real DB row counts)

Table | Rows
work_orders | 600
audit_log | 600
dispatch_records | 373

Disposition counts

Disposition | Count
AUTO_DISPATCH | 373
HUMAN_EXCEPTION | 185
REJECTED | 42
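As a cross-check, the disposition counts can be reconciled against the total volume and converted to the rates quoted in the benchmark. This is a small arithmetic sketch over the numbers in this report, not code from the orchestrator.

```python
# Disposition counts as reported above.
dispositions = {"AUTO_DISPATCH": 373, "HUMAN_EXCEPTION": 185, "REJECTED": 42}

total = sum(dispositions.values())

# Percentage share of each disposition, rounded to one decimal place.
rates = {name: round(100 * count / total, 1) for name, count in dispositions.items()}

# Orders that did not proceed automatically.
non_auto = dispositions["HUMAN_EXCEPTION"] + dispositions["REJECTED"]

print(total)     # 600
print(rates)     # {'AUTO_DISPATCH': 62.2, 'HUMAN_EXCEPTION': 30.8, 'REJECTED': 7.0}
print(non_auto)  # 227
```

The 62.2% and 30.8% shares agree with the automatic-action and human-in-the-loop rates in the benchmark. The 227 non-automatic orders also equal tp + fp (221 + 6) from the exception-detection check, which is consistent with every flagged order ending up held or rejected, although the report does not state that mapping explicitly.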

Checks

Check | Detail | Result
HTTP API: POST /work-orders ingests, persists, and is readable via GET (+audit) | health.ok, action=AUTO_DISPATCH, audit=1 | PASS
Classifier: category accuracy >= 0.90 on resolvable orders | accuracy=0.9887 over 530 orders | PASS
Classifier: priority accuracy >= 0.75 | accuracy=0.9642 | PASS
Router: region resolved correctly >= 0.99 where a zone exists | accuracy=1 | PASS
Safety: exception-detection recall >= 0.95 | recall=1 (tp=221, fn=0) | PASS
Safety: false-auto-action rate <= 0.02 | rate=0 (0/600) | PASS
Quality: exception-detection precision >= 0.90 | precision=0.9736 (fp=6) | PASS
Outcome: automatic-action rate >= 0.55 | autoActionRate=0.6217 (373/600) | PASS
Validator: 100% of missing-location orders blocked from auto-dispatch | 42/42 blocked | PASS
Validator: duplicate resubmissions detected via durable fingerprint query | 61/61 caught | PASS
Validator: over-cost-limit orders held for human approval | 48/48 held | PASS
External Go gRPC: dispatch service reachable over the wire (Health RPC) | health.ok=true @ 127.0.0.1:50051 | PASS
Persistence: every order persisted to external PostgreSQL work_orders | work_orders=600 == total 600 | PASS
Persistence: append-only audit_log has one row per order | audit_log=600 == total 600 | PASS
Persistence: dispatch_records == auto-dispatched count, all with refs | dispatch_records=373, refs=373, auto=373 | PASS
External Go gRPC: dispatch idempotent over the wire (no double-dispatch) | replays=50/50; dispatch_records 373->373 | PASS
External Go gRPC: malformed DispatchRequest rejected by the server | ok=false error=INVALID_PAYLOAD: does not match DispatchRequest | PASS
Persistence: record + audit trail readable back from the DB for a sample order | id=WO-100000 status=AUTO_DISPATCH audit=1 | PASS
Orchestrator: dispositions reconcile to total volume | 373+185+42=600==600 | PASS
Performance: 600 orders over external gRPC + Postgres in < 120000 ms | 54005 ms for 600 orders | PASS
Resilience: client reconnects after the Go gRPC service restarts (idempotency holds) | healthy=true, replay=true, refMatch=true, records 373->373 | PASS
Resilience: data durable across an external PostgreSQL server restart + reconnect | after restart+reconnect work_orders=600, dispatch_records=373, audit_log=600 | PASS
Reproducible: same seed -> identical metrics (independent stack) | classification 0.9887==0.9887, auto 0.6217==0.6217 | PASS
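Several of the checks above are plain threshold comparisons, and the gating logic can be sketched in a few lines. The values and limits below are copied from the table; the `CHECKS` structure and the `evaluate` helper are hypothetical and only illustrate how such a gate might be expressed, not how the actual suite is written.

```python
import operator

# (name, observed value, comparison, limit), taken from the numeric checks above.
CHECKS = [
    ("category accuracy", 0.9887, operator.ge, 0.90),
    ("priority accuracy", 0.9642, operator.ge, 0.75),
    ("region-routing accuracy", 1.0, operator.ge, 0.99),
    ("exception recall", 1.0, operator.ge, 0.95),
    ("false-auto-action rate", 0.0, operator.le, 0.02),
    ("exception precision", 0.9736, operator.ge, 0.90),
    ("automatic-action rate", 0.6217, operator.ge, 0.55),
    ("processing time (ms)", 54005, operator.lt, 120000),
]


def evaluate(checks):
    """Return (name, 'PASS' | 'FAIL') for each threshold check."""
    return [
        (name, "PASS" if compare(value, limit) else "FAIL")
        for name, value, compare, limit in checks
    ]


results = evaluate(CHECKS)
passed = sum(1 for _, outcome in results if outcome == "PASS")
print(f"PASS {passed} / {len(results)}")  # PASS 8 / 8
```

Note the asymmetry in the limits: the safety checks (recall, false-auto-action rate) sit close to the ideal value, while the outcome and priority checks leave more headroom. The remaining checks in the table (persistence, idempotency, resilience, reproducibility) are equality or behavioural assertions and do not fit this simple threshold form.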