Safeguard Work-Order Agent Ecosystem

VERIFY — Work-Order Agent Ecosystem

How verification runs and what each check asserts. Source: verify.mjs.

How it runs

In external mode (DATABASE_URL + DISPATCH_GRPC_URL set):

  1. Connects to an external PostgreSQL server (TCP) and an **external Go gRPC dispatch service** (the wire); truncates tables for a clean run.
  2. Exercises the HTTP API (POST ingest → persist → GET read-back + audit).
  3. generateCorpus({ n: 600, seed: 42 }) builds a deterministic labeled corpus.
  4. processBatch runs all four agents end-to-end: duplicate detection is a durable DB query; dispatch crosses the gRPC wire to the Go service; all state persists in Postgres.
  5. Asserts accuracy/safety, live persistence, idempotency over the wire, malformed rejection, and resilience (restart the Go service → reconnect; restart Postgres → durable + reconnect).
  6. Re-runs the pipeline on an independent in-process stack to prove determinism.
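The determinism claim rests on the corpus being a pure function of its seed. The sketch below illustrates that idea only; it is not the generateCorpus in verify.mjs. The `mulberry32` generator, the `generateCorpusSketch` name, and the field names and label sets are all assumptions made for illustration.

```javascript
// Hypothetical sketch: a small seeded PRNG (mulberry32) so that the same
// seed always yields the same sequence of pseudo-random numbers.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative label sets; the real corpus labels are not shown in this document.
const CATEGORIES = ["plumbing", "electrical", "hvac"];
const REGIONS = ["north", "south", "east", "west"];

// Builds n labeled work orders using only the seeded generator,
// never Math.random() or the clock, so output depends on (n, seed) alone.
function generateCorpusSketch({ n, seed }) {
  const rand = mulberry32(seed);
  const pick = (items) => items[Math.floor(rand() * items.length)];
  const corpus = [];
  for (let i = 0; i < n; i++) {
    corpus.push({
      id: `WO-${String(i + 1).padStart(4, "0")}`,
      category: pick(CATEGORIES),
      region: pick(REGIONS),
      needsHuman: rand() < 0.3, // assumed share of exception cases
    });
  }
  return corpus;
}

const first = generateCorpusSketch({ n: 600, seed: 42 });
const second = generateCorpusSketch({ n: 600, seed: 42 });

// Same seed, same corpus: serializing both runs gives identical strings.
console.log(JSON.stringify(first) === JSON.stringify(second)); // true
```

Comparing serialized output is a blunt but convenient equality check for a sketch; a real determinism check would more likely compare the computed metrics from each run.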

In fallback mode (no environment variables set), the same logic runs against an in-process PGlite database and a Node gRPC server, plus an on-disk close/reopen durability check.
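The choice between the two modes can be pictured as a single check on the environment. This is a minimal sketch, not the logic in verify.mjs: the `selectMode` helper and its return shape are hypothetical, and treating a partially configured environment as fallback is an assumption the source does not confirm.

```javascript
// Hypothetical helper: pick the verification mode from environment variables.
// External mode requires BOTH DATABASE_URL and DISPATCH_GRPC_URL.
function selectMode(env) {
  const hasDb = Boolean(env.DATABASE_URL);
  const hasGrpc = Boolean(env.DISPATCH_GRPC_URL);
  if (hasDb && hasGrpc) {
    return {
      mode: "external",
      databaseUrl: env.DATABASE_URL,
      dispatchGrpcUrl: env.DISPATCH_GRPC_URL,
    };
  }
  // Assumption: anything short of both variables falls back to the
  // in-process stack rather than failing.
  return { mode: "fallback", databaseUrl: null, dispatchGrpcUrl: null };
}

console.log(selectMode({}).mode); // "fallback"
console.log(
  selectMode({
    DATABASE_URL: "postgres://user@localhost:5432/verify", // placeholder URL
    DISPATCH_GRPC_URL: "localhost:50051", // placeholder address
  }).mode
); // "external"
```

In a real run the argument would be `process.env`; passing a plain object keeps the helper easy to test.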

What each check asserts

| Group | Check | Asserts |
| --- | --- | --- |
| HTTP | POST/GET /work-orders | live API ingests, persists, serves + audit |
| Accuracy | category ≥ 0.90 / priority ≥ 0.75 / region ≥ 0.99 | agent correctness |
| Safety | exception recall ≥ 0.95 | orders needing a human are caught |
| Safety | false-auto-action ≤ 0.02 | never auto-dispatch one that needed a human |
| Quality | exception precision ≥ 0.90 | clean orders rarely over-escalated |
| Outcome | auto-action ≥ 0.55 | human-in-the-loop reduced |
| Validator | missing-location / duplicate / over-cost handled 100% | rules hold |
| Live gRPC (Go) | Health reachable over the wire | |
| Live DB | every order in work_orders; one audit_log row each | |
| Live DB | dispatch_records == auto-dispatch count, all with refs | |
| Live gRPC (Go) | dispatch idempotent over the wire | |
| Live gRPC (Go) | malformed DispatchRequest rejected server-side | |
| Resilience | client reconnects after the Go service restarts | |
| Resilience | data durable across an external Postgres restart + reconnect | |
| Orchestration | dispositions reconcile to total | |
| Performance | 600 orders over external gRPC + Postgres < 120 s | |
| Determinism | same seed → identical metrics (independent stack) | |
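To make the Safety, Quality, and Outcome thresholds concrete, here is one way the four rates could be computed from labeled results. It is a sketch under stated assumptions, not the code in verify.mjs: the `needsHuman` and `disposition` fields are hypothetical, and false-auto-action is assumed to be measured against the number of auto-actioned orders (the source does not state the denominator).

```javascript
// Hypothetical result shape: { needsHuman: boolean, disposition: "auto" | "exception" }.
function safetyMetrics(results) {
  let tp = 0; // needed a human, escalated
  let fp = 0; // clean order, escalated anyway
  let fn = 0; // needed a human, auto-actioned (the dangerous case)
  let auto = 0; // all auto-actioned orders
  for (const r of results) {
    const escalated = r.disposition === "exception";
    if (escalated && r.needsHuman) tp++;
    else if (escalated && !r.needsHuman) fp++;
    else if (!escalated && r.needsHuman) fn++;
    if (!escalated) auto++;
  }
  // Guard against empty denominators; returning 0 is a choice made for this sketch.
  const ratio = (num, den) => (den === 0 ? 0 : num / den);
  return {
    exceptionRecall: ratio(tp, tp + fn),
    exceptionPrecision: ratio(tp, tp + fp),
    falseAutoAction: ratio(fn, auto), // assumed denominator: auto-actioned orders
    autoAction: ratio(auto, results.length),
  };
}

// Thresholds taken from the table above.
const THRESHOLDS = {
  exceptionRecall: { min: 0.95 },
  exceptionPrecision: { min: 0.9 },
  falseAutoAction: { max: 0.02 },
  autoAction: { min: 0.55 },
};

// Returns the names of the metrics that miss their threshold.
function failedChecks(metrics) {
  return Object.entries(THRESHOLDS)
    .filter(([name, { min, max }]) => {
      const value = metrics[name];
      return (min !== undefined && value < min) || (max !== undefined && value > max);
    })
    .map(([name]) => name);
}

// Tiny hand-made sample: 3 correct escalations, 1 over-escalation, 6 clean auto-actions.
const sample = [
  ...Array(3).fill({ needsHuman: true, disposition: "exception" }),
  { needsHuman: false, disposition: "exception" },
  ...Array(6).fill({ needsHuman: false, disposition: "auto" }),
];
const sampleMetrics = safetyMetrics(sample);
console.log(sampleMetrics.exceptionPrecision); // 0.75
console.log(failedChecks(sampleMetrics)); // [ 'exceptionPrecision' ]
```

The sample catches every order that needed a human (recall 1, false-auto-action 0) but over-escalates one clean order out of four escalations, so only the precision check fails.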

What a PASS means / does not mean

A PASS means the agent logic AND the external live infrastructure (external Postgres persistence + durability, external Go gRPC transport + idempotency + reconnect, HTTP API) behave correctly on synthetic inbound data. It does not mean accuracy on real Safeguard work orders, nor that an Oracle backend, LLM classifier, security controls, or HA were verified — see proof/LIMITATIONS.md. These gaps are why the state remains CERTIFIED and not PRODUCTION_VALIDATED.