Executive Evidence — Cash Recovery Engine

Standard: PROOF_STANDARD = IRS_AUDITOR

Certification: PROOF_INCOMPLETE (machine copy: proof/PROOF_DECISION.json)

Why not CERTIFIED: the customer outcome is a cash-recovery worklist for a real receivables ledger; no real ledger is present in this workspace, so that outcome is a DISCLOSED_SEAM. All implemented checks pass and are reproducible.

This document answers the ten required IRS_AUDITOR questions. Every number has lineage in proof/CLAIM_EVIDENCE.json and verification-report.json.

1. What exactly is being claimed?

  • A dependency-free receivables-prioritization engine: an uplift T-learner that separates self-cure from pay-if-worked invoices, a capacity-constrained optimizer that maximizes expected cash per collector-hour, a four-way recovery segmentation, downloadable deliverables, and a live web interface.
  • On a held-out synthetic AR benchmark (seed-fixed): 13/13 checks pass; self-cure AUC 0.8173; calibration ECE 0.018 (Brier 0.1588); the engine recovers $1,665,446 vs. $733,603 for the strongest simple baseline (+127.0% skill) and 2.94× FIFO at equal hours; it captures 48.7% of the $3,417,896 movable-cash ceiling within a 200-hour budget.
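The two percentage figures in that claim can be re-derived from the dollar amounts quoted alongside them. The sketch below assumes that "skill" means relative lift over the strongest simple baseline and that "captures" means the engine's recovered cash divided by the movable-cash ceiling; both readings are consistent with the quoted numbers, but the variable names are illustrative and not taken from the engine's source.

```javascript
// Headline dollar figures copied from the claim above.
const engineCash = 1_665_446;   // cash recovered by the engine
const baselineCash = 733_603;   // strongest simple baseline
const ceiling = 3_417_896;      // movable-cash ceiling

// Assumed definition: skill = relative improvement over the baseline.
const skillPct = (engineCash / baselineCash - 1) * 100;

// Assumed definition: capture = share of the ceiling recovered in budget.
const capturePct = (engineCash / ceiling) * 100;

console.log(skillPct.toFixed(1));   // 127.0
console.log(capturePct.toFixed(1)); // 48.7
```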

2. What evidence supports each claim?

  • verification-report.json (+ .md), plus the raw run copies proof/evidence/verify.log and proof/evidence/verification-report.json.
  • proof/CLAIM_EVIDENCE.json maps every claim → source, method, command, dataset, timestamp, artifact, status.
  • proof/EXECUTION_TRACE.json records each command, exit code, and stdout sha256; proof/ARTIFACT_MANIFEST.json + proof/CHECKSUMS.json pin every file.
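Pinning a file or a stdout capture by sha256 amounts to hashing the bytes again and comparing against the recorded digest. The sketch below shows that comparison with Node's built-in crypto module. The helper names sha256Hex and matchesPinned are hypothetical, and the schema of proof/CHECKSUMS.json is not shown in this document, so the example hashes an in-memory string rather than reading that file.

```javascript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a string or Buffer.
function sha256Hex(data) {
  return createHash("sha256").update(data).digest("hex");
}

// True when the content hashes to the pinned digest (case-insensitive hex).
function matchesPinned(data, pinnedHex) {
  return sha256Hex(data) === pinnedHex.toLowerCase();
}

// Standard SHA-256 test vector for the ASCII string "abc".
const pinned =
  "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";

console.log(matchesPinned("abc", pinned)); // true
console.log(matchesPinned("abd", pinned)); // false: any byte change breaks the match
```

For a file on disk, the same helper could be fed the Buffer returned by fs.readFileSync.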

3. Can an independent engineer reproduce this claim?

Yes. proof/REPRODUCE.md gives exact commands; everything is seeded and dependency-free (Node 18+). Running node verify.mjs reproduces the table; running node tools/forge-proof-verify.mjs --outcome delivery-package/cash-recovery-engine re-checks every checksum.
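To illustrate why a seeded, dependency-free run can be repeated exactly, here is a minimal sketch of a seeded pseudo-random generator. It uses mulberry32, a widely circulated small 32-bit generator, purely as an assumption; this document does not say which generator the engine uses, and the seed value 42 is arbitrary.

```javascript
// mulberry32: a tiny deterministic PRNG returning floats in [0, 1).
// Assumption for illustration only; not taken from the engine's source.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two independent "runs" with the same seed draw identical sequences.
const runA = mulberry32(42);
const runB = mulberry32(42);
const drawsA = Array.from({ length: 5 }, () => runA());
const drawsB = Array.from({ length: 5 }, () => runB());

console.log(drawsA.every((v, i) => v === drawsB[i])); // true
```

Because the sequence depends only on the seed and on integer arithmetic, it does not vary with wall-clock time or with Math.random, which is what makes a byte-for-byte comparison of outputs meaningful.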

4. What assumptions were made?

  • Payment is a probabilistic outcome driven by observable invoice/customer features; a collector touch adds an uplift that is largest in the "movable middle" (self-cure probability near 0.5). This assumption is built into the synthetic world and is what the model must recover.
  • Larger balances cost disproportionately more collector effort (negotiation, approvals, disputes, legal); the simple baselines ignore effort.
  • Potential outcomes are monotone (working an invoice never lowers its chance of paying), encoded via a shared threshold so individual uplift is non-negative.
  • These are modelling assumptions, not measurements from a real ledger.
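The shared-threshold idea in the monotonicity assumption can be sketched as follows: both potential outcomes are read off one latent draw, so whenever the unworked outcome is "paid" the worked outcome is too. The uplift curve (a parabola peaking at a self-cure probability of 0.5), the maxUplift value, and the function names are all assumptions made for illustration; the actual generator lives in src/synth.mjs and is not reproduced here.

```javascript
// Hypothetical uplift shape: largest at p0 = 0.5, zero at p0 = 0 and p0 = 1.
function workedProbability(p0, maxUplift = 0.3) {
  const uplift = 4 * maxUplift * p0 * (1 - p0);
  return Math.min(1, p0 + uplift); // never below p0
}

// Both potential outcomes use the same latent draw u in [0, 1).
function potentialOutcomes(p0, u) {
  const p1 = workedProbability(p0);
  const y0 = u < p0 ? 1 : 0; // pays without a collector touch
  const y1 = u < p1 ? 1 : 0; // pays if worked
  return { y0, y1, uplift: y1 - y0 };
}

// Scan a grid: the individual uplift is never negative.
let minUplift = Infinity;
for (let p0 = 0; p0 <= 1; p0 += 0.05) {
  for (let u = 0; u < 1; u += 0.05) {
    minUplift = Math.min(minUplift, potentialOutcomes(p0, u).uplift);
  }
}
console.log(minUplift); // 0
```

Since p1 is never below p0, the pair (y0 = 1, y1 = 0) cannot occur, which is exactly the monotonicity the assumption states. Both outcomes are visible here only because the draw u is known, which real data never provides.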

5. What limitations exist?

See proof/LIMITATIONS.md (authoritative). Headline: all metrics are synthetic; no real ledger; effort/timing are modelled; "cash accelerated" is a projection.

6. What seams exist? (DISCLOSED_SEAM)

  • No real AR ledger present → no company-specific recovery figure produced.
  • Individual treatment uplift is identifiable here only because the synthetic world exposes both potential outcomes; on real data it is an estimate.
  • Collector effort-hours and days-to-pay are modelled parameters.

7. What was actually executed?

  • node verify.mjs → 13 structural + benchmark checks on synthetic data (deterministic). Raw output: proof/evidence/verify.log.
  • node run.mjs → trained the engine and emitted the worklist CSV, JSON, and executive summary, plus the web-tool data snapshot.

8. What was inferred (not directly executed)?

  • Real-world recovery was not measured and is treated as unknown. The synthetic skill bounds the engine's ranking/optimization correctness on the modelled problem and does not transfer numerically to a specific company's books.
  • Real-data uplift quality is inferred from the control-AUC check path, not from a live holdout/A-B test.
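For readers unfamiliar with the metric behind a control-AUC check, the sketch below computes AUC as the fraction of (payer, non-payer) pairs in which the payer received the higher score, counting ties as half. It assumes the check scores self-cure predictions against observed outcomes of unworked invoices; the function name and the four-invoice sample are hypothetical and are not the engine's data or code.

```javascript
// AUC = P(score of a random payer > score of a random non-payer), ties = 0.5.
// labels: 1 = paid, 0 = did not pay.
function auc(scores, labels) {
  let wins = 0;
  let pairs = 0;
  for (let i = 0; i < scores.length; i++) {
    if (labels[i] !== 1) continue;
    for (let j = 0; j < scores.length; j++) {
      if (labels[j] !== 0) continue;
      pairs++;
      if (scores[i] > scores[j]) wins += 1;
      else if (scores[i] === scores[j]) wins += 0.5;
    }
  }
  return pairs === 0 ? NaN : wins / pairs;
}

// Toy control group: predicted self-cure probability and observed payment.
const scores = [0.9, 0.8, 0.4, 0.3];
const paid = [1, 0, 1, 0];
console.log(auc(scores, paid)); // 0.75 (3 of 4 payer/non-payer pairs ordered correctly)
```

A check of this kind measures how well the unworked-outcome model ranks invoices. It does not observe what would have happened had the same invoices been worked, which is why it stands in for, and is weaker than, a holdout/A-B test.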

9. What remains unverified?

  • Any recovery metric on a real ledger (no dataset present).
  • Calibration and uplift estimates against real payment behaviour.
  • Deployment, security, integration, monitoring, and operational behaviour (not run, not claimed).

10. What evidence would invalidate the claim?

  • A node verify.mjs run that does not yield 13/13, or that yields different numbers (drift or environment difference).
  • Editing src/synth.mjs or any seed (synthetic numbers are conditional on them).
  • Treating synthetic skill as a real-company recovery figure (explicitly not claimed).
  • On real data: a holdout/A-B test in which the engine's queue does not beat the business-as-usual queue on realized cash.

Pre-written hostile objections and responses: proof/AUDITOR_OBJECTIONS.md. The generated hostile interrogation: proof/AUDITOR_CHALLENGE.md.