Cash Recovery Engine

Verification Report

Verification Report — Cash Recovery Engine

Strictness: IRS_AUDITOR | Proof status: PROOF_INCOMPLETE (synthetic benchmark only; real AR ledger is a disclosed seam)

Checks: PASS 13 / 13 (100%) | Evaluated on: synthetic | Generated: 2026-06-25T22:57:32.728Z

Disclosed seams & limitations

  • DISCLOSED_SEAM: No real customer AR ledger is present in this workspace; all reported numbers are measured on a synthetic, behaviourally-motivated benchmark (src/synth.mjs), not on any company's receivables.
  • DISCLOSED_SEAM: Individual treatment uplift is measurable here only because the synthetic world exposes BOTH potential outcomes (y0 and y1). On real data you can never observe both for the same invoice, so production uplift is an estimate validated by holdout/A-B test, not a measured per-invoice truth.
  • SIMULATED: Collector effort hours and days-to-pay are modelled parameters, not timed observations.
  • PROJECTION: "Cash accelerated" and "collection-days reduction" are model projections over this ledger, not realized, audited cash movements.
  • Official evaluation path present but inactive (no data/official/ inputs this run).
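
To illustrate why per-invoice uplift is measurable only in a synthetic world, the following is a minimal sketch of a generator that exposes both potential outcomes. It is not the contents of src/synth.mjs: the names `mulberry32`, `makeInvoices`, and `incrementalCash`, the amount range, and the way probabilities are drawn are all assumptions made for illustration. A single shared random draw per invoice guarantees y1 >= y0.

```javascript
// Hypothetical sketch; NOT the actual src/synth.mjs.
// Small seeded PRNG (mulberry32) so the same seed gives the same ledger.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Each invoice carries BOTH potential outcomes:
//   y0 = 1 if it pays within the horizon when nobody works it (self-cure)
//   y1 = 1 if it pays within the horizon when a collector works it
function makeInvoices(n, seed) {
  const rand = mulberry32(seed);
  const invoices = [];
  for (let i = 0; i < n; i++) {
    const amount = Math.round(500 + rand() * 9500); // assumed range
    const pSelfCure = rand();
    // Working an invoice can only help, so pPayIfWorked >= pSelfCure.
    const pPayIfWorked = pSelfCure + (1 - pSelfCure) * rand();
    // One shared draw couples the outcomes, so y1 >= y0 always holds.
    const u = rand();
    const y0 = u < pSelfCure ? 1 : 0;
    const y1 = u < pPayIfWorked ? 1 : 0;
    invoices.push({ id: i, amount, y0, y1 });
  }
  return invoices;
}

// True incremental cash of a worked set: only invoices that pay when
// worked but would not have self-cured (y1 - y0 = 1) contribute.
function incrementalCash(invoices, workedIds) {
  const worked = new Set(workedIds);
  return invoices
    .filter((inv) => worked.has(inv.id))
    .reduce((sum, inv) => sum + inv.amount * (inv.y1 - inv.y0), 0);
}
```

On a real ledger only one of `y0` or `y1` is ever observed per invoice, so `incrementalCash` cannot be computed directly and has to be estimated from a holdout or A-B test.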

What is verified

The engine learns self-cure and pay-if-worked propensities from resolved history (an uplift T-learner), ranks open invoices by expected incremental cash per collector-hour, and packs a capacity-constrained worklist. On a held-out synthetic ledger we measure (a) propensity ranking and calibration and (b) the realized incremental cash the worklist captures under a binding hours budget, compared against the strategies teams use today (FIFO, largest-balance, random).
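
The ranking and packing step described above can be sketched as a greedy fill. This is an illustrative assumption, not the engine's confirmed algorithm: the function name `buildWorklist` and the field names `pSelfCure`, `pPayIfWorked`, `amount`, and `hours` are hypothetical, and the real packer may use a different heuristic.

```javascript
// Hypothetical sketch of a capacity-constrained worklist.
// Assumes each open invoice already has two model propensities and a
// positive effort estimate in collector-hours.
function buildWorklist(openInvoices, hoursBudget) {
  const scored = openInvoices
    .map((inv) => {
      // Uplift = P(pay | worked) - P(pay | not worked), floored at 0.
      const uplift = Math.max(0, inv.pPayIfWorked - inv.pSelfCure);
      const expectedCash = inv.amount * uplift;
      return { ...inv, expectedCash, cashPerHour: expectedCash / inv.hours };
    })
    // Highest expected incremental cash per collector-hour first.
    .sort((a, b) => b.cashPerHour - a.cashPerHour);

  const worklist = [];
  let hoursUsed = 0;
  for (const inv of scored) {
    if (inv.expectedCash <= 0) continue; // nothing to gain by working it
    if (hoursUsed + inv.hours > hoursBudget) continue; // would break the budget
    worklist.push(inv);
    hoursUsed += inv.hours;
  }
  return { worklist, hoursUsed };
}

// Usage: the largest balance (A) is mostly self-curing, so it ranks last.
const demoOpen = [
  { id: "A", amount: 10000, pSelfCure: 0.9, pPayIfWorked: 0.95, hours: 1 },
  { id: "B", amount: 2000, pSelfCure: 0.2, pPayIfWorked: 0.7, hours: 1 },
  { id: "C", amount: 6000, pSelfCure: 0.1, pPayIfWorked: 0.6, hours: 2 },
];
const demoPlan = buildWorklist(demoOpen, 3);
console.log(demoPlan.worklist.map((inv) => inv.id), demoPlan.hoursUsed);
```

With a 3-hour budget the sketch selects C and then B and skips A, which is the opposite of what a largest-balance strategy would do.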

Synthetic benchmark

| Metric | Value |
| --- | --- |
| History / test invoices | 4000 / 1500 |
| Self-cure AUC | 0.8173 |
| Self-cure calibration (ECE / Brier) | 0.018 / 0.1588 |
| Eval capacity (binding) | 200 collector-hours |
| Engine cash recovered | $1,665,446 |
| FIFO / Largest / Random | $565,742 / $733,603 / $454,269 |
| Skill vs best baseline | 127.0% |
| Lift vs FIFO | 2.9438x |
| Capture of movable-cash ceiling | 48.7% |
| Train time | 134 ms |
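
The three derived figures can be reproduced from the dollar amounts in this report. The formulas below are assumptions inferred from the numbers (skill as relative gain over the best baseline, lift as a ratio to FIFO, capture as a share of the $3,417,896 movable-cash ceiling quoted in the checks); the variable names are illustrative.

```javascript
// Dollar figures copied from this report.
const engineCash = 1665446;
const baselineCash = { fifo: 565742, largest: 733603, random: 454269 };
const movableCeiling = 3417896;

// Assumed definitions, chosen because they match the reported values.
const bestBaseline = Math.max(...Object.values(baselineCash));
const skillPct = (engineCash / bestBaseline - 1) * 100; // gain over best baseline
const liftVsFifo = engineCash / baselineCash.fifo; // ratio to FIFO
const capturePct = (engineCash / movableCeiling) * 100; // share of ceiling

console.log(skillPct.toFixed(1), liftVsFifo.toFixed(4), capturePct.toFixed(1));
// prints: 127.0 2.9438 48.7
```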

Official benchmark

_Not present in this run._ Drop data/official/history.csv (resolved invoices with worked,paid_within_horizon) and data/official/open.csv (live open invoices) to evaluate on real data with the identical checks above. Schema: see run-deploy-instructions.md.

Checks

| Check | Detail | Result |
| --- | --- | --- |
| Logistic learner recovers a separable signal (acc > 0.95) | train acc=1 | PASS |
| AUC helper returns 1.0 for a perfectly ranked set | auc=1 | PASS |
| Potential outcomes monotone: y1 >= y0 for every invoice | 0 violations / 2000 | PASS |
| Capacity-constrained worklist never exceeds the hours budget | used 59.92h <= cap 60h, 59 invoices | PASS |
| [synthetic] Self-cure propensity ranks better than chance (AUC > 0.70) | AUC=0.8173 | PASS |
| [synthetic] Self-cure probabilities are calibrated (ECE < 0.05) | ECE=0.018, Brier=0.1588 | PASS |
| [synthetic] Engine beats best simple baseline by >40% cash recovered | engine=$1,665,446 vs best baseline=$733,603 (skill 127.0%) | PASS |
| [synthetic] Engine recovers >=2x the cash of FIFO at equal hours | 2.94x FIFO | PASS |
| [synthetic] Engine captures >35% of the movable-cash ceiling within budget | 48.7% of $3,417,896 using 200h | PASS |
| Reproducible: same seeds -> identical AUC | 0.8173 == 0.8173 | PASS |
| Trains the uplift model end-to-end in < 8 s | 134 ms on 4000 historical invoices | PASS |
| Engine emits a complete, budget-feasible worklist schema | rows=58 fields✓=true hours=80<=80 cash>=0=true | PASS |
| Official dataset evaluation (drop data/official/{history,open}.csv to enable) | official data not present; synthetic benchmark only | PASS |
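
For readers who want to see what the ranking and calibration checks measure, here is a minimal sketch of the three metrics. These are generic textbook definitions, not the engine's own helpers: the function names are hypothetical, and the use of 10 equal-width bins for ECE is an assumption, since the report does not state its binning.

```javascript
// AUC as the share of (positive, negative) pairs ranked correctly,
// counting ties as half. O(n^2), which is fine at this scale.
function auc(scores, labels) {
  let pairs = 0;
  let wins = 0;
  for (let i = 0; i < scores.length; i++) {
    if (labels[i] !== 1) continue;
    for (let j = 0; j < scores.length; j++) {
      if (labels[j] !== 0) continue;
      pairs++;
      if (scores[i] > scores[j]) wins += 1;
      else if (scores[i] === scores[j]) wins += 0.5;
    }
  }
  return pairs === 0 ? NaN : wins / pairs;
}

// Expected calibration error: weighted mean gap between the average
// predicted probability and the observed rate in each bin.
function ece(probs, labels, bins = 10) {
  const count = new Array(bins).fill(0);
  const sumP = new Array(bins).fill(0);
  const sumY = new Array(bins).fill(0);
  probs.forEach((p, i) => {
    const b = Math.min(bins - 1, Math.floor(p * bins)); // p = 1 goes in the last bin
    count[b] += 1;
    sumP[b] += p;
    sumY[b] += labels[i];
  });
  let total = 0;
  for (let b = 0; b < bins; b++) {
    if (count[b] === 0) continue;
    const gap = Math.abs(sumP[b] / count[b] - sumY[b] / count[b]);
    total += (count[b] / probs.length) * gap;
  }
  return total;
}

// Brier score: mean squared error between probability and outcome.
function brier(probs, labels) {
  const sum = probs.reduce((acc, p, i) => acc + (p - labels[i]) ** 2, 0);
  return sum / probs.length;
}

// A perfectly ranked set gives AUC 1, mirroring the helper check above.
console.log(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])); // prints: 1
```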