Cash Recovery Engine

Verification Report

Verification Report — Cash Recovery Engine

Strictness: IRS_AUDITOR | Proof status: PROOF_INCOMPLETE (synthetic benchmark only; real AR ledger is a disclosed seam)

Checks: PASS 13 / 13 (100%) | Evaluated on: synthetic | Generated: 2026-06-25T22:57:32.728Z

Disclosed seams & limitations

DISCLOSED_SEAM: No real customer AR ledger is present in this workspace; all reported numbers are measured on a synthetic, behaviourally-motivated benchmark (src/synth.mjs), not on any company's receivables.
DISCLOSED_SEAM: Individual treatment uplift is measurable here only because the synthetic world exposes BOTH potential outcomes (y0 and y1). On real data you can never observe both for the same invoice, so production uplift is an estimate validated by holdout/A-B test, not a measured per-invoice truth.
SIMULATED: Collector effort hours and days-to-pay are modelled parameters, not timed observations.
PROJECTION: "Cash accelerated" and "collection-days reduction" are model projections over this ledger, not realized, audited cash movements.
Official evaluation path present but inactive (no data/official/ inputs this run).

What is verified

The engine learns self-cure vs. pay-if-worked propensities from resolved history (an uplift T-learner), ranks open invoices by expected incremental cash per collector-hour, and packs a capacity-constrained worklist. On a held-out synthetic ledger we measure (a) propensity ranking + calibration and (b) the realized incremental cash the worklist captures under a binding hours budget, against the strategies teams use today (FIFO, largest-balance, random).

Synthetic benchmark

Metric	Value
History / test invoices	4000 / 1500
Self-cure AUC	0.8173
Self-cure calibration (ECE / Brier)	0.018 / 0.1588
Eval capacity (binding)	200 collector-hours
Engine cash recovered	$1,665,446
FIFO / Largest / Random	$565,742 / $733,603 / $454,269
Skill vs best baseline	127.0%
Lift vs FIFO	2.9438x
Capture of movable-cash ceiling	48.7%
Train time	134 ms

Official benchmark

_Not present in this run._ Drop data/official/history.csv (resolved invoices with worked,paid_within_horizon) and data/official/open.csv (live open invoices) to evaluate on real data with the identical checks above. Schema: see run-deploy-instructions.md.

Checks

Check	Detail	Result
Logistic learner recovers a separable signal (acc > 0.95)	train acc=1	PASS
AUC helper returns 1.0 for a perfectly ranked set	auc=1	PASS
Potential outcomes monotone: y1 >= y0 for every invoice	0 violations / 2000	PASS
Capacity-constrained worklist never exceeds the hours budget	used 59.92h <= cap 60h, 59 invoices	PASS
[synthetic] Self-cure propensity ranks better than chance (AUC > 0.70)	AUC=0.8173	PASS
[synthetic] Self-cure probabilities are calibrated (ECE < 0.05)	ECE=0.018, Brier=0.1588	PASS
[synthetic] Engine beats best simple baseline by >40% cash recovered	engine=$1,665,446 vs best baseline=$733,603 (skill 127.0%)	PASS
[synthetic] Engine recovers >=2x the cash of FIFO at equal hours	2.94x FIFO	PASS
[synthetic] Engine captures >35% of the movable-cash ceiling within budget	48.7% of $3,417,896 using 200h	PASS
Reproducible: same seeds -> identical AUC	0.8173 == 0.8173	PASS
Trains the uplift model end-to-end in < 8 s	134 ms on 4000 historical invoices	PASS
Engine emits a complete, budget-feasible worklist schema	rows=58 fields✓=true hours=80<=80 cash>=0=true	PASS
Official dataset evaluation (drop data/official/{history,open}.csv to enable)	official data not present — synthetic benchmark only	PASS