VERIFY — Cash Recovery Engine

How verification runs and what each check asserts. Source: verify.mjs. The same set of checks runs against up to two benchmarks: the synthetic benchmark always runs, and the official benchmark runs only when data is detected in data/official/.

How it runs

node verify.mjs:

Runs structural checks on the learner, AUC helper, synthetic outcome monotonicity, and optimizer feasibility.
Trains the uplift T-learner on a 4,000-invoice synthetic history (seed 7).
Scores a held-out 1,500-invoice test ledger (seed 101) the model never saw.
Measures propensity ranking, calibration, and head-to-head cash recovery under a binding 200-hour budget.
Writes verification-report.json / .md and an evidence copy.
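
The fixed seeds in the steps above (7 for training, 101 for the held-out ledger) only give reproducible runs if every random draw comes from a seeded generator rather than Math.random(). A minimal sketch of that idea, assuming a mulberry32-style generator; the names makeRng and syntheticAmounts are hypothetical and are not taken from verify.mjs:

```javascript
// Hypothetical sketch: a small seeded PRNG (mulberry32 variant).
// This is an assumption for illustration, not the generator verify.mjs uses.
function makeRng(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

// Draw n illustrative invoice amounts (100 to 10,000) from a seeded stream.
function syntheticAmounts(seed, n) {
  const rng = makeRng(seed);
  return Array.from({ length: n }, () => Math.round(100 + rng() * 9900));
}

const trainA = syntheticAmounts(7, 5);
const trainB = syntheticAmounts(7, 5);
const heldOut = syntheticAmounts(101, 5);

// Same seed gives the same draws; a different seed gives a different ledger.
console.log(JSON.stringify(trainA) === JSON.stringify(trainB));
console.log(JSON.stringify(trainA) === JSON.stringify(heldOut));
```

Keeping the training and held-out seeds separate is what lets the test ledger count as data the model never saw while still being regenerated identically on every run.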

What each check asserts

#	Check	Asserts
1	Logistic learner recovers a separable signal	The gradient-descent learner is correct (train acc > 0.95 on a known-separable set)
2	AUC helper returns 1.0 for perfect ranking	The evaluation metric itself is correct
3	Potential outcomes monotone (y1 ≥ y0)	The synthetic causal world is well-formed; uplift is non-negative
4	Worklist never exceeds the hours budget	The 0/1-knapsack optimizer is feasible
5	Self-cure AUC > 0.70	The propensity model ranks payers above non-payers well above chance
6	Calibration ECE < 0.05	Predicted probabilities match observed frequencies (decisions can trust them)
7	Engine beats strongest baseline by > 40% cash	Uplift-ranked allocation materially out-recovers largest/FIFO/random
8	Engine ≥ 2× FIFO cash at equal hours	The lift over the most common real-world worklist is large
9	Captures > 35% of the movable-cash ceiling	The budget is spent efficiently against the theoretical maximum
10	Reproducible: same seeds → identical AUC	Determinism (bit-for-bit)
11	Trains end-to-end in < 8 s	The method is fast/practical
12	Engine emits a complete, budget-feasible worklist	The end-to-end orchestration produces the deliverable schema
13	Official path (skip or run)	Real-data evaluation is wired; informative skip when no data present
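
Checks 2, 5, and 6 depend on two metrics, AUC and expected calibration error (ECE). The sketch below shows one common way to compute each: AUC as the fraction of (payer, non-payer) pairs ranked correctly, and ECE with ten equal-width probability bins. The function names, the tie handling, and the bin count are assumptions for illustration; verify.mjs may define them differently.

```javascript
// Hypothetical helpers; names, tie handling, and bin count are assumptions.

// AUC: share of (positive, negative) pairs where the positive scores higher.
// Ties count as half a win.
function auc(scores, labels) {
  const pos = scores.filter((_, i) => labels[i] === 1);
  const neg = scores.filter((_, i) => labels[i] === 0);
  let wins = 0;
  for (const p of pos) {
    for (const q of neg) {
      if (p > q) wins += 1;
      else if (p === q) wins += 0.5;
    }
  }
  return wins / (pos.length * neg.length);
}

// ECE: weighted average gap between mean predicted probability and
// observed frequency, over equal-width bins on [0, 1].
function ece(probs, labels, bins = 10) {
  const sumP = new Array(bins).fill(0);
  const sumY = new Array(bins).fill(0);
  const count = new Array(bins).fill(0);
  probs.forEach((p, i) => {
    const b = Math.min(bins - 1, Math.floor(p * bins)); // p = 1 goes in the last bin
    sumP[b] += p;
    sumY[b] += labels[i];
    count[b] += 1;
  });
  let total = 0;
  for (let b = 0; b < bins; b++) {
    if (count[b] === 0) continue;
    const gap = Math.abs(sumP[b] / count[b] - sumY[b] / count[b]);
    total += (count[b] / probs.length) * gap;
  }
  return total;
}

// Perfect ranking, as in check 2.
console.log(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])); // 1
// Perfectly calibrated extremes.
console.log(ece([0, 0, 1, 1], [0, 0, 1, 1])); // 0
```

The pairwise AUC is quadratic in the number of invoices, which is acceptable for a 1,500-invoice test ledger; a rank-based version would be the usual choice for larger sets.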

Why these checks are the right ones

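The business claim is "recover more cash per collector-hour." Checks 1–2 establish that the learner and the evaluation metric are themselves correct; checks 5–6 establish the model *understands* who pays; checks 7–9 establish the optimizer *acts* on that understanding better than the conventional worklists; checks 3–4 establish the comparison is fair (well-formed world, feasible budget); checks 10–12 establish it is reproducible, fast, and emits a usable deliverable; check 13 establishes that the real-data path is wired and skips informatively when no official data is present.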
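
To make the link between feasibility (check 4) and the head-to-head comparison (checks 7–8) concrete, the sketch below runs a textbook 0/1-knapsack dynamic program against a FIFO worklist on four toy invoices. It assumes integer hours and uses hypothetical field names (hours, upliftCash); the numbers are made up for illustration and the real optimizer in the engine may differ.

```javascript
// Hypothetical sketch of a 0/1-knapsack worklist; assumes non-negative integer hours.
function knapsackWorklist(invoices, budgetHours) {
  const n = invoices.length;
  // best[i][h] = max uplift cash using the first i invoices within h hours.
  const best = Array.from({ length: n + 1 }, () => new Array(budgetHours + 1).fill(0));
  for (let i = 1; i <= n; i++) {
    const { hours, upliftCash } = invoices[i - 1];
    for (let h = 0; h <= budgetHours; h++) {
      best[i][h] = best[i - 1][h];
      if (hours <= h) {
        best[i][h] = Math.max(best[i][h], best[i - 1][h - hours] + upliftCash);
      }
    }
  }
  // Walk back through the table to recover which invoices were chosen.
  const chosen = [];
  let h = budgetHours;
  for (let i = n; i >= 1; i--) {
    if (best[i][h] !== best[i - 1][h]) {
      chosen.push(invoices[i - 1]);
      h -= invoices[i - 1].hours;
    }
  }
  return chosen.reverse();
}

// FIFO baseline: take invoices in arrival order whenever they still fit.
function fifoWorklist(invoices, budgetHours) {
  const chosen = [];
  let used = 0;
  for (const inv of invoices) {
    if (used + inv.hours <= budgetHours) {
      chosen.push(inv);
      used += inv.hours;
    }
  }
  return chosen;
}

const sum = (list, key) => list.reduce((acc, x) => acc + x[key], 0);

// Toy data, invented for this example.
const invoices = [
  { id: "A", hours: 3, upliftCash: 100 },
  { id: "B", hours: 2, upliftCash: 900 },
  { id: "C", hours: 2, upliftCash: 800 },
  { id: "D", hours: 1, upliftCash: 50 },
];
const budget = 4;

const engine = knapsackWorklist(invoices, budget);
const fifo = fifoWorklist(invoices, budget);

console.log(sum(engine, "hours") <= budget); // feasibility, as in check 4
console.log(sum(engine, "upliftCash"), sum(fifo, "upliftCash")); // 1700 150
```

Both worklists spend the same four hours, so any difference in recovered cash comes from which invoices were worked rather than from how much effort was spent.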

Thresholds

Thresholds are set with a conservative margin from observed values, so the suite is stable rather than tuned to barely pass. Observed: self-cure AUC 0.817 (> 0.70), ECE 0.018 (< 0.05), cash over the strongest baseline +127% (> 40%), lift over FIFO 2.94× (≥ 2×), movable-cash capture 48.7% (> 35%).
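
The margins can be checked mechanically. The sketch below encodes the observed values and thresholds quoted above and reports, for each, whether it passes and how far the observed value sits from its threshold as a fraction of the threshold. The table layout, the labels, and the headroom measure are assumptions for illustration, not the report format that verify.mjs writes.

```javascript
// Observed values and thresholds as quoted in this section;
// the structure and the headroom measure are hypothetical.
const thresholdChecks = [
  { name: "Self-cure AUC", observed: 0.817, threshold: 0.7, op: ">" },
  { name: "Calibration ECE", observed: 0.018, threshold: 0.05, op: "<" },
  { name: "Cash vs strongest baseline", observed: 1.27, threshold: 0.4, op: ">" },
  { name: "Lift vs FIFO", observed: 2.94, threshold: 2.0, op: ">=" },
  { name: "Movable-cash capture", observed: 0.487, threshold: 0.35, op: ">" },
];

const compare = {
  ">": (a, b) => a > b,
  ">=": (a, b) => a >= b,
  "<": (a, b) => a < b,
};

const thresholdResults = thresholdChecks.map((c) => ({
  name: c.name,
  pass: compare[c.op](c.observed, c.threshold),
  // Distance from the threshold, relative to the threshold itself.
  headroom: Math.abs(c.observed - c.threshold) / c.threshold,
}));

for (const r of thresholdResults) {
  console.log(`${r.name}: ${r.pass ? "pass" : "FAIL"} (headroom ${(r.headroom * 100).toFixed(0)}%)`);
}
```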