Cash Recovery Engine

Forge Receivables Intelligence — point a fixed amount of collector time at the invoices where a human touch *actually changes the outcome*, and recover materially more cash from the same hours.

Status: PROOF_INCOMPLETE — verified end-to-end on a synthetic
accounts-receivable benchmark with a known causal ground truth. No real
customer ledger is present in this workspace, so no company-specific recovery
figure is claimed. See certification-report.md.

The problem (in one breath)

Companies carry millions in overdue receivables and have only so many collector-hours. The default playbook — work the oldest invoice or the biggest balance — burns time on accounts that would have paid anyway and on accounts no call will move. Cash that could have arrived this quarter arrives next quarter, or never.

What it does

Learns who actually needs the call. An uplift (T-learner) model trains two propensities from resolved history (*pay if left alone* vs. *pay if worked*) and scores every open invoice by the uplift between them.

Spends the budget where it pays. A capacity-constrained optimizer packs the collector-hours budget with the highest expected cash per hour (amount × uplift ÷ effort), a 0/1 knapsack packed greedily by ratio.

Tells the team what to do. Every invoice gets a recovery segment (Persuadable / Self-Cure / At-Risk / Critical) and a recommended action, in a ranked worklist you can download.
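The scoring and packing steps above can be sketched in a few lines. This is a minimal illustration, not the code in src/: the invoice shape (`id`, `amount`, `hours`, `pAlone`, `pWorked`) and the function names are assumptions, and the two probabilities are hard-coded stand-ins for what the two T-learner models would predict.

```javascript
// Hypothetical invoice shape: { id, amount, hours, pAlone, pWorked }.
// pAlone / pWorked stand in for the two T-learner outputs (assumed field names).

// Uplift = P(pay | worked) - P(pay | left alone).
function uplift(inv) {
  return inv.pWorked - inv.pAlone;
}

// Expected incremental cash per collector-hour: amount × uplift ÷ effort.
function cashPerHour(inv) {
  return (inv.amount * uplift(inv)) / inv.hours;
}

// Greedy 0/1 knapsack: rank by ratio, take each invoice whole if it still fits.
function packWorklist(invoices, budgetHours) {
  const ranked = invoices
    .filter((inv) => inv.hours > 0 && uplift(inv) > 0)
    .sort((a, b) => cashPerHour(b) - cashPerHour(a));
  const picked = [];
  let hoursUsed = 0;
  let expectedCash = 0;
  for (const inv of ranked) {
    if (hoursUsed + inv.hours > budgetHours) continue; // does not fit; try the next one
    picked.push(inv.id);
    hoursUsed += inv.hours;
    expectedCash += inv.amount * uplift(inv);
  }
  return { picked, hoursUsed, expectedCash };
}

const demo = [
  { id: "A", amount: 10000, hours: 2, pAlone: 0.9, pWorked: 0.95 }, // would pay anyway
  { id: "B", amount: 4000, hours: 1, pAlone: 0.3, pWorked: 0.7 },   // persuadable
  { id: "C", amount: 8000, hours: 2, pAlone: 0.05, pWorked: 0.1 },  // unlikely either way
  { id: "D", amount: 6000, hours: 3, pAlone: 0.2, pWorked: 0.6 },   // persuadable, more effort
];
const plan = packWorklist(demo, 4);
console.log(plan.picked, plan.hoursUsed, Math.round(plan.expectedCash));
// → [ 'B', 'D' ] 4 4000
```

In this toy ledger the largest balance (A) is left off the worklist because a call barely changes its outcome. Greedy packing by ratio is a heuristic: it is fast and usually close, but it is not guaranteed to find the optimal 0/1 knapsack solution.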

Verified results (synthetic benchmark)

Measured on a held-out ledger the model never trained on (node verify.mjs, 13/13 checks):

Metric	Value
Self-cure propensity AUC	0.817
Calibration (ECE / Brier)	0.018 / 0.159
Cash recovered vs. strongest simple baseline	+127%
Cash recovered vs. FIFO at equal hours	2.94×
Capture of the movable-cash ceiling within budget	48.7%
Train time	~0.1 s (4,000 historical invoices)

These describe the model's correctness on the benchmark world, not a recovery figure for any specific company — that distinction is a disclosed seam (see proof/LIMITATIONS.md).
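For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute AUC, Brier score, and expected calibration error (ECE) from predicted probabilities and observed outcomes. It is illustrative only: the function names, the toy data, and the choice of ten equal-width bins for ECE are assumptions, and verify.mjs may compute these differently.

```javascript
// preds: predicted pay probabilities in [0, 1]; labels: 1 = paid, 0 = did not pay.

// Brier score: mean squared error between probability and outcome (lower is better).
function brier(preds, labels) {
  let sum = 0;
  for (let i = 0; i < preds.length; i++) sum += (preds[i] - labels[i]) ** 2;
  return sum / preds.length;
}

// Expected calibration error with equal-width bins (the bin count is an assumption).
function ece(preds, labels, bins = 10) {
  const count = new Array(bins).fill(0);
  const sumP = new Array(bins).fill(0);
  const sumY = new Array(bins).fill(0);
  for (let i = 0; i < preds.length; i++) {
    const b = Math.min(bins - 1, Math.floor(preds[i] * bins));
    count[b] += 1;
    sumP[b] += preds[i];
    sumY[b] += labels[i];
  }
  let err = 0;
  for (let b = 0; b < bins; b++) {
    if (count[b] === 0) continue;
    // Weight each bin's |mean prediction - observed rate| by its share of samples.
    const gap = Math.abs(sumP[b] / count[b] - sumY[b] / count[b]);
    err += (count[b] / preds.length) * gap;
  }
  return err;
}

// AUC: probability that a random payer is scored above a random non-payer
// (ties count half). Quadratic time, which is fine for a small illustration.
function auc(preds, labels) {
  let wins = 0;
  let pairs = 0;
  for (let i = 0; i < preds.length; i++) {
    if (labels[i] !== 1) continue;
    for (let j = 0; j < preds.length; j++) {
      if (labels[j] !== 0) continue;
      pairs += 1;
      if (preds[i] > preds[j]) wins += 1;
      else if (preds[i] === preds[j]) wins += 0.5;
    }
  }
  return pairs === 0 ? NaN : wins / pairs;
}

const preds = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1];
const labels = [1, 1, 0, 1, 0, 0];
console.log(auc(preds, labels), brier(preds, labels), ece(preds, labels));
```

AUC measures ranking quality only, while Brier and ECE also penalize probabilities that are systematically too high or too low, which is why the table reports both kinds of metric.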

Quick start

node run.mjs       # train + score a demo ledger; writes the reports + tool data
node verify.mjs    # run the verification suite (writes verification-report.*)

Open public/tool.html in a browser for the live worklist (drag the capacity dial and watch the engine re-optimize).

What's in the box

src/: the engine (deterministic RNG, synthetic ledger, feature engineering, logistic learner, uplift T-learner, optimizer, evaluation, and orchestration).
run.mjs: demo run → reports/ deliverables + public/ledger.js snapshot.
verify.mjs: the IRS_AUDITOR verification suite.
public/tool.html: the live, dependency-free web interface.
reports/: generated worklist CSV, JSON, and executive summary.
proof/: the full proof package (evidence, claim lineage, limitations, reproduction, auditor challenge).
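A deterministic RNG is what makes a synthetic ledger reproducible from run to run. The sketch below uses mulberry32, a widely used small seeded generator, purely as an illustration; the source does not say which generator src/ uses, and `makeLedger` with its fields is a hypothetical helper.

```javascript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
// Swapped in for illustration; not necessarily the generator used in src/.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: build a tiny synthetic ledger from a seed.
function makeLedger(seed, n) {
  const rand = mulberry32(seed);
  const rows = [];
  for (let i = 0; i < n; i++) {
    rows.push({
      id: i + 1,
      amount: Math.round(500 + rand() * 9500), // between 500 and 10,000
      daysOverdue: Math.floor(rand() * 120),   // 0 to 119 days
    });
  }
  return rows;
}

const first = JSON.stringify(makeLedger(42, 5));
const second = JSON.stringify(makeLedger(42, 5));
console.log(first === second); // true: same seed, same ledger
```

Because `Math.random()` cannot be seeded, a generator along these lines is a common choice when results need to be repeatable without adding dependencies.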

Honest scope

The engine, the math, the optimizer, the reports, and the web tool all run live and are verified. The single seam is real data: drop a real AR export into data/official/ (schema in run-deploy-instructions.md) and the identical checks evaluate it. Until then, every number is a statement about the synthetic benchmark, clearly labelled as such.

Dependency-free Node 18+. No network. No native modules.