Executive Evidence — Cash Recovery Engine

Standard: PROOF_STANDARD = IRS_AUDITOR
Certification: PROOF_INCOMPLETE (machine copy: proof/PROOF_DECISION.json)
Why not CERTIFIED: the customer outcome is a cash-recovery worklist for a real receivables ledger; no real ledger is present in this workspace, so that outcome is a DISCLOSED_SEAM.
All implemented checks pass and are reproducible.

This document answers the ten required IRS_AUDITOR questions. Every number has lineage in proof/CLAIM_EVIDENCE.json and verification-report.json.

1. What exactly is being claimed?

A dependency-free receivables-prioritization engine: an uplift T-learner that separates self-cure from pay-if-worked invoices, a capacity-constrained optimizer that maximizes expected cash per collector-hour, a four-way recovery segmentation, downloadable deliverables, and a live web interface.
On a held-out synthetic AR benchmark (seed-fixed): 13/13 checks pass; self-cure AUC 0.8173; calibration ECE 0.018 (Brier 0.1588); the engine recovers $1,665,446 vs. $733,603 for the strongest simple baseline (+127.0% skill) and 2.94× FIFO at equal hours; it captures 48.7% of the $3,417,896 movable-cash ceiling within a 200-hour budget.
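
The skill and capture percentages can be re-derived from the dollar figures quoted above. The sketch below assumes that "skill" means relative improvement over the strongest simple baseline and that "capture" means engine cash divided by the movable-cash ceiling; both readings are assumptions, though they match the quoted values. The variable names are illustrative and are not taken from the repository.

```javascript
// Figures quoted in the claim above (synthetic benchmark, USD).
const engineCash = 1665446;
const baselineCash = 733603;
const movableCeiling = 3417896;

// Assumed definition: skill = relative improvement over the baseline.
const skillPct = (engineCash / baselineCash - 1) * 100;

// Assumed definition: capture = share of the movable-cash ceiling recovered.
const capturePct = (engineCash / movableCeiling) * 100;

console.log(skillPct.toFixed(1));   // 127.0
console.log(capturePct.toFixed(1)); // 48.7
```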

2. What evidence supports each claim?

verification-report.json (+ .md) and the raw run copies proof/evidence/verify.log and proof/evidence/verification-report.json.
proof/CLAIM_EVIDENCE.json maps every claim → source, method, command, dataset, timestamp, artifact, status.
proof/EXECUTION_TRACE.json records each command, exit code, and stdout sha256; proof/ARTIFACT_MANIFEST.json + proof/CHECKSUMS.json pin every file.
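
As an illustration of what pinning a file by checksum involves, here is a minimal sketch of a SHA-256 comparison. The manifest shape (a flat path-to-hex-digest map) and the function names `sha256Hex` and `findMismatches` are hypothetical; the actual layout of proof/CHECKSUMS.json and the logic of the repository's verifier are not shown in this document.

```javascript
// Hypothetical manifest shape: { "relative/path": "<sha256 hex digest>" }.
async function sha256Hex(data) {
  // Dynamic import works from both ESM (.mjs) and CommonJS files.
  const { createHash } = await import('node:crypto');
  return createHash('sha256').update(data).digest('hex');
}

// Recompute each digest and return the entries that differ from the pinned value.
async function findMismatches(pinned, readFile) {
  const mismatches = [];
  for (const [path, expected] of Object.entries(pinned)) {
    const actual = await sha256Hex(await readFile(path));
    if (actual !== expected) mismatches.push({ path, expected, actual });
  }
  return mismatches;
}

// Usage with in-memory "files" so the sketch runs without touching disk.
const files = { 'a.txt': 'abc', 'b.txt': 'tampered' };
const pinned = {
  'a.txt': 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad',
  'b.txt': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855',
};
const check = findMismatches(pinned, async (path) => files[path]);
check.then((m) => console.log(m.map((x) => x.path))); // [ 'b.txt' ]
```

Against real files, the `readFile` argument could be the function of the same name from `node:fs/promises`.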

3. Can an independent engineer reproduce this claim?

Yes. proof/REPRODUCE.md gives exact commands; everything is seeded and dependency-free (Node 18+). node verify.mjs reproduces the table; node tools/forge-proof-verify.mjs --outcome delivery-package/cash-recovery-engine re-checks every checksum.

4. What assumptions were made?

Payment is a probabilistic outcome driven by observable invoice/customer features; a collector touch adds an uplift that is largest in the "movable middle" (self-cure probability near 0.5). This assumption is built into the synthetic world and is the thing the model must recover.
Larger balances cost disproportionately more collector effort (negotiation, approvals, disputes, legal); the simple baselines ignore effort.
Potential outcomes are monotone (working an invoice never lowers its chance of paying), encoded via a shared threshold so individual uplift is non-negative.

These are modelling assumptions, not measurements from a real ledger.
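
The shared-threshold idea can be sketched as follows: a single uniform draw `u` per invoice decides both potential outcomes, so an invoice that pays untouched also pays when worked. The uplift curve (`maxUplift * 4 * p0 * (1 - p0)`, peaking at 0.5), the constant 0.3, and all function names are hypothetical choices for illustration and are not taken from src/synth.mjs.

```javascript
// Small seeded PRNG (mulberry32-style) so the sketch is deterministic.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical uplift shape: largest at p0 = 0.5, zero at p0 = 0 and p0 = 1.
function treatedProbability(p0, maxUplift = 0.3) {
  const uplift = maxUplift * 4 * p0 * (1 - p0);
  return Math.min(1, p0 + uplift);
}

// One shared threshold u decides both potential outcomes.
function potentialOutcomes(p0, u) {
  const p1 = treatedProbability(p0);
  const y0 = u < p0 ? 1 : 0; // pays without a collector touch
  const y1 = u < p1 ? 1 : 0; // pays if worked
  return { y0, y1 };
}

// Count invoices where working them would lower the outcome.
const rand = mulberry32(42);
let violations = 0;
for (let i = 0; i < 10000; i++) {
  const { y0, y1 } = potentialOutcomes(rand(), rand());
  if (y1 < y0) violations++;
}
console.log(violations); // 0
```

Because `p1 >= p0` by construction, `u < p0` implies `u < p1`, which is why no violation can occur regardless of the seed.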

5. What limitations exist?

See proof/LIMITATIONS.md (authoritative). Headline: all metrics are synthetic; no real ledger; effort/timing are modelled; "cash accelerated" is a projection.

6. What seams exist? (`DISCLOSED_SEAM`)

No real AR ledger present → no company-specific recovery figure produced.
Individual treatment uplift is identifiable here only because the synthetic world exposes both potential outcomes; on real data it is an estimate.
Collector effort-hours and days-to-pay are modelled parameters.
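
To show why modelled effort-hours matter to the result, here is a minimal sketch of a capacity-constrained queue that ranks invoices by expected cash per collector-hour and fills an hour budget greedily. This is a generic greedy heuristic, not necessarily the optimizer used by the engine, and the field names (`expectedUplift`, `balance`, `effortHours`) and the function `buildWorklist` are hypothetical.

```javascript
// Hypothetical invoice fields:
//   expectedUplift: probability gain from a collector touch
//   balance:        invoice balance in USD
//   effortHours:    modelled collector effort for this invoice
function buildWorklist(invoices, budgetHours) {
  const ranked = invoices
    .map((inv) => {
      const expectedCash = inv.expectedUplift * inv.balance;
      return { ...inv, expectedCash, cashPerHour: expectedCash / inv.effortHours };
    })
    .sort((a, b) => b.cashPerHour - a.cashPerHour);

  const worklist = [];
  let hoursUsed = 0;
  for (const inv of ranked) {
    if (hoursUsed + inv.effortHours > budgetHours) continue; // does not fit
    worklist.push(inv);
    hoursUsed += inv.effortHours;
  }
  return { worklist, hoursUsed };
}

const demo = buildWorklist(
  [
    { id: 'A', expectedUplift: 0.4, balance: 10000, effortHours: 2 },
    { id: 'B', expectedUplift: 0.1, balance: 50000, effortHours: 10 },
    { id: 'C', expectedUplift: 0.3, balance: 2000, effortHours: 1 },
  ],
  4
);
console.log(demo.worklist.map((inv) => inv.id), demo.hoursUsed); // [ 'A', 'C' ] 3
```

Invoice B has the largest expected cash but the lowest cash per hour and does not fit the 4-hour budget, so a change in its modelled effort would change the queue.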

7. What was actually executed?

node verify.mjs → 13 structural + benchmark checks on synthetic data (deterministic). Raw output: proof/evidence/verify.log.
node run.mjs → trained the engine and emitted the worklist CSV, JSON, and executive summary, plus the web-tool data snapshot.

8. What was inferred (not directly executed)?

Real-world recovery was not measured and remains unknown. The synthetic skill establishes the engine's ranking/optimization correctness only on the modelled problem; it does not transfer numerically to a specific company's books.
Real-data uplift quality is inferred from the control-AUC check path, not from a live holdout/A-B test.
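
For readers unfamiliar with the metric, the sketch below computes AUC as the probability that a randomly chosen payer outscores a randomly chosen non-payer. It assumes that a control-AUC check scores the self-cure model on invoices that received no collector touch; that reading, the function name `auc`, and the sample data are assumptions and are not taken from verify.mjs.

```javascript
// AUC via pairwise comparison: fraction of (positive, negative) pairs in which
// the positive has the higher score; ties count one half. O(n^2) for clarity.
function auc(scores, labels) {
  let wins = 0;
  let pairs = 0;
  for (let i = 0; i < scores.length; i++) {
    if (labels[i] !== 1) continue;
    for (let j = 0; j < scores.length; j++) {
      if (labels[j] !== 0) continue;
      pairs++;
      if (scores[i] > scores[j]) wins += 1;
      else if (scores[i] === scores[j]) wins += 0.5;
    }
  }
  return pairs === 0 ? NaN : wins / pairs;
}

// Illustrative control group: predicted self-cure score and observed payment.
const controlScores = [0.9, 0.8, 0.4, 0.3];
const controlPaid = [1, 0, 1, 0];
console.log(auc(controlScores, controlPaid)); // 0.75
```

A check of this kind measures only how well untouched invoices are ranked; it does not observe what would have happened had they been worked, which is why it is weaker evidence than a holdout or A-B test.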

9. What remains unverified?

Any recovery metric on a real ledger (no dataset present).
Calibration and uplift estimates against real payment behaviour.
Deployment, security, integration, monitoring, and operational behaviour (not run, not claimed).

10. What evidence would invalidate the claim?

A node verify.mjs run that does not yield 13/13, or that yields different numbers (drift or environment difference).
Editing src/synth.mjs or any seed (the synthetic numbers are conditional on them).
Treating synthetic skill as a real-company recovery figure (explicitly not claimed).
On real data: a holdout/A-B test in which the engine's queue does not beat the business-as-usual queue on realized cash.

Pre-written hostile objections and responses: proof/AUDITOR_OBJECTIONS.md. The generated hostile interrogation: proof/AUDITOR_CHALLENGE.md.