Executive Evidence — SAR Multi-Crop Acreage Estimator
Standard: PROOF_STANDARD = IRS_AUDITOR
Certification: PROOF_INCOMPLETE (machine copy: proof/PROOF_SCORECARD.json).
Why not CERTIFIED: the customer outcome is an official competition leaderboard submission, and the official dataset is unavailable in this workspace, so that outcome is a DISCLOSED_SEAM. All implemented checks pass and are reproducible.
This document answers the ten required IRS_AUDITOR questions. Every number has lineage in proof/CLAIM_EVIDENCE.json and verification-report.json.
1. What exactly is being claimed?
- A dependency-free Round-1 pipeline: SAR feature engineering → ridge
linear-unmixing model → reconciled, schema-valid per-village acreage submission for Rice, Cotton, Maize, Bajra, Groundnut, plus an auto-detecting official-data evaluation mode.
- On a synthetic Kharif benchmark (seed-fixed): 12/12 checks pass; MSE
15434.42 ha²; 59.1% skill vs the naive column-mean baseline; pooled R² 0.5887; total cultivated-area correlation 0.897.
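The headline metrics above can be sketched as follows. This is a minimal illustration, not the project's metric code: the function names are hypothetical, and the definitions (skill as 1 minus the ratio of model MSE to column-mean-baseline MSE, pooled R² computed against the grand mean over all village-crop cells, Pearson correlation on per-village totals) are assumptions about what the report means by those terms.

```javascript
// Hypothetical helpers; rows are villages, columns are crops, values are hectares.

// Mean squared error over every (village, crop) cell.
function mse(actual, predicted) {
  let sum = 0;
  let n = 0;
  for (let i = 0; i < actual.length; i++) {
    for (let j = 0; j < actual[i].length; j++) {
      const d = actual[i][j] - predicted[i][j];
      sum += d * d;
      n++;
    }
  }
  return sum / n;
}

// Naive baseline: predict each crop's mean acreage for every village.
function columnMeanBaseline(actual) {
  const means = new Array(actual[0].length).fill(0);
  for (const row of actual) row.forEach((v, j) => { means[j] += v / actual.length; });
  return actual.map(() => means.slice());
}

// Skill = 1 - MSE_model / MSE_baseline (assumed definition).
function skill(actual, predicted) {
  return 1 - mse(actual, predicted) / mse(actual, columnMeanBaseline(actual));
}

// Pooled R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the grand mean (assumed definition).
function pooledR2(actual, predicted) {
  const cells = actual.flat();
  const grandMean = cells.reduce((s, v) => s + v, 0) / cells.length;
  const ssTot = cells.reduce((s, v) => s + (v - grandMean) ** 2, 0);
  return 1 - (mse(actual, predicted) * cells.length) / ssTot;
}

// Pearson correlation between two equally long numeric arrays.
function pearson(x, y) {
  const mean = (a) => a.reduce((s, v) => s + v, 0) / a.length;
  const mx = mean(x);
  const my = mean(y);
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < x.length; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) ** 2;
    syy += (y[i] - my) ** 2;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Total cultivated area per village (sum across crops).
const rowTotals = (m) => m.map((row) => row.reduce((s, v) => s + v, 0));

// Toy data, not taken from the benchmark.
const actual = [[10, 20], [30, 40]];
const predicted = [[12, 18], [28, 42]];
console.log({
  mse: mse(actual, predicted),
  skill: skill(actual, predicted),
  pooledR2: pooledR2(actual, predicted),
  totalAreaCorrelation: pearson(rowTotals(actual), rowTotals(predicted)),
});
```

Under these assumed definitions, skill and pooled R² differ only in the reference predictor (per-crop means versus one grand mean), which is why the two can be close without being equal.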
2. What evidence supports each claim?
- verification-report.json (+.md) and the raw run copy proof/evidence/verify.log / proof/evidence/verification-report.json.
- proof/CLAIM_EVIDENCE.json maps every claim → source, method, command, dataset, timestamp, artifact, status.
- proof/EXECUTION_TRACE.json records each command, exit code, and stdout sha256; proof/ARTIFACT_MANIFEST.json + proof/CHECKSUMS.json pin every file.
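For orientation, one way to capture a command, its exit code, and the sha256 of its stdout is sketched below. The field names and the traceCommand helper are hypothetical; they are not the actual schema of proof/EXECUTION_TRACE.json, which this document does not show.

```javascript
import { createHash } from 'node:crypto';
import { spawnSync } from 'node:child_process';

// Hex-encoded SHA-256 of a string or Buffer.
function sha256Hex(data) {
  return createHash('sha256').update(data).digest('hex');
}

// Run one command synchronously and record the three facts the trace is
// described as holding. Field names are illustrative only.
function traceCommand(command, args = []) {
  const result = spawnSync(command, args); // stdout is returned as a Buffer
  if (result.error) throw result.error;    // e.g. the executable was not found
  return {
    command: [command, ...args].join(' '),
    exitCode: result.status,
    stdoutSha256: sha256Hex(result.stdout),
  };
}

// Example: trace a trivial child process run with the current Node binary.
const entry = traceCommand(process.execPath, ['-e', 'process.stdout.write("ok")']);
console.log(entry);
```

Hashing the raw stdout bytes rather than a decoded string keeps the digest independent of any encoding assumptions.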
3. Can an independent engineer reproduce this claim?
Yes. proof/REPRODUCE.md gives exact commands; everything is seeded and dependency-free (Node 18+). Running node verify.mjs reproduces the table, and running node tools/forge-proof-verify.mjs --outcome delivery-package/sar-crop-acreage re-checks every checksum.
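To illustrate what "seeded" buys for reproducibility, here is a small dependency-free seeded generator. It uses mulberry32, a widely circulated 32-bit PRNG, purely as a stand-in: the generator actually used in src/synth.mjs is not shown in this document, and the seed value below is arbitrary.

```javascript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
// Stand-in only; not necessarily what the project's synthetic generator uses.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw n values from a freshly seeded generator.
function draw(seed, n) {
  const rng = mulberry32(seed);
  return Array.from({ length: n }, () => rng());
}

// Two runs with the same (arbitrary) seed give identical sequences.
const runA = draw(42, 5);
const runB = draw(42, 5);
console.log(runA.every((v, i) => v === runB[i]));
```

Because Math.random cannot be seeded, a generator of this kind is the usual way to make synthetic data byte-for-byte repeatable without adding a dependency.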
4. What assumptions were made?
- SAR backscatter mixes linearly in power space, weighted by each cover's
area within a village → acreage is a linear-unmixing problem (motivates ridge).
- The synthetic generator's per-crop seasonal signatures, noise model (speckle ENL, soil moisture, measurement), and 8-class land-cover composition approximate village-level zonal means. These are modelling assumptions, not measurements.
- Village area is known and bounds each crop area (used as a physical constraint).
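The assumptions above can be made concrete with a minimal sketch: mixing in linear power units, a closed-form ridge fit, and clamping a prediction to the village area. All function names are hypothetical, the signatures and data are toy values, and the pipeline's real feature set and solver are not shown in this document.

```javascript
// Backscatter in decibels converted to linear power, where mixing is assumed additive.
const dbToPower = (db) => 10 ** (db / 10);

// Village-level backscatter as an area-weighted sum of per-cover signatures.
function mixBackscatter(fractions, signaturesDb) {
  return fractions.reduce((sum, f, k) => sum + f * dbToPower(signaturesDb[k]), 0);
}

// Solve A x = b by Gaussian elimination with partial pivoting.
function solve(A, b) {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]);
  for (let c = 0; c < n; c++) {
    let p = c;
    for (let r = c + 1; r < n; r++) if (Math.abs(M[r][c]) > Math.abs(M[p][c])) p = r;
    [M[c], M[p]] = [M[p], M[c]];
    for (let r = c + 1; r < n; r++) {
      const f = M[r][c] / M[c][c];
      for (let k = c; k <= n; k++) M[r][k] -= f * M[c][k];
    }
  }
  const x = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let s = M[r][n];
    for (let k = r + 1; k < n; k++) s -= M[r][k] * x[k];
    x[r] = s / M[r][r];
  }
  return x;
}

// Ridge regression in closed form: w = (X^T X + lambda I)^-1 X^T y.
// With lambda = 0 this is ordinary least squares and X^T X must be invertible.
function ridgeFit(X, y, lambda) {
  const d = X[0].length;
  const A = Array.from({ length: d }, (_, i) =>
    Array.from({ length: d }, (_, j) =>
      X.reduce((s, row) => s + row[i] * row[j], 0) + (i === j ? lambda : 0)));
  const b = Array.from({ length: d }, (_, i) =>
    X.reduce((s, row, r) => s + row[i] * y[r], 0));
  return solve(A, b);
}

// Physical constraint: a crop area lies between 0 and the village area.
const clampArea = (area, villageArea) => Math.min(Math.max(area, 0), villageArea);

// Toy usage: lambda > 0 shrinks the weights relative to the unregularised fit.
const X = [[1, 0], [0, 1], [1, 1]];
const y = [2, 3, 5];
console.log(ridgeFit(X, y, 0), ridgeFit(X, y, 1), clampArea(150, 100));
```

Averaging in power rather than in decibels matters because the decibel scale is logarithmic; an area-weighted mean of dB values would not correspond to the assumed physical mixture.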
5. What limitations exist?
See proof/LIMITATIONS.md (authoritative). Headline: all metrics are synthetic; no official dataset; Stage A (raster→zonal) not executed; CV estimates, not a leaderboard score.
6. What seams exist? (DISCLOSED_SEAM)
- Official competition dataset absent → no leaderboard MSE produced.
- Raster→zonal-statistics (Stage A) documented, not executed (needs GDAL/rasterio/SNAP/GEE).
- The "official" verification run used a synthetic-derived fixture
(tools/synth-to-official.mjs) — proves the path executes, not real accuracy.
7. What was actually executed?
- node verify.mjs → 12 structural+evaluation checks on synthetic data (deterministic). Raw output: proof/evidence/verify.log.
- Official path on a synthetic-derived fixture → 16/16 checks + submission
emission. Raw output: proof/evidence/verify-official-fixture.{json,log}, params in proof/evidence/fixture-params.json.
8. What was inferred (not directly executed)?
- Real-world field accuracy is unknown: it was not measured. The synthetic result (pooled R² 0.59) speaks only to implementation correctness on the modelled mixing problem and does not transfer numerically to real data.
- Stage A correctness on real GeoTIFFs is inferred from the documented method
and the unit-checked ingestion round-trip, not from a real run.
9. What remains unverified?
- Any metric on real official data (no dataset).
- Calibration/speckle/zonal code paths against real rasters.
- Deployment, security, monitoring, load, and operational behaviour (not run,
not claimed).
10. What evidence would invalidate the claim?
- A node verify.mjs run not yielding 12/12, or yielding different numbers (drift/environment difference).
- Editing src/synth.mjs or the seed (synthetic numbers are conditional on them).
- Treating synthetic skill as real-data accuracy (explicitly not claimed).
- On real data: out-of-fold CV MSE failing the skill/R²/constraint thresholds in
proof/VERIFY.md.
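The first invalidation condition (a run that drifts from the pinned numbers) could be checked mechanically along these lines. The pinned values are the ones quoted in this document; the key names, the tolerances (half a unit in the last quoted digit), and the findDrift helper are hypothetical and do not reflect the real layout of verification-report.json.

```javascript
// Values quoted in this document; key names are hypothetical.
const pinned = {
  checksPassed: 12,
  checksTotal: 12,
  mse: 15434.42,
  skillPercent: 59.1,
  pooledR2: 0.5887,
  totalAreaCorrelation: 0.897,
};

// Assumed tolerances: half a unit in the last digit quoted; counts must match exactly.
const tolerance = {
  mse: 0.005,
  skillPercent: 0.05,
  pooledR2: 0.00005,
  totalAreaCorrelation: 0.0005,
};

// Return the keys whose values are missing, non-numeric, or outside tolerance.
function findDrift(report, expected = pinned, tol = tolerance) {
  return Object.keys(expected).filter((key) => {
    const got = report[key];
    if (typeof got !== 'number' || Number.isNaN(got)) return true;
    return Math.abs(got - expected[key]) > (tol[key] ?? 0);
  });
}

// Example: a run that matches, and one whose MSE moved.
console.log(findDrift({ ...pinned }));
console.log(findDrift({ ...pinned, mse: 15500 }));
```

An empty result means the run agrees with the pinned figures within the assumed rounding; any listed key points at the number that needs explaining.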
Pre-written hostile objections and responses: proof/AUDITOR_OBJECTIONS.md.