SAR Multi-Crop Acreage Estimator

SAR Multi-Crop Acreage Estimator — Round 1

A complete, dependency-free SAR → crop-acreage pipeline for Round 1 of a SAR-based agricultural-intelligence challenge: *Sown Area Progression & Multi-Crop Acreage Estimation*. It estimates the cultivated hectares of **Rice, Cotton, Maize, Bajra, and Groundnut** per village from multi-temporal X-band SAR imagery, and emits the exact competition submission file. Scoring metric: MSE across all crop columns and all villages (lower is better).
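As a rough illustration of the scoring metric, the sketch below pools squared errors over every village and every crop column with equal weight. The equal weighting, the column order in `CROPS`, and the function name `pooledMse` are assumptions made for this example, not details confirmed by the challenge rules.

```javascript
// Hypothetical column order; the real submission header may differ.
const CROPS = ["Rice", "Cotton", "Maize", "Bajra", "Groundnut"];

// Mean squared error pooled over every village and every crop column.
// `truthRows` and `predRows` are arrays of rows, one area (ha) per crop.
function pooledMse(truthRows, predRows) {
  if (truthRows.length !== predRows.length) {
    throw new Error("row count mismatch");
  }
  let sum = 0;
  let n = 0;
  for (let i = 0; i < truthRows.length; i++) {
    for (let j = 0; j < CROPS.length; j++) {
      const err = predRows[i][j] - truthRows[i][j];
      sum += err * err;
      n++;
    }
  }
  return sum / n;
}

const truthRows = [[10, 0, 5, 0, 2], [0, 8, 0, 3, 0]];
const predRows = [[9, 1, 5, 0, 2], [0, 8, 0, 5, 0]];
console.log(pooledMse(truthRows, predRows)); // (1 + 1 + 4) / 10 = 0.6
```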

Honest scope. The official competition tiles/labels are not in this
workspace, so the pipeline runs and is verified end-to-end on a
*physically-motivated synthetic Kharif benchmark*, with a documented seam to
drop in real data. See intake/outcome-contract.md and
certification-report.md.

Quick start

cd delivery-package/sar-crop-acreage
node run.mjs          # synthetic demo: train, hold out, write data/submission.csv, report MSE
node verify.mjs       # run the full verification suite

No installs. Pure Node (tested on Node 24). No network access required.

What's inside

| Stage | File | What it does |
| --- | --- | --- |
| Calibration & speckle | src/sar.mjs | dB ↔ linear power, multi-temporal Lee filter, temporal stacking |
| Agricultural extent | src/sar.mjs | temporal-variability cropland indicator |
| Feature engineering | src/sar.mjs | per-village temporal stats, phenology windows, absolute backscattered-power features |
| Estimator (primary) | src/ridge.mjs | multi-output ridge linear unmixing (physically matched, closed-form) |
| Estimator (residual) | src/forest.mjs | optional random forest for non-linear residual |
| Pipeline | src/pipeline.mjs | train → predict → reconcile → MSE/R² → submission CSV |
| Synthetic benchmark | src/synth.mjs | per-crop seasonal signatures, area-weighted linear-power mixing, realistic noise |
| Real-data seam | src/ingest.mjs | long-format zonal-stats CSV → model-ready stacks |
| CLI | run.mjs | synthetic or real-data runs |
| Verification | verify.mjs | 11 screening checks + benchmark report |
| Explorer (UI) | public/tool.html | interactive forward/inverse SAR season visualizer |
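The dB ↔ linear power step listed for the calibration stage can be sketched as follows. This is a minimal illustration, not the code in src/sar.mjs; the helper names `dbToLinear`, `linearToDb`, and `meanDb` are made up for the example. It shows why averaging is done in linear power: the mean of dB values is not the dB of the mean power.

```javascript
// Convert backscatter between decibels and linear power.
const dbToLinear = (db) => Math.pow(10, db / 10);
const linearToDb = (power) => 10 * Math.log10(power);

// Average a dB series in the linear-power domain, then convert back to dB.
function meanDb(dbValues) {
  const meanPower =
    dbValues.reduce((sum, db) => sum + dbToLinear(db), 0) / dbValues.length;
  return linearToDb(meanPower);
}

const dbSeries = [-10, -20];
console.log(meanDb(dbSeries).toFixed(2));   // "-12.60" (averaged in linear power)
console.log(((-10 + -20) / 2).toFixed(2));  // "-15.00" (naive dB average)
```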

The method in one paragraph

X-band backscatter mixes across land covers in linear power, weighted by each cover's area. Multiplying a village's mean linear power by its area yields a quantity that is *linear* in the unknown crop-area vector, so estimating crop acreage is a linear unmixing problem — addressed here with regularized least squares (ridge), a physically-matched and numerically well-conditioned estimator. Phenology-resolved features (early/mid/late season power, temporal slopes, cross/co ratios) let the model separate crops with different seasonal calendars (e.g. rice's flood-driven low early co-pol vs. maize's late cross-pol peak).
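To make the linear-unmixing idea concrete, here is a small self-contained sketch of closed-form multi-output ridge, W = (XᵀX + λI)⁻¹XᵀY, applied to a toy two-crop, two-date mixing model. It is an illustration under stated assumptions rather than the contents of src/ridge.mjs: the signature matrix `SIG`, the training areas, the value of λ, and all function names are invented for the example, and the toy data are noise-free.

```javascript
// Solve A * W = B by Gauss-Jordan elimination with partial pivoting.
function solveLinear(A, B) {
  const n = A.length;
  const M = A.map((row, i) => row.concat(B[i])); // augmented matrix [A | B]
  for (let col = 0; col < n; col++) {
    let piv = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[piv][col])) piv = r;
    }
    const tmp = M[col];
    M[col] = M[piv];
    M[piv] = tmp;
    const d = M[col][col];
    for (let c = col; c < M[col].length; c++) M[col][c] /= d;
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = M[r][col];
      for (let c = col; c < M[r].length; c++) M[r][c] -= f * M[col][c];
    }
  }
  return M.map((row) => row.slice(n));
}

// Closed-form ridge: accumulate X^T X + lambda*I and X^T Y, then solve.
function ridgeFit(X, Y, lambda) {
  const p = X[0].length;
  const k = Y[0].length;
  const A = Array.from({ length: p }, (_, i) =>
    Array.from({ length: p }, (_, j) => (i === j ? lambda : 0)));
  const B = Array.from({ length: p }, () => new Array(k).fill(0));
  X.forEach((x, r) => {
    for (let i = 0; i < p; i++) {
      for (let j = 0; j < p; j++) A[i][j] += x[i] * x[j];
      for (let j = 0; j < k; j++) B[i][j] += x[i] * Y[r][j];
    }
  });
  return solveLinear(A, B); // p x k weight matrix
}

const ridgePredict = (W, x) =>
  W[0].map((_, j) => x.reduce((s, xi, i) => s + xi * W[i][j], 0));

// Hypothetical per-hectare linear-power signatures: rows are dates, columns are crops.
const SIG = [[0.02, 0.10], [0.12, 0.05]];
// Forward model: area-weighted mixing in linear power (mean power times village area).
const mix = (areas) => SIG.map((row) => row[0] * areas[0] + row[1] * areas[1]);

const trainAreas = [[10, 0], [0, 10], [5, 5], [2, 8]]; // hectares per crop
const W = ridgeFit(trainAreas.map(mix), trainAreas, 1e-6);
const estimate = ridgePredict(W, mix([3, 7]));
console.log(estimate.map((v) => v.toFixed(2))); // [ '3.00', '7.00' ]
```

With noise-free toy data and a tiny λ the areas are recovered almost exactly; on real imagery, speckle and unmodelled land covers would make the choice of λ matter.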

Using real competition data

The model transfers to real inputs through one ingestion seam. See run-deploy-instructions.md for the full raster → zonal-stats recipe. In short:

# After producing a per-village zonal-stats CSV (village_id,date,pol,backscatter_db,village_area_ha)
node run.mjs --zonal path/to/zonal.csv --labels path/to/train-labels.csv
# -> data/submission.csv for the unlabelled (test) villages
node run.mjs --zonal zonal.csv --labels train-labels.csv --cv   # honest CV-scored MSE on labelled villages
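For orientation, the long-format layout named above (village_id,date,pol,backscatter_db,village_area_ha) could be grouped into per-village time series roughly as sketched below. This is not the src/ingest.mjs implementation. The sample rows, the polarization labels `HH` and `HV`, the ISO date format, and the function name `parseZonal` are assumptions, and the naive comma split assumes no quoted fields.

```javascript
// Hypothetical sample in the long format described above.
const sampleCsv = `village_id,date,pol,backscatter_db,village_area_ha
V001,2024-07-01,HH,-9.5,120
V001,2024-06-01,HH,-14.0,120
V001,2024-06-01,HV,-20.5,120
V002,2024-06-01,HH,-11.0,85`;

// Group rows into village -> polarization -> date-sorted series.
function parseZonal(text) {
  const lines = text.trim().split(/\r?\n/);
  const cols = lines[0].split(",");
  const villages = new Map();
  for (const line of lines.slice(1)) {
    const cells = line.split(",");
    const rec = Object.fromEntries(cols.map((c, i) => [c, cells[i]]));
    if (!villages.has(rec.village_id)) {
      villages.set(rec.village_id, {
        areaHa: Number(rec.village_area_ha),
        series: {},
      });
    }
    const village = villages.get(rec.village_id);
    if (!village.series[rec.pol]) village.series[rec.pol] = [];
    village.series[rec.pol].push({
      date: rec.date,
      db: Number(rec.backscatter_db),
    });
  }
  // ISO dates (YYYY-MM-DD) sort correctly as plain strings.
  for (const village of villages.values()) {
    for (const pol of Object.keys(village.series)) {
      village.series[pol].sort((a, b) => a.date.localeCompare(b.date));
    }
  }
  return villages;
}

const stacks = parseZonal(sampleCsv);
console.log(stacks.get("V001").series.HH.map((obs) => obs.db)); // [ -14, -9.5 ]
```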

Same verification structure, official data

verify.mjs runs one verification structure against two benchmarks:

  • Synthetic (always): passes anywhere with no data.
  • Official (auto-detected): drop the official files into data/official/:
      • data/official/zonal.csv (long-format zonal statistics)
      • data/official/train-labels.csv (ID,Rice_ha,…,Groundnut_ha)

Then node verify.mjs runs the identical evaluation checks on the real data via out-of-fold k-fold cross-validation (an unbiased, leaderboard-aligned MSE — the test labels are hidden), and writes a real data/official/submission.csv. The report flips to `evaluatedOn: "official + synthetic"`.
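The out-of-fold idea can be sketched as below: each labelled village is predicted by a model fitted without it, and the MSE is computed on those held-out predictions. This is only an illustration of the technique, not the logic in verify.mjs. The round-robin fold assignment, the one-feature stand-in model, and all names are assumptions made for this example.

```javascript
// Assign each index to one of k folds (round-robin; a real run would likely shuffle first).
function kFoldIndices(n, k) {
  const folds = Array.from({ length: k }, () => []);
  for (let i = 0; i < n; i++) folds[i % k].push(i);
  return folds;
}

// Out-of-fold predictions: every sample is predicted by a model that never saw it.
function outOfFold(xs, ys, k, fit, predict) {
  const oof = new Array(ys.length);
  for (const testIdx of kFoldIndices(ys.length, k)) {
    const held = new Set(testIdx);
    const trainIdx = ys.map((_, i) => i).filter((i) => !held.has(i));
    const model = fit(trainIdx.map((i) => xs[i]), trainIdx.map((i) => ys[i]));
    for (const i of testIdx) oof[i] = predict(model, xs[i]);
  }
  return oof;
}

// Stand-in model: one-feature least squares through the origin (illustrative only).
const fitSlope = (xs, ys) => {
  let num = 0;
  let den = 0;
  xs.forEach((x, i) => {
    num += x * ys[i];
    den += x * x;
  });
  return num / den;
};
const predictSlope = (slope, x) => slope * x;

const cvX = [1, 2, 3, 4, 5, 6];
const cvY = [2, 4, 6, 8, 10, 12];
const oofPred = outOfFold(cvX, cvY, 3, fitSlope, predictSlope);
const cvMse = oofPred.reduce((s, p, i) => s + (p - cvY[i]) ** 2, 0) / cvY.length;
console.log(cvMse); // 0, because the toy data are exactly linear
```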

Want to exercise the official path before the real files arrive? Generate an official-format fixture from the physics model:

node tools/synth-to-official.mjs data/official --n 1000 --T 12 --test 0.2
node verify.mjs        # now evaluates the official path too

Reuse across rounds

The preprocessing, feature engineering, and ingestion seam are written to be reused and refined in later rounds (crop condition, yield, production), as the challenge intends. The acreage maps produced here are the foundation layer.