Overview
SAR Multi-Crop Acreage Estimator — Round 1
A complete, dependency-free SAR → crop-acreage pipeline for Round 1 of a SAR-based agricultural-intelligence challenge: *Sown Area Progression & Multi-Crop Acreage Estimation*. It estimates the cultivated hectares of **Rice, Cotton, Maize, Bajra, and Groundnut** per village from multi-temporal X-band SAR imagery, and emits the exact competition submission file. Scoring metric: MSE across all crop columns and all villages (lower is better).
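As a rough illustration of the scoring metric, the sketch below computes MSE over every (village, crop) cell. It assumes the score is a plain average over all cells with equal weight; the column order and the helper name `mseAllCells` are hypothetical and not taken from the competition rules or from this repository.

```javascript
// Hypothetical column order; the real submission header may differ.
const CROPS = ["Rice", "Cotton", "Maize", "Bajra", "Groundnut"];

// Mean squared error over every (village, crop) cell, equally weighted.
// `truth` and `pred` are arrays of rows, one row of hectares per village.
function mseAllCells(truth, pred) {
  if (truth.length !== pred.length) throw new Error("row count mismatch");
  let sumSq = 0;
  let n = 0;
  for (let i = 0; i < truth.length; i++) {
    for (let j = 0; j < CROPS.length; j++) {
      const d = pred[i][j] - truth[i][j];
      sumSq += d * d;
      n++;
    }
  }
  return sumSq / n;
}

// Two toy villages: one error of 2 ha and one error of 1 ha.
const truth = [[10, 0, 5, 0, 2], [0, 8, 0, 3, 0]];
const pred = [[12, 0, 5, 0, 2], [0, 8, 0, 3, 1]];
console.log(mseAllCells(truth, pred)); // (2^2 + 1^2) / 10 = 0.5
```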
Honest scope. The official competition tiles/labels are not in this
workspace, so the pipeline runs and is verified end-to-end on a
*physically-motivated synthetic Kharif benchmark*, with a documented seam to
drop in real data. See intake/outcome-contract.md and certification-report.md.
Quick start
cd delivery-package/sar-crop-acreage
node run.mjs # synthetic demo: train, hold out, write data/submission.csv, report MSE
node verify.mjs # run the full verification suite
No installs. Pure Node (tested on Node 24). No network access required.
What's inside
| Stage | File | What it does |
|---|---|---|
| Calibration & speckle | src/sar.mjs | dB ↔ linear power, multi-temporal Lee filter, temporal stacking |
| Agricultural extent | src/sar.mjs | temporal-variability cropland indicator |
| Feature engineering | src/sar.mjs | per-village temporal stats, phenology windows, absolute backscattered-power features |
| Estimator (primary) | src/ridge.mjs | multi-output ridge linear unmixing (physically matched, closed-form) |
| Estimator (residual) | src/forest.mjs | optional random forest for non-linear residual |
| Pipeline | src/pipeline.mjs | train → predict → reconcile → MSE/R² → submission CSV |
| Synthetic benchmark | src/synth.mjs | per-crop seasonal signatures, area-weighted linear-power mixing, realistic noise |
| Real-data seam | src/ingest.mjs | long-format zonal-stats CSV → model-ready stacks |
| CLI | run.mjs | synthetic or real-data runs |
| Verification | verify.mjs | 11 screening checks + benchmark report |
| Explorer (UI) | public/tool.html | interactive forward/inverse SAR season visualizer |
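The calibration stage in the table converts between dB and linear power. A minimal sketch of that conversion follows, assuming the standard 10·log10 relationship; the function names are illustrative and are not the exports of src/sar.mjs. It also shows why averaging is usually done in linear power rather than in dB.

```javascript
// Convert backscatter between decibels and linear power.
const dbToLinear = (db) => Math.pow(10, db / 10);
const linearToDb = (p) => 10 * Math.log10(p);

// Average in linear power, then convert the mean back to dB.
function meanDb(valuesDb) {
  const meanPower =
    valuesDb.reduce((sum, v) => sum + dbToLinear(v), 0) / valuesDb.length;
  return linearToDb(meanPower);
}

const samples = [-10, -20];
// The linear-domain mean is about -12.60 dB, not the naive dB average of -15.
console.log(meanDb(samples).toFixed(2));
```

Averaging dB values directly would underweight bright scatterers, which matters when per-village means feed an area-weighted mixing model.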
The method in one paragraph
X-band backscatter mixes across land covers in linear power, weighted by each cover's area. Multiplying a village's mean linear power by its area yields a quantity that is *linear* in the unknown crop-area vector, so estimating crop acreage is a linear unmixing problem — addressed here with regularized least squares (ridge), a physically-matched and numerically well-conditioned estimator. Phenology-resolved features (early/mid/late season power, temporal slopes, cross/co ratios) let the model separate crops with different seasonal calendars (e.g. rice's flood-driven low early co-pol vs. maize's late cross-pol peak).
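The paragraph above can be illustrated with a small closed-form ridge fit. This is a sketch under stated assumptions, not the code in src/ridge.mjs: the two-crop signatures in `S`, the regularization value `1e-6`, and the choice to regress crop areas directly on area-scaled power features are all made up for illustration.

```javascript
// Solve M W = B by Gauss-Jordan elimination with partial pivoting.
// M is n x n and B is n x k, both stored as arrays of rows.
function solve(M, B) {
  const n = M.length;
  const aug = M.map((row, i) => [...row, ...B[i]]);
  for (let col = 0; col < n; col++) {
    let piv = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(aug[r][col]) > Math.abs(aug[piv][col])) piv = r;
    }
    [aug[col], aug[piv]] = [aug[piv], aug[col]];
    const p = aug[col][col];
    if (Math.abs(p) < 1e-12) throw new Error("singular system");
    for (let j = col; j < aug[col].length; j++) aug[col][j] /= p;
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = aug[r][col];
      for (let j = col; j < aug[r].length; j++) aug[r][j] -= f * aug[col][j];
    }
  }
  return aug.map((row) => row.slice(n));
}

// Multi-output ridge: W = (X^T X + lambda I)^-1 X^T Y.
function ridgeFit(X, Y, lambda) {
  const d = X[0].length;
  const k = Y[0].length;
  const G = Array.from({ length: d }, () => new Array(d).fill(0));
  const C = Array.from({ length: d }, () => new Array(k).fill(0));
  for (let i = 0; i < X.length; i++) {
    for (let a = 0; a < d; a++) {
      for (let b = 0; b < d; b++) G[a][b] += X[i][a] * X[i][b];
      for (let c = 0; c < k; c++) C[a][c] += X[i][a] * Y[i][c];
    }
  }
  for (let a = 0; a < d; a++) G[a][a] += lambda;
  return solve(G, C);
}

// Predict one row of crop areas from one feature row.
const predict = (x, W) =>
  W[0].map((_, c) => x.reduce((sum, xv, a) => sum + xv * W[a][c], 0));

// Hypothetical per-hectare linear-power signatures: 2 crops x 3 dates.
const S = [[0.02, 0.10, 0.08], [0.06, 0.07, 0.12]];
// Forward model: (mean power x village area) is the area-weighted signature sum.
const mix = (areas) => S[0].map((_, t) => areas[0] * S[0][t] + areas[1] * S[1][t]);

const trainAreas = [[10, 5], [3, 12], [8, 8], [0, 6], [7, 0]];
const W = ridgeFit(trainAreas.map(mix), trainAreas, 1e-6);
const estimate = predict(mix([4, 9]), W);
console.log(estimate.map((v) => v.toFixed(2))); // close to 4 ha and 9 ha
```

With noise-free toy data the recovered areas sit very close to the true ones. Real imagery would call for a larger regularization value, typically chosen by cross-validation.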
Using real competition data
The model transfers to real inputs through one ingestion seam. See run-deploy-instructions.md for the full raster → zonal-stats recipe. In short:
# After producing a per-village zonal-stats CSV (village_id,date,pol,backscatter_db,village_area_ha)
node run.mjs --zonal path/to/zonal.csv --labels path/to/train-labels.csv
# -> data/submission.csv for the unlabelled (test) villages
node run.mjs --zonal zonal.csv --labels train-labels.csv --cv # honest CV-scored MSE on labelled villages
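For orientation, here is a hedged sketch of how a long-format zonal-stats CSV with the columns listed above could be grouped into per-village time series. It is not the code in src/ingest.mjs. The function name `parseZonal`, the polarization labels `HH` and `HV`, the ISO date format, and the sample rows are assumptions made for the example, and the parser ignores quoted fields.

```javascript
// Parse a long-format zonal-stats CSV into per-village stacks:
// Map(village_id -> { areaHa, series: { pol: [{ date, power }] } }).
function parseZonal(csvText) {
  const [header, ...rows] = csvText.trim().split(/\r?\n/);
  const cols = header.split(",").map((c) => c.trim());
  const idx = (name) => {
    const i = cols.indexOf(name);
    if (i < 0) throw new Error(`missing column: ${name}`);
    return i;
  };
  const iId = idx("village_id");
  const iDate = idx("date");
  const iPol = idx("pol");
  const iDb = idx("backscatter_db");
  const iArea = idx("village_area_ha");

  const villages = new Map();
  for (const line of rows) {
    if (!line.trim()) continue;
    const f = line.split(",").map((c) => c.trim());
    const id = f[iId];
    if (!villages.has(id)) {
      villages.set(id, { areaHa: Number(f[iArea]), series: {} });
    }
    const v = villages.get(id);
    // Store linear power so later averaging and mixing stay linear.
    const power = 10 ** (Number(f[iDb]) / 10);
    (v.series[f[iPol]] ||= []).push({ date: f[iDate], power });
  }
  // Assumes ISO dates (YYYY-MM-DD), which sort correctly as strings.
  for (const v of villages.values()) {
    for (const pol of Object.keys(v.series)) {
      v.series[pol].sort((a, b) => (a.date < b.date ? -1 : a.date > b.date ? 1 : 0));
    }
  }
  return villages;
}

// Made-up rows, deliberately out of date order.
const sample = `village_id,date,pol,backscatter_db,village_area_ha
V001,2024-07-15,HH,-10,120.5
V001,2024-06-15,HH,-20,120.5
V001,2024-06-15,HV,-17,120.5`;
const stacks = parseZonal(sample);
console.log(stacks.get("V001").series.HH.map((p) => p.date));
```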
Same verification structure, official data
verify.mjs runs one verification structure against two benchmarks:
- Synthetic (always) — passes anywhere with no data.
- Official (auto-detected) — drop the official files into
data/official/:
  - data/official/zonal.csv (long-format zonal statistics)
  - data/official/train-labels.csv (ID,Rice_ha,…,Groundnut_ha)
Then node verify.mjs runs the identical evaluation checks on the real data via out-of-fold k-fold cross-validation. Because the test labels are hidden, the out-of-fold MSE on the labelled villages serves as an approximately unbiased, leaderboard-aligned estimate. The run also writes a real data/official/submission.csv, and the report flips to `evaluatedOn: "official + synthetic"`.
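Out-of-fold scoring can be sketched as follows. This is an illustration only, not the logic in verify.mjs: the function name `outOfFoldMse`, the `i % k` fold assignment, and the mean-predictor stand-in model are assumptions chosen to keep the example small.

```javascript
// Out-of-fold MSE: every village is scored by a model that never saw it.
// Fold assignment is deterministic (village i goes to fold i % k);
// a real run would likely shuffle with a fixed seed first.
function outOfFoldMse(X, Y, k, fit, predict) {
  const n = X.length;
  let sumSq = 0;
  let count = 0;
  for (let fold = 0; fold < k; fold++) {
    const trainIdx = [];
    const testIdx = [];
    for (let i = 0; i < n; i++) (i % k === fold ? testIdx : trainIdx).push(i);
    const model = fit(trainIdx.map((i) => X[i]), trainIdx.map((i) => Y[i]));
    for (const i of testIdx) {
      const p = predict(model, X[i]);
      for (let j = 0; j < p.length; j++) {
        sumSq += (p[j] - Y[i][j]) ** 2;
        count++;
      }
    }
  }
  return sumSq / count;
}

// Stand-in model: predict the training mean of each output column.
const fitMean = (X, Y) =>
  Y[0].map((_, j) => Y.reduce((sum, row) => sum + row[j], 0) / Y.length);
const predictMean = (model) => model;

const X = [[1], [2], [3], [4]];
const Y = [[2], [4], [6], [8]];
console.log(outOfFoldMse(X, Y, 2, fitMean, predictMean)); // 8
```

Any pair of `fit` and `predict` functions with the same shape can be swapped in for the mean predictor.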
Want to exercise the official path before the real files arrive? Generate an official-format fixture from the physics model:
node tools/synth-to-official.mjs data/official --n 1000 --T 12 --test 0.2
node verify.mjs # now evaluates the official path too
Reuse across rounds
The preprocessing, feature engineering, and ingestion seam are written to be reused and refined in later rounds (crop condition, yield, production), as the challenge intends. The acreage maps produced here are the foundation layer.