US SLED Contact Directory — Education layer
Architecture — US SLED Contact Directory
Pipeline
NCES CCD LEA Directory (public JSON API, Urban Institute portal)
│  51 HTTPS requests (one per state + DC), provenance captured
▼
src/collect.mjs ── normalize via src/schema.mjs ──► dataset/
│                                                   ├─ contacts.json / contacts.csv
│                                                   ├─ by-state/<ST>.{json,csv}
│                                                   ├─ summary.json
│                                                   └─ manifest.json (source URL, HTTP status, sha256, ts)
│                                                   evidence/
│                                                   ├─ fetch-log.json
│                                                   └─ raw-sample-{AK,DC}.json
▼
verify.mjs     ──► verification-report.{json,md}  (17 MUST_PASS checks, offline)
build-site.mjs ──► public/index.html              (self-contained, grouped by state)
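The provenance fields that the diagram lists for manifest.json (source URL, HTTP status, sha256, ts), and the offline re-hash that the verifier performs against them, could look roughly like the sketch below. It is an illustration rather than the code in src/collect.mjs or verify.mjs: the helper names (sha256Hex, provenanceEntry, matchesManifest), the property names, and the placeholder URL are all assumptions.

```javascript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a string or Buffer.
const sha256Hex = (body) => createHash("sha256").update(body).digest("hex");

// Hypothetical helper: one manifest entry for one fetched response.
// The property names are assumed, not taken from the real manifest.json.
function provenanceEntry(sourceUrl, httpStatus, body, now = new Date()) {
  return { sourceUrl, httpStatus, sha256: sha256Hex(body), ts: now.toISOString() };
}

// Offline check: re-hash the stored bytes and compare with the recorded digest.
function matchesManifest(body, entry) {
  return sha256Hex(body) === entry.sha256;
}

// Example with an in-memory body and a placeholder URL (no network access).
const body = '{"results":[]}';
const entry = provenanceEntry(
  "https://example.invalid/directory?fips=02",
  200,
  body,
  new Date("2024-01-01T00:00:00Z"),
);
console.log(entry.ts);                          // 2024-01-01T00:00:00.000Z
console.log(matchesManifest(body, entry));       // true
console.log(matchesManifest(body + " ", entry)); // false: any byte change is detected
```

Hashing the exact bytes that were written, rather than a re-serialized object, keeps the check independent of JSON key order and whitespace.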
Modules
- src/fips.mjs: the 50 states + DC (FIPS ↔ USPS ↔ name) and the CCD agency_type → label map.
- src/schema.mjs: the canonical column order, raw→record normalization, phone formatting, CSV serialization, and the by-state grouping helper. Shared by the collector, verifier, and site builder, so all three agree by construction.
- src/collect.mjs: fetches each jurisdiction (with retry + pagination), sets a User-Agent (the API serves a bot-challenge page to the default Node UA), captures provenance, retains raw samples, and writes the dataset.
- verify.mjs: deterministic, offline integrity, completeness, and no-fabrication checks; also rebuilds the site and confirms it embeds the directory.
- build-site.mjs: embeds a compact per-state payload into a single HTML file with a state filter + search.
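To make the role of the shared schema module concrete, here is a minimal sketch of the kind of helpers it is described as holding: a canonical column order, raw→record normalization, phone formatting, CSV serialization, and by-state grouping. Everything named here is an assumption for illustration: the four-column list, the raw field names (leaid, lea_name, state_location, phone), the phone display format, and the function names are not taken from src/schema.mjs.

```javascript
// Hypothetical column order; the real list lives in src/schema.mjs.
const COLUMNS = ["id", "name", "state", "phone"];

// Format a 10-digit US number as (AAA) BBB-CCCC; anything else becomes ""
// so that an unusable source value is never "repaired" by guessing.
function formatPhone(raw) {
  const digits = String(raw ?? "").replace(/\D/g, "");
  return digits.length === 10
    ? `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`
    : "";
}

// Map a raw API row to a record with exactly the canonical columns.
// The raw field names are assumed for this example.
function normalize(raw) {
  return {
    id: String(raw.leaid ?? ""),
    name: String(raw.lea_name ?? "").trim(),
    state: String(raw.state_location ?? "").toUpperCase(),
    phone: formatPhone(raw.phone),
  };
}

// Quote a CSV field only when it contains a comma, quote, or line break.
function csvField(value) {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Serialize records with a header row, always in COLUMNS order.
function toCsv(records) {
  const rows = records.map((r) => COLUMNS.map((c) => csvField(r[c])).join(","));
  return [COLUMNS.join(","), ...rows].join("\n") + "\n";
}

// Group records by their state code.
function groupByState(records) {
  const groups = {};
  for (const r of records) (groups[r.state] ??= []).push(r);
  return groups;
}

const records = [
  { leaid: "0200001", lea_name: " Example District, North ", state_location: "ak", phone: "9075550100" },
].map(normalize);

console.log(toCsv(records));
// id,name,state,phone
// 0200001,"Example District, North",AK,(907) 555-0100
console.log(Object.keys(groupByState(records))); // [ 'AK' ]
```

Because every consumer would import the same COLUMNS and normalize, a column added in one place shows up in the collector output, the verifier's expectations, and the site payload together.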
Design choices
- Provenance over trust. Every record points at the exact dataset URL it came
from; the manifest records HTTP status + sha256 + timestamp; raw responses are retained. Verification re-hashes the dataset against the manifest.
- No fabrication, mechanically enforced. The four fields the source cannot
supply are carried but empty, and verify.mjs fails if any is non-empty.
- Single source, pinned year. Keeps v1 deterministic and reproducible; wider
SLED coverage and enrichment are left as disclosed, sourced seams for v2.
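The "carried but empty" rule from the no-fabrication bullet can be sketched as a small check. This is a hedged illustration, not the logic in verify.mjs: the four unsourced fields are not named in this section, so the two field names below are placeholders, and the function name findFabrication is likewise an assumption.

```javascript
// Placeholder names: this section does not list the four unsourced fields.
const UNSOURCED_FIELDS = ["placeholder_field_a", "placeholder_field_b"];

// Return one violation per record/field pair where an unsourced field is
// either missing from the record or holds anything other than "".
function findFabrication(records, fields = UNSOURCED_FIELDS) {
  const violations = [];
  records.forEach((record, index) => {
    for (const field of fields) {
      if (!(field in record)) {
        violations.push({ index, field, reason: "missing column" });
      } else if (record[field] !== "") {
        violations.push({ index, field, reason: "non-empty value" });
      }
    }
  });
  return violations;
}

const sample = [
  { name: "District One", placeholder_field_a: "", placeholder_field_b: "" },
  { name: "District Two", placeholder_field_a: "guess", placeholder_field_b: "" },
];
const violations = findFabrication(sample);
console.log(violations);
// [ { index: 1, field: 'placeholder_field_a', reason: 'non-empty value' } ]
```

A verifier in this style would treat any non-empty result as a failed MUST_PASS check and exit non-zero, so a fabricated value cannot ship unnoticed.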