Overview
Forge-CRS — Autonomous Cyber Reasoning System
Forge-CRS is a working Cyber Reasoning System (CRS): software that, with no human in the loop, discovers a vulnerability in a piece of code, exploits it (produces a minimal proof-of-vulnerability), patches the source, and proves the patch both closes the hole and preserves behaviour.
This is the same loop DARPA's AI Cyber Challenge (AIxCC) was created to push on: *find-then-fix at machine speed*. Forge-CRS implements that loop end-to-end and runs it as a single command against a benchmark of real-world vulnerability classes.
DISCOVER ──▶ EXPLOIT ──▶ PATCH ──▶ VERIFY
(fuzz) (triage/PoV) (rewrite) (PoV neutralized + regression held)
What it actually does (live, executed)
Running node bin/crs.mjs run autonomously processes a benchmark of five seeded-but-real vulnerability classes and, for each one:
| Stage | Technique |
|---|---|
| Discover | Coverage-guided fuzzing using real V8 block coverage as feedback, with dictionary + structure-aware mutation. New-coverage inputs are kept in the corpus (AFL/libFuzzer principle). |
| Detect | A multi-signal crash oracle per bug class: prototype-pollution canary, path-containment, shell-metacharacter, wall-clock hang (ReDoS, run in a sandboxed worker), and unhandled out-of-bounds. |
| Exploit | Crash minimization (delta-debugging / binary search) to a tight PoV, then independent CWE classification. |
| Patch | Semantic patch synthesis — a per-class source rewrite (key-guard, base-dir containment, argv-not-shell, linear regex, bounds clamp). |
| Verify | The patch is accepted only if the PoV is neutralized *and* every functional regression case still passes. |
The benchmark (CWE-classed, all from the real-world OSS bug catalogue):
| Target | CWE | Class |
|---|---|---|
config-merge | CWE-1321 | Prototype Pollution |
path-store | CWE-22 | Path Traversal |
task-runner | CWE-78 | OS Command Injection |
regex-validate | CWE-1333 | ReDoS / catastrophic backtracking |
binary-reader | CWE-125 | Out-of-bounds Read |
Latest verified run: 5/5 discovered, 5/5 classified, 5/5 remediated, deterministic across identical seeds, in a few seconds of wall-clock. See verification-report.md.
What it does NOT do (honest scope)
Forge-CRS is a faithful, fully-working microcosm of an AIxCC-style CRS — not a competition-grade system. Read certification-report.md for the full disclosure. In short:
- The executed pipeline operates on the JavaScript/Node language adapter.
The engine (registry/adapter interface) is language-agnostic, but the C/C++/Java adapters that AIxCC actually scores — native sanitizers (ASan/UBSan), libFuzzer/AFL++ harnessing — are architectural seams, not live in this package.
- Patch synthesis uses bug-class repair strategies, not free-form
program repair of novel/unknown bugs.
- It runs at single-file benchmark scale, not whole-repo OSS scale.
Quick start
cd app
node bin/crs.mjs run # run the autonomous campaign + print the table
node bin/crs.mjs run --report # also write .work/campaign-report.json
node bin/crs.mjs run --target path-store # one target
node ../verify.mjs # full MUST_PASS verification (writes the report)
Requires Node.js ≥ 20 (uses node:inspector precise coverage and worker_threads). Zero external dependencies.
See user-guide.md for how to read the output and add your own target, and run-deploy-instructions.md to wire it into CI.