Forge-CRS — Autonomous Cyber Reasoning System

Overview

Forge-CRS — Autonomous Cyber Reasoning System

Forge-CRS is a working Cyber Reasoning System (CRS): software that, with no human in the loop, discovers a vulnerability in a piece of code, exploits it (produces a minimal proof-of-vulnerability), patches the source, and proves the patch both closes the hole and preserves behaviour.

This is the same loop DARPA's AI Cyber Challenge (AIxCC) was created to push on: *find-then-fix at machine speed*. Forge-CRS implements that loop end-to-end and runs it as a single command against a benchmark of real-world vulnerability classes.

DISCOVER ──▶ EXPLOIT ──▶ PATCH ──▶ VERIFY
 (fuzz)     (triage/PoV)  (rewrite)  (PoV neutralized + regression held)

What it actually does (live, executed)

Running node bin/crs.mjs run autonomously processes a benchmark of five seeded-but-real vulnerability classes and, for each one:

Stage	Technique
Discover	Coverage-guided fuzzing using real V8 block coverage as feedback, with dictionary + structure-aware mutation. New-coverage inputs are kept in the corpus (AFL/libFuzzer principle).
Detect	A multi-signal crash oracle per bug class: prototype-pollution canary, path-containment, shell-metacharacter, wall-clock hang (ReDoS, run in a sandboxed worker), and unhandled out-of-bounds.
Exploit	Crash minimization (delta-debugging / binary search) to a tight PoV, then independent CWE classification.
Patch	Semantic patch synthesis — a per-class source rewrite (key-guard, base-dir containment, argv-not-shell, linear regex, bounds clamp).
Verify	The patch is accepted only if the PoV is neutralized and every functional regression case still passes.

The benchmark (CWE-classed, all from the real-world OSS bug catalogue):

Target	CWE	Class
`config-merge`	CWE-1321	Prototype Pollution
`path-store`	CWE-22	Path Traversal
`task-runner`	CWE-78	OS Command Injection
`regex-validate`	CWE-1333	ReDoS / catastrophic backtracking
`binary-reader`	CWE-125	Out-of-bounds Read

Latest verified run: 5/5 discovered, 5/5 classified, 5/5 remediated, deterministic across identical seeds, in a few seconds of wall-clock. See verification-report.md.

What it does NOT do (honest scope)

Forge-CRS is a faithful, fully-working microcosm of an AIxCC-style CRS — not a competition-grade system. Read certification-report.md for the full disclosure. In short:

The executed pipeline operates on the JavaScript/Node language adapter.

The engine (registry/adapter interface) is language-agnostic, but the C/C++/Java adapters that AIxCC actually scores — native sanitizers (ASan/UBSan), libFuzzer/AFL++ harnessing — are architectural seams, not live in this package.

Patch synthesis uses bug-class repair strategies, not free-form

program repair of novel/unknown bugs.

It runs at single-file benchmark scale, not whole-repo OSS scale.

Quick start

cd app
node bin/crs.mjs run            # run the autonomous campaign + print the table
node bin/crs.mjs run --report   # also write .work/campaign-report.json
node bin/crs.mjs run --target path-store   # one target
node ../verify.mjs              # full MUST_PASS verification (writes the report)

Requires Node.js ≥ 20 (uses node:inspector precise coverage and worker_threads). Zero external dependencies.

See user-guide.md for how to read the output and add your own target, and run-deploy-instructions.md to wire it into CI.