Forge-CRS — Architecture
Forge-CRS is organized as a language-agnostic engine plus a registry of target knowledge. Discovery, exploitation, and repair are kept strictly independent so no stage can "cheat" using another stage's knowledge.
                    ┌───────────────────────────────────────────┐
                    │               registry.mjs                │
                    │  per-target: makeArgs · oracle · patch ·  │
                    │  regression · seeds · mutateKind · kind   │
                    └─────────────────────┬─────────────────────┘
                                          │ (knowledge)
          ┌───────────────┬───────────────┼───────────────┬───────────────┐
          ▼               ▼               ▼               ▼               ▼
    ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐
    │  fuzzer   │──▶│  oracle   │──▶│  triage   │──▶│  patcher  │──▶│ validator │
    │ DISCOVER  │   │  DETECT   │   │  EXPLOIT  │   │   PATCH   │   │  VERIFY   │
    └─────┬─────┘   └───────────┘   └───────────┘   └───────────┘   └─────┬─────┘
          │                                                               │
    coverage.mjs  (V8 block coverage)                 PoV-dead + regression-green
    mutators.mjs  (string/buffer/json)
    execute.mjs   (in-process + sandboxed worker)
Components
prng.mjs: seedable mulberry32 RNG. Every stochastic choice flows
through it, which is what makes a whole campaign reproducible from a seed.
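
A minimal sketch of a mulberry32 generator, to show why a single seed pins down a whole run. The `randInt` and `pick` helpers are hypothetical conveniences for this example, not necessarily what prng.mjs exports.

```javascript
// mulberry32: a small seedable 32-bit PRNG. Returns a function that yields
// floats in [0, 1). The same seed always produces the same sequence.
function mulberry32(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helpers: every random decision takes the rng as an argument
// instead of reaching for Math.random().
function randInt(rng, n) {
  return Math.floor(rng() * n);
}
function pick(rng, items) {
  return items[randInt(rng, items.length)];
}

const first = mulberry32(1234);
const second = mulberry32(1234);
const seqA = [first(), first(), first()];
const seqB = [second(), second(), second()];
console.log(seqA.every((v, i) => v === seqB[i])); // same seed, same sequence
```

Passing the generator explicitly, rather than sharing a global, keeps each stage's randomness traceable to the campaign seed.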
coverage.mjs: connects an in-process inspector Session and turns on
V8 *precise, block-level* coverage. After each execution it reports how many code ranges were reached for the first time. That novelty signal is the "guided" in coverage-guided fuzzing. Precise coverage only instruments scripts compiled after it is enabled, so the fuzzer starts coverage before importing the target.
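
The novelty signal could be computed roughly as follows. This is a sketch built on Node's `node:inspector` module and the DevTools `Profiler` domain; the function names and the `scriptId:start-end` keying scheme are assumptions made for illustration, not the module's confirmed internals.

```javascript
import inspector from 'node:inspector';

// Promise wrapper around the callback-style Session.post.
function post(session, method, params = {}) {
  return new Promise((resolve, reject) => {
    session.post(method, params, (err, result) => (err ? reject(err) : resolve(result)));
  });
}

// Start precise, block-level coverage. Call this BEFORE importing the target,
// otherwise the target's functions are compiled without block counters.
async function startCoverage() {
  const session = new inspector.Session();
  session.connect();
  await post(session, 'Profiler.enable');
  await post(session, 'Profiler.startPreciseCoverage', { callCount: true, detailed: true });
  return session;
}

// Count executed ranges that have never been seen before.
// `seen` is a Set of "scriptId:start-end" keys (hypothetical keying scheme).
function countNovelRanges(seen, scripts) {
  let novel = 0;
  for (const script of scripts) {
    for (const fn of script.functions) {
      for (const range of fn.ranges) {
        if (range.count === 0) continue; // range exists but was not executed
        const key = `${script.scriptId}:${range.startOffset}-${range.endOffset}`;
        if (!seen.has(key)) {
          seen.add(key);
          novel += 1;
        }
      }
    }
  }
  return novel;
}

// Snapshot coverage after one execution and report how much of it is new.
async function noveltyAfterRun(session, seen) {
  const { result } = await post(session, 'Profiler.takePreciseCoverage');
  return countNovelRanges(seen, result);
}
```

A real implementation would probably also filter `result` by script URL so that only the target's own code contributes to the signal.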
mutators.mjs: three mutator families (string havoc + token dictionary,
buffer byte-havoc + length-field bait, structure-aware JSON generation over a key pool that includes dangerous keys). The dictionary is generic and security-relevant; the fuzzer is never told which token solves which target.
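
As an illustration of the string family only, here is a sketch of stacked havoc operators drawing on a shared dictionary. The dictionary entries, operator set, and `mutateString` name are hypothetical; the point is that the tokens are generic and nothing ties a token to a particular target.

```javascript
// A generic, security-flavoured token dictionary (illustrative values only).
const DICTIONARY = ['../', '__proto__', '${', ';', '%00', '<', 'a'.repeat(32)];

// Each havoc operator takes the current string and an rng returning [0, 1).
const STRING_OPS = [
  // insert a dictionary token at a random offset
  (s, rng) => {
    const at = Math.floor(rng() * (s.length + 1));
    const token = DICTIONARY[Math.floor(rng() * DICTIONARY.length)];
    return s.slice(0, at) + token + s.slice(at);
  },
  // delete one character
  (s, rng) => {
    if (s.length === 0) return s;
    const at = Math.floor(rng() * s.length);
    return s.slice(0, at) + s.slice(at + 1);
  },
  // duplicate a random slice in place
  (s, rng) => {
    const from = Math.floor(rng() * (s.length + 1));
    const to = from + Math.floor(rng() * (s.length - from + 1));
    return s.slice(0, to) + s.slice(from, to) + s.slice(to);
  },
  // overwrite one character with a random printable ASCII character
  (s, rng) => {
    if (s.length === 0) return s;
    const at = Math.floor(rng() * s.length);
    const ch = String.fromCharCode(32 + Math.floor(rng() * 95));
    return s.slice(0, at) + ch + s.slice(at + 1);
  },
];

// Apply between 1 and maxStacked randomly chosen operators in sequence.
function mutateString(input, rng, maxStacked = 4) {
  let out = input;
  const rounds = 1 + Math.floor(rng() * maxStacked);
  for (let i = 0; i < rounds; i++) {
    const op = STRING_OPS[Math.floor(rng() * STRING_OPS.length)];
    out = op(out, rng);
  }
  return out;
}
```

Because the rng is injected, feeding the same seeded generator reproduces the same mutation sequence.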
execute.mjs: execInProcess (capture the return value or the thrown error) and
runWithTimeout (run in a worker with a hard wall-clock budget; a timeout is the ReDoS crash signal).
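
The two execution modes might look like the sketch below. The signatures are assumptions: in particular, shipping the target as a source string (`fnSource`) to an `eval` worker is a shortcut chosen to keep the example self-contained, whereas a real harness would more likely load the target module by path.

```javascript
import { Worker } from 'node:worker_threads';

// Run the target in this process, capturing either the value or the error.
function execInProcess(fn, input) {
  try {
    return { ok: true, value: fn(input) };
  } catch (error) {
    return { ok: false, error };
  }
}

// Worker body, evaluated as a CommonJS script inside the worker thread.
const WORKER_SOURCE = `
  const { parentPort, workerData } = require('node:worker_threads');
  const fn = eval('(' + workerData.fnSource + ')');
  try {
    parentPort.postMessage({ ok: true, value: fn(workerData.input) });
  } catch (error) {
    parentPort.postMessage({ ok: false, error: String(error) });
  }
`;

// Run the target in a worker and kill it once the wall-clock budget expires.
// A result of { timedOut: true } is what a ReDoS oracle would look for.
function runWithTimeout(fnSource, input, budgetMs) {
  return new Promise((resolve) => {
    const worker = new Worker(WORKER_SOURCE, { eval: true, workerData: { fnSource, input } });
    let settled = false;
    const finish = (outcome) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      worker.terminate(); // also interrupts a thread stuck in a tight loop
      resolve(outcome);
    };
    const timer = setTimeout(() => finish({ ok: false, timedOut: true }), budgetMs);
    worker.once('message', finish);
    worker.once('error', (error) => finish({ ok: false, error: String(error) }));
    worker.once('exit', () => finish({ ok: false, error: 'worker exited without a result' }));
  });
}
```

The worker is needed because a synchronous hang cannot be interrupted from within the same thread; only an outside timer can enforce the budget.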
fuzzer.mjs: the DISCOVER loop. In-process targets use coverage feedback
to grow the corpus; worker targets use a time-bounded loop. Stops at the first oracle-confirmed crash and returns the raw crashing input.
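
A sketch of the in-process, coverage-guided variant of that loop. Every collaborator is injected and every name here (`discover`, `mutate`, `run`, `isCrash`) is hypothetical; the sketch only shows the control flow: mutate a corpus entry, execute, stop on an oracle-confirmed crash, otherwise keep the input if it reached new code.

```javascript
// Collaborators (all assumed shapes):
//   mutate(parent, rng) -> candidate input
//   run(input)          -> { outcome, novel }  (novel = newly covered ranges)
//   isCrash(outcome)    -> boolean, the oracle's verdict
function discover({ seeds, mutate, run, isCrash, rng, maxIterations }) {
  const corpus = [...seeds];
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const parent = corpus[Math.floor(rng() * corpus.length)];
    const candidate = mutate(parent, rng);
    const { outcome, novel } = run(candidate);
    if (isCrash(outcome)) {
      // Return the raw input; shrinking it is the triage stage's job.
      return { crashed: true, input: candidate, iterations: iteration, corpusSize: corpus.length };
    }
    if (novel > 0) corpus.push(candidate); // keep inputs that reached new code
  }
  return { crashed: false, input: null, iterations: maxIterations, corpusSize: corpus.length };
}
```

Note that the loop never inspects the bug class: it sees only the novelty count and the oracle's yes/no answer.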
triage.mjs: minimizes the crash (per-shape delta-debugging / binary
search) into a tight PoV and maps the signal to a CWE *independently* of the target's ground-truth label.
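
For string-shaped inputs, minimization could be sketched as a simplified delta-debugging pass: delete chunks, keep any deletion that still crashes, then halve the chunk size. The `stillCrashes` callback and the signal fields read by `classify` are assumptions of this example; the two CWE identifiers are the standard ones for regular-expression complexity and prototype pollution.

```javascript
// Shrink a crashing string by deleting chunks, halving the chunk size each pass.
// `stillCrashes` re-runs the oracle on a candidate (hypothetical callback).
function minimize(input, stillCrashes) {
  let current = input;
  for (let chunk = Math.max(1, current.length >> 1); chunk >= 1; chunk >>= 1) {
    let i = 0;
    while (i < current.length) {
      const candidate = current.slice(0, i) + current.slice(i + chunk);
      if (stillCrashes(candidate)) {
        current = candidate; // chunk was irrelevant, retry at the same offset
      } else {
        i += chunk; // chunk is needed, move past it
      }
    }
  }
  return current;
}

// Map an observed crash signal to a CWE without consulting the target's label.
// The signal shape ({ timedOut, pollutedPrototype }) is an assumption.
function classify(signal) {
  if (signal.timedOut) return 'CWE-1333'; // inefficient regular expression complexity
  if (signal.pollutedPrototype) return 'CWE-1321'; // prototype pollution
  return 'unclassified';
}

const pov = minimize('aaaa<x>bbbb', (s) => s.includes('<x>'));
console.log(pov); // prints "<x>"
```

The final pass with a chunk size of one guarantees that no single character can be removed from the result without losing the crash.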
patcher.mjs: applies the registry's semantic patch (a precise source
rewrite) to a throwaway copy of the target so the original is never mutated and the patch is never trusted until proven.
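
One way to keep the original untouched is to write the rewritten source into a fresh temporary directory. In this sketch the `{ find, replace }` pair is a hypothetical stand-in for whatever form the registry's patch actually takes, and `patchCopy` is an illustrative name.

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Apply a source rewrite to a throwaway copy of `targetPath`.
// The original file is only ever read, never written.
function patchCopy(targetPath, rewrite) {
  const original = fs.readFileSync(targetPath, 'utf8');
  if (!original.includes(rewrite.find)) {
    // A patch that silently changes nothing would look "applied" but fix nothing.
    throw new Error('patch anchor not found; refusing to emit an unpatched copy');
  }
  // A function replacer avoids special "$" patterns in the replacement text.
  const patched = original.replace(rewrite.find, () => rewrite.replace);
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'forge-patch-'));
  const patchedPath = path.join(dir, path.basename(targetPath));
  fs.writeFileSync(patchedPath, patched);
  return patchedPath;
}
```

A copy placed in a temporary directory will not resolve the target's relative imports, so a fuller version would need to copy or rewrite those as well.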
validator.mjs: re-runs the PoV against the patched code (must no longer
trip the oracle) and the full functional regression suite (must all pass). A patch that breaks behaviour is rejected.
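
The acceptance rule can be sketched as a conjunction of the two checks. The argument names and the `{ input, expected }` regression-case shape are hypothetical.

```javascript
// Capture the value or the thrown error of one call.
function capture(fn, input) {
  try {
    return { ok: true, value: fn(input) };
  } catch (error) {
    return { ok: false, error };
  }
}

// A patch is accepted only if the PoV is dead AND every regression case passes.
function validate({ patchedFn, pov, oracle, regression }) {
  const povDead = !oracle(capture(patchedFn, pov));
  const failures = regression.filter(({ input, expected }) => {
    const outcome = capture(patchedFn, input);
    return !outcome.ok || outcome.value !== expected;
  });
  return { accepted: povDead && failures.length === 0, povDead, failures };
}
```

Reporting `povDead` and `failures` separately makes it clear whether a rejected patch failed to fix the bug or fixed it by breaking legitimate behaviour.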
crs.mjs: the orchestrator that drives all stages per target and
summarizes the campaign.
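
Putting the stages together, the orchestration might be sketched like this. The `stages` bundle, its function signatures, and the `groundTruthCwe` field are all hypothetical; the sketch only illustrates the order of the stages and where the classifier's answer is compared with the label.

```javascript
// Drive one target through DISCOVER -> EXPLOIT -> PATCH -> VERIFY.
async function runTarget(target, stages) {
  const found = await stages.discover(target);
  if (!found.crashed) return { name: target.name, discovered: false };
  const { pov, cwe } = await stages.triage(target, found.input);
  const patched = await stages.patch(target);
  const verdict = await stages.validate(target, patched, pov);
  return {
    name: target.name,
    discovered: true,
    // Compared after the fact; triage is expected not to read the label.
    cweAgrees: cwe === target.groundTruthCwe,
    remediated: verdict.accepted,
  };
}

// Run every target in order and produce a one-line campaign summary.
async function runCampaign(targets, stages) {
  const results = [];
  for (const target of targets) results.push(await runTarget(target, stages));
  const count = (key) => results.filter((r) => r[key]).length;
  const total = targets.length;
  return {
    results,
    summary:
      `${count('discovered')}/${total} discovered, ` +
      `${count('cweAgrees')}/${total} classified, ` +
      `${count('remediated')}/${total} remediated`,
  };
}
```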
Why the stages are decoupled
A real CRS must not "solve" a bug by teaching the patcher exactly which input the fuzzer used. Here:
- the fuzzer only sees mutators + coverage; it does not know the bug class;
- the oracle independently decides whether an execution is a real
vulnerability;
- the classifier maps the crash to a CWE without reading the ground truth
(the campaign then checks they agree — a real accuracy signal);
- the patch is validated by re-running the PoV against the patched code, not by
trusting that the rewrite "should" work.
This independence is what gives the verification claim, *5/5 discovered, classified, and remediated, deterministically*, its weight: every result is confirmed by a stage that was never handed the answer.