Forge-CRS — Autonomous Cyber Reasoning System

Forge-CRS — Run & Deploy

Local run

cd delivery-package/aixcc-cyber-reasoning-system/app
node bin/crs.mjs run            # autonomous campaign
node ../verify.mjs              # MUST_PASS verification + reports
  • Runtime: Node.js ≥ 20 (uses node:inspector/promises for precise coverage and node:worker_threads). No npm install is required; there are zero external dependencies.

  • Outputs: .work/ holds throwaway patched copies and (with --report) campaign-report.json. .work/ is disposable and git-ignored.

  • Verification artifacts: verify.mjs writes verification-report.json, delivery-package/verification-report.md, and machine-readable evidence under delivery-package/evidence/.

Continuous-integration usage

Forge-CRS is built to gate a build: it exits non-zero unless every benchmark target is fully remediated and the run is deterministic.

# .github/workflows/crs.yml
name: cyber-reasoning
on: [push, pull_request]
jobs:
  crs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - name: Run CRS verification
        working-directory: delivery-package/aixcc-cyber-reasoning-system
        run: node verify.mjs

For a regression gate on application code, register the application's modules as targets in app/src/registry.mjs, then execute node bin/crs.mjs run --quiet in the pipeline; a non-zero exit blocks the merge.
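
The gating rule described above (exit non-zero unless every target is remediated and the run is deterministic) can be sketched as a small pure function. This is an illustration only, not the logic inside bin/crs.mjs: the result shape { id, remediated } and the name gateExitCode are hypothetical, and determinism is approximated here by comparing the serialized results of two runs.

```javascript
// Hypothetical result shape (an assumption, not the package's real format):
//   { id: string, remediated: boolean }

// Returns 0 only if there is at least one target, every target is
// remediated, and two runs produced identical results.
function gateExitCode(firstRun, secondRun) {
  const hasTargets = firstRun.length > 0;
  const allRemediated = firstRun.every((target) => target.remediated);
  // Determinism check: both runs must serialize to the same string.
  const deterministic =
    JSON.stringify(firstRun) === JSON.stringify(secondRun);
  return hasTargets && allRemediated && deterministic ? 0 : 1;
}

// Usage: two runs with the same (hypothetical) outcomes pass the gate.
const runA = [
  { id: 'target-a', remediated: true },
  { id: 'target-b', remediated: true },
];
const runB = [
  { id: 'target-a', remediated: true },
  { id: 'target-b', remediated: true },
];
process.exitCode = gateExitCode(runA, runB);
```

Setting process.exitCode rather than calling process.exit() lets pending output flush before the process ends, which matters when a CI log is the only record of the failure.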

Tuning

  • Search budget: runCampaign({ maxIters: { 'in-process': 20000, worker: 400 } }) raises the per-kind iteration cap for harder targets.

  • Seed: change --seed to explore different fuzzing trajectories; a fixed seed guarantees reproducibility for audits.

  • Timeout: each worker target sets timeoutMs (the ReDoS wall-clock budget) in its registry entry.
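
To make the seed and search-budget knobs concrete, here is a minimal sketch of a seeded, iteration-capped fuzz loop. It is not Forge-CRS's engine: mulberry32 is a small public-domain PRNG swapped in for illustration, and toyTarget and fuzz are hypothetical names. It shows only why a fixed seed plus a fixed cap yields a reproducible trajectory.

```javascript
// mulberry32: a small seeded 32-bit PRNG (not cryptographic).
// Used here as a stand-in; the PRNG Forge-CRS uses is not specified.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical target: throws on one specific two-byte input.
function toyTarget(bytes) {
  if (bytes[0] === 0x42 && bytes[1] === 0x07) throw new Error('crash');
}

// Runs at most maxIters random inputs drawn from the seeded PRNG and
// stops at the first input that makes the target throw.
function fuzz({ seed, maxIters }) {
  const rand = mulberry32(seed);
  for (let i = 0; i < maxIters; i++) {
    const input = [Math.floor(rand() * 256), Math.floor(rand() * 256)];
    try {
      toyTarget(input);
    } catch {
      return { found: true, iteration: i, input };
    }
  }
  return { found: false, iteration: maxIters, input: null };
}

// Same seed and same cap give the same trajectory and the same result.
const first = fuzz({ seed: 1234, maxIters: 20000 });
const second = fuzz({ seed: 1234, maxIters: 20000 });
console.log(JSON.stringify(first) === JSON.stringify(second)); // prints true
```

Raising maxIters only extends the same trajectory, so a crash found under a small cap is still found at the same iteration under a larger one; changing the seed is what moves the search to different inputs.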

Scaling notes (the seams)

To move from the benchmark to real OSS at scale you would implement additional language adapters behind the same registry interface:

  1. C/C++: compile targets with ASan/UBSan, harness with libFuzzer/AFL++, and map sanitizer reports to the oracle signals.

  2. Java/JVM: Jazzer harnesses + JVM sanitizers.

  3. Corpus + repo scale: persistent corpora, parallel workers, and a build system integration to fuzz the whole repository rather than one module.

These are deliberately out of scope for this package and are disclosed as non-live in certification-report.md.