VERIFY — agent-memory-manager
How verification runs and what each check asserts. The harness is verify.mjs; the unit suite is under test/ (compiled to dist/test).
How it runs
node verify.mjs performs, in order:
- Build: compiles TypeScript with tsc.
- Unit suite: runs node --test over the compiled tests and asserts zero failures.
- Synthetic corpus: a seeded PRNG (mulberry32, seed 20260625) builds N topics, each with a unique vocabulary, so same-topic texts share tokens (high cosine) and cross-topic texts do not. This yields a known retrieval answer key.
- Dual-store benchmark: the same corpus is loaded into the InMemoryStore and a live PGlite Postgres+pgvector engine; retrieval precision@k is measured against the answer key on both.
- Policy/behaviour checks: decay demotion, tier eviction, fleet sync, determinism, and metrics.
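The synthetic-corpus step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in verify.mjs: mulberry32 is the commonly circulated 32-bit PRNG of that name, while buildCorpus, bagOfWords, cosine, fingerprint, the token naming scheme, and the sizes (4 topics, 6-word vocabularies, 4 tokens per text) are hypothetical choices made for the example.

```javascript
// mulberry32: a small seeded PRNG that returns floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical corpus builder: every topic owns a disjoint vocabulary,
// and each text is a random subset of its topic's vocabulary.
function buildCorpus(seed, topicCount = 4, docsPerTopic = 3, vocabSize = 6, tokensPerDoc = 4) {
  const rand = mulberry32(seed);
  const corpus = [];
  for (let t = 0; t < topicCount; t++) {
    for (let d = 0; d < docsPerTopic; d++) {
      // Token names embed the topic index, so vocabularies never overlap.
      const vocab = Array.from({ length: vocabSize }, (_, w) => `t${t}w${w}`);
      // Partial Fisher-Yates shuffle: the first tokensPerDoc slots become the text.
      for (let i = 0; i < tokensPerDoc; i++) {
        const j = i + Math.floor(rand() * (vocabSize - i));
        [vocab[i], vocab[j]] = [vocab[j], vocab[i]];
      }
      corpus.push({ topic: t, text: vocab.slice(0, tokensPerDoc).join(" ") });
    }
  }
  return corpus;
}

// Term-count vector for a whitespace-tokenised text.
function bagOfWords(text) {
  const counts = new Map();
  for (const tok of text.split(" ")) counts.set(tok, (counts.get(tok) ?? 0) + 1);
  return counts;
}

// Cosine similarity between two term-count vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [tok, c] of a) {
    normA += c * c;
    dot += c * (b.get(tok) ?? 0);
  }
  for (const c of b.values()) normB += c * c;
  return dot / Math.sqrt(normA * normB);
}

// FNV-1a-style hash over the serialised corpus (assumed fingerprint scheme).
function fingerprint(corpus) {
  let h = 0x811c9dc5;
  for (const ch of JSON.stringify(corpus)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

const corpus = buildCorpus(20260625);
const vec = (i) => bagOfWords(corpus[i].text);
console.log("same topic :", cosine(vec(0), vec(1))); // >= 0.5: two 4-of-6 subsets share at least 2 tokens
console.log("cross topic:", cosine(vec(0), vec(3))); // 0: the vocabularies are disjoint
console.log("reproducible:", fingerprint(corpus) === fingerprint(buildCorpus(20260625)));
```

Because each text's topic label is known at generation time, the corpus doubles as the answer key: a query built from one topic's vocabulary should retrieve only texts carrying that topic label.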
What each check asserts
| Check | Assertion |
|---|---|
| Unit suite passes | node --test reports 0 failing tests (24 tests). |
| In-memory precision@1 == 1.0 | Top hit for each topic query belongs to that topic. |
| In-memory precision@5 ≥ 0.95 | ≥95% of the top-5 belong to the queried topic. |
| Deterministic retrieval | Re-running the same query yields identical ordering. |
| pgvector precision@1 / @5 | Same thresholds, computed over real pgvector. |
| pgvector vs JS cosine | 1 - (embedding <=> q) agrees with the JS cosine within 1e-5. |
| pgvector persistence | Row count in the DB equals the number of stored memories. |
| Store parity | In-memory and pgvector pick the same top-1 topic for every query. |
| summarizeThread | Produces a non-empty summary and persists a summary-typed memory that is retrievable. |
| getContextForTask budget | Assembled context respects the character budget and ranks the summary first. |
| Time decay demotion | A hot memory whose decayed importance drops below the hot floor is demoted to warm. |
| Hot capacity | After overflow, the hot tier holds ≤ its configured capacity. |
| Cold eviction | The cold tier is capped and memory_evict_total{tier=cold} is emitted. |
| Fleet sync | Fleet-scoped memories replicate to a peer; local-scoped ones do not. |
| Metrics | Counters and latency histograms are recorded. |
| Reproducible corpus | The same seed regenerates an identical corpus fingerprint. |
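To make the precision and cosine-agreement rows concrete, here is a minimal sketch of how such assertions could be computed. The function names (precisionAtK, meanPrecisionAtK, agreesWithin) and the shape of the sample data are assumptions for illustration and are not taken from verify.mjs. The only external fact relied on is that pgvector's <=> operator returns cosine distance, so similarity is 1 minus that value.

```javascript
// Fraction of the top-k hits whose topic matches the queried topic.
// Divides by k, so a store that returns fewer than k hits is penalised.
function precisionAtK(rankedHits, expectedTopic, k) {
  const relevant = rankedHits
    .slice(0, k)
    .filter((hit) => hit.topic === expectedTopic).length;
  return relevant / k;
}

// Mean precision@k over a set of queries, each with a known expected topic.
function meanPrecisionAtK(runs, k) {
  const total = runs.reduce(
    (sum, run) => sum + precisionAtK(run.hits, run.expectedTopic, k),
    0
  );
  return total / runs.length;
}

// Compare a pgvector cosine distance with a JS-side cosine similarity.
function agreesWithin(pgDistance, jsCosine, tolerance = 1e-5) {
  return Math.abs(1 - pgDistance - jsCosine) <= tolerance;
}

// Made-up retrieval results for two topic queries (illustrative data only).
const runs = [
  { expectedTopic: 0, hits: [0, 0, 0, 0, 1].map((topic) => ({ topic })) },
  { expectedTopic: 1, hits: [1, 1, 1, 1, 1].map((topic) => ({ topic })) },
];

console.log("precision@1:", meanPrecisionAtK(runs, 1));
console.log("precision@5:", meanPrecisionAtK(runs, 5));
console.log("cosine agrees:", agreesWithin(0.25, 0.750001));
```

In this sample the first query has one off-topic hit in its top five, so the precision@1 == 1.0 check would pass while the precision@5 ≥ 0.95 check would fail. That is the kind of regression the two thresholds are meant to separate.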
Outputs
- verification-report.json / .md: machine- and human-readable results.
- evidence/verification-results.json: copy of the report.
- proof/evidence/verify.log: raw stdout captured by the Proof Layer.
A non-zero exit code is returned if any check fails.
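The link between check results and the exit code could look something like the sketch below. It is an assumed structure, not the harness itself: runChecks and the check objects are hypothetical, and the checks are synchronous for brevity, whereas a harness talking to PGlite would most likely be asynchronous. process.exitCode is the standard Node.js property for setting the exit status without terminating abruptly.

```javascript
// Run named checks, record pass/fail for each, and derive an exit code.
function runChecks(checks) {
  const results = checks.map(({ name, run }) => {
    try {
      return { name, pass: Boolean(run()) };
    } catch (err) {
      // A thrown error counts as a failed check rather than a crash.
      return { name, pass: false, error: String(err) };
    }
  });
  const failed = results.filter((r) => !r.pass).length;
  return { results, failed, exitCode: failed === 0 ? 0 : 1 };
}

// Illustrative checks only; the real assertions are the ones in the table above.
const report = runChecks([
  { name: "Deterministic retrieval", run: () => [3, 1, 2].join() === [3, 1, 2].join() },
  { name: "Hot capacity", run: () => 8 <= 10 },
]);

console.log(JSON.stringify(report, null, 2));
process.exitCode = report.exitCode; // non-zero if any check failed
```

Setting process.exitCode instead of calling process.exit() lets pending writes, such as the report files, finish before the process ends.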