Outcome-as-a-Service · LIBRARY / MODULE · CERTIFIED

Agent Memory Manager

Hierarchical (hot/warm/cold) memory for long-running AI agents and fleets: Postgres+pgvector retrieval, LLM summarization hooks, time-decay / relevance / eviction policies, fleet sync, logs + metrics.

Verification 100% (30 / 30) · Evidence Grade A · Trust 93/100 · CERTIFIED

What it does

A drop-in memory layer for long-running agents. Hot/warm/cold tiers keep the working set bounded; vector search finds the relevant past; summaries compress old threads.

Hierarchical tiers

Hot/warm/cold with per-tier decay half-life, capacity, and importance floor; automatic promotion/demotion/eviction.
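A minimal sketch of how a per-tier half-life and importance floor could drive demotion and eviction, assuming plain exponential decay. The `TierConfig` shape, the numeric values, and the function names are hypothetical and are not taken from the library's API; capacity enforcement and promotion are left out for brevity.

```typescript
// Hypothetical tier configuration; field names and values are illustrative only.
type TierName = "hot" | "warm" | "cold";

interface TierConfig {
  halfLifeMs: number;      // time for a memory's score to halve in this tier
  capacity: number;        // maximum number of memories held (not enforced in this sketch)
  importanceFloor: number; // decayed score below which a memory leaves the tier
}

const HOUR_MS = 60 * 60 * 1000;

const TIERS: Record<TierName, TierConfig> = {
  hot: { halfLifeMs: HOUR_MS, capacity: 100, importanceFloor: 0.5 },
  warm: { halfLifeMs: 24 * HOUR_MS, capacity: 1000, importanceFloor: 0.2 },
  cold: { halfLifeMs: 30 * 24 * HOUR_MS, capacity: 10000, importanceFloor: 0.05 },
};

// Exponential decay: the score halves every `halfLifeMs` of age.
function decayedScore(importance: number, ageMs: number, halfLifeMs: number): number {
  return importance * Math.pow(0.5, ageMs / halfLifeMs);
}

// Decide where a memory goes next: stay put, move one tier colder, or leave cold entirely.
function nextTier(tier: TierName, importance: number, ageMs: number): TierName | "evicted" {
  const cfg = TIERS[tier];
  if (decayedScore(importance, ageMs, cfg.halfLifeMs) >= cfg.importanceFloor) return tier;
  if (tier === "hot") return "warm";
  if (tier === "warm") return "cold";
  return "evicted";
}
```

With these assumed values, a hot memory of importance 1.0 that has not been touched for two hours decays to 0.25, falls below the hot floor of 0.5, and is demoted to warm.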

Postgres + pgvector

Vector retrieval over real pgvector: embedded PGlite in tests, or an external Postgres server via node-postgres (pg) using the same SQL.
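To illustrate what "the same SQL" can look like, here is a hedged sketch of a parameterized cosine-similarity query. The `memories` table, its columns, and the function names are assumptions for illustration; only the `<=>` cosine-distance operator and the bracketed vector literal come from pgvector itself.

```typescript
// Hypothetical table and column names; only the pgvector operator is standard.
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// pgvector accepts a vector as a text literal such as '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  return "[" + embedding.join(",") + "]";
}

// `<=>` is pgvector's cosine-distance operator, so similarity = 1 - distance.
function buildSearchQuery(agentId: string, embedding: number[], k: number): SqlQuery {
  return {
    text:
      "SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity " +
      "FROM memories WHERE agent_id = $2 " +
      "ORDER BY embedding <=> $1::vector LIMIT $3",
    values: [toVectorLiteral(embedding), agentId, k],
  };
}
```

Because the query is plain text plus positional parameters, the same object can be passed to either client's `query(text, values)` call, which is what lets one code path serve both PGlite and an external server.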

Policies

Time decay, relevance (cosine + recency/importance/frequency/tags), eviction (LRU + importance).
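One way to read these policies is as a weighted blend for relevance plus an importance-then-recency ordering for eviction. The sketch below assumes that reading; the `Memory` fields, the weights, and both function names are hypothetical, not the library's actual scoring formula.

```typescript
// Hypothetical memory record and weights; the library's real fields may differ.
interface Memory {
  id: string;
  embedding: number[];
  importance: number; // 0..1
  accessCount: number;
  lastAccessMs: number; // epoch milliseconds
  tags: string[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    na += x * x;
    nb += y * y;
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Assumed weights (they sum to 1); tune them for the workload.
const W = { cosine: 0.6, recency: 0.15, importance: 0.15, frequency: 0.05, tags: 0.05 };

function relevance(m: Memory, query: number[], queryTags: string[],
                   nowMs: number, halfLifeMs: number): number {
  const recency = Math.pow(0.5, Math.max(0, nowMs - m.lastAccessMs) / halfLifeMs);
  const frequency = 1 - 1 / (1 + m.accessCount); // 0 when never accessed, saturates toward 1
  const shared = queryTags.filter((t) => m.tags.indexOf(t) >= 0).length;
  const tagOverlap = queryTags.length === 0 ? 0 : shared / queryTags.length;
  return W.cosine * cosine(m.embedding, query) + W.recency * recency +
    W.importance * m.importance + W.frequency * frequency + W.tags * tagOverlap;
}

// Eviction candidate: lowest importance first, ties broken by least recent access.
function pickEviction(memories: Memory[]): Memory | undefined {
  return memories.slice().sort((a, b) =>
    a.importance - b.importance || a.lastAccessMs - b.lastAccessMs)[0];
}
```

Keeping cosine as the dominant weight means recency, importance, frequency, and tags act as tie-breakers among similar memories and do not override similarity.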

Summarization

Local extractive default; Claude/LLM and hosted-embedding hooks behind the same interfaces.
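As a rough sketch of a frequency-based extractive summarizer behind a shared interface: the `Summarizer` interface shape, the class name, and the mean-word-frequency scoring rule are assumptions for illustration, not the library's actual implementation. The interface is async so that a hosted LLM summarizer could implement it as well.

```typescript
// Hypothetical interface; async so a hosted LLM summarizer could implement it too.
interface Summarizer {
  summarize(text: string, maxSentences: number): Promise<string>;
}

function tokenize(s: string): string[] {
  return s.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Pick the sentences whose words are most frequent across the whole text.
function extractiveSummary(text: string, maxSentences: number): string {
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  // Count how often each word occurs in the full text.
  const freq = new Map<string, number>();
  for (const w of tokenize(text)) freq.set(w, (freq.get(w) ?? 0) + 1);

  // Score each sentence by the mean frequency of its words.
  const scored = sentences.map((sentence, index) => {
    const words = tokenize(sentence);
    let total = 0;
    for (const w of words) total += freq.get(w) ?? 0;
    return { sentence, index, score: words.length === 0 ? 0 : total / words.length };
  });

  // Keep the top-scoring sentences, then restore their original order.
  return scored
    .slice()
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index)
    .map((s) => s.sentence)
    .join(" ");
}

class LocalExtractiveSummarizer implements Summarizer {
  async summarize(text: string, maxSentences: number): Promise<string> {
    return extractiveSummary(text, maxSentences);
  }
}
```

Because this local path involves no model and no network, the same input always yields the same summary, which suits a seed-deterministic verification run.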

Fleet sync

Optional event bus replicates fleet-scoped memories across agents; pluggable broker interface.
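The replication idea can be sketched with a tiny in-process bus. The source names a pluggable broker interface but does not show its methods, so the `SyncBus` shape, `MemoryEvent`, and `FleetAgent` below are hypothetical; a distributed broker adapter would implement the same two methods.

```typescript
// Hypothetical shapes: the real broker interface may expose different methods.
type Scope = "agent" | "fleet";

interface MemoryEvent {
  originAgent: string;
  memoryId: string;
  content: string;
}

interface SyncBus {
  publish(event: MemoryEvent): void;
  subscribe(handler: (event: MemoryEvent) => void): void;
}

// In-process bus that delivers each event synchronously to every subscriber.
class InProcessBus implements SyncBus {
  private handlers: Array<(event: MemoryEvent) => void> = [];

  publish(event: MemoryEvent): void {
    for (const h of this.handlers) h(event);
  }

  subscribe(handler: (event: MemoryEvent) => void): void {
    this.handlers.push(handler);
  }
}

class FleetAgent {
  readonly id: string;
  readonly local = new Map<string, string>();      // memories this agent wrote
  readonly replicated = new Map<string, string>(); // memories received from peers
  private bus: SyncBus;

  constructor(id: string, bus: SyncBus) {
    this.id = id;
    this.bus = bus;
    bus.subscribe((e) => {
      // Skip our own events so a write is not echoed back to its author.
      if (e.originAgent !== this.id) this.replicated.set(e.memoryId, e.content);
    });
  }

  remember(memoryId: string, content: string, scope: Scope): void {
    this.local.set(memoryId, content);
    // Only fleet-scoped memories leave the agent.
    if (scope === "fleet") {
      this.bus.publish({ originAgent: this.id, memoryId, content });
    }
  }
}
```

In this sketch agent-scoped memories never reach the bus, so privacy between agents depends on the writer's scope choice, not on subscriber-side filtering.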

Observable

Structured logs + counters/gauges/latency histograms; live per-tier sizes via stats().
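For readers unfamiliar with the three metric kinds, a minimal in-memory registry might look like the following. The class, method names, metric names, and bucket bounds are all hypothetical and do not describe the library's actual metrics API or the shape returned by stats().

```typescript
// Hypothetical metrics registry; names and bucket bounds are illustrative.
class Metrics {
  private counters = new Map<string, number>();
  private gauges = new Map<string, number>();
  private histograms = new Map<string, number[]>();
  private bucketBoundsMs = [1, 5, 25, 100, 500];

  // Counter: a value that only goes up (e.g. retrieval hits).
  increment(name: string, by = 1): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + by);
  }

  // Gauge: a point-in-time value (e.g. current hot-tier size).
  setGauge(name: string, value: number): void {
    this.gauges.set(name, value);
  }

  // Histogram: count the latency in the first bucket whose upper bound covers it;
  // the final slot is the overflow bucket for anything above the last bound.
  observeLatency(name: string, ms: number): void {
    let buckets = this.histograms.get(name);
    if (!buckets) {
      buckets = new Array<number>(this.bucketBoundsMs.length + 1).fill(0);
      this.histograms.set(name, buckets);
    }
    let i = 0;
    while (i < this.bucketBoundsMs.length && ms > (this.bucketBoundsMs[i] ?? Infinity)) i++;
    buckets[i] = (buckets[i] ?? 0) + 1;
  }

  counter(name: string): number {
    return this.counters.get(name) ?? 0;
  }

  gauge(name: string): number | undefined {
    return this.gauges.get(name);
  }

  histogram(name: string): number[] {
    return (this.histograms.get(name) ?? []).slice();
  }
}
```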

Verified results

Produced by running node verify.mjs over a seeded synthetic corpus against both the in-memory store and a real Postgres+pgvector engine (PGlite). The corpus is synthetic; see limitations.
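A seed-deterministic corpus with a known answer key can be sketched as follows. This is an illustration only: the choice of the mulberry32 generator, the per-topic vocabulary scheme, and the even split of memories across topics are assumptions, and verify.mjs may build its corpus differently.

```typescript
// mulberry32: a small, well-known 32-bit seeded PRNG (an assumed choice here).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

interface SyntheticMemory {
  id: number;
  topic: number; // the answer key: which topic this memory belongs to
  text: string;
}

function makeCorpus(seed: number, topics: number, perTopic: number): SyntheticMemory[] {
  const rand = mulberry32(seed);
  const corpus: SyntheticMemory[] = [];
  for (let topic = 0; topic < topics; topic++) {
    for (let j = 0; j < perTopic; j++) {
      // Each memory draws 6 words from its topic's private 20-word vocabulary.
      const words: string[] = [];
      for (let w = 0; w < 6; w++) {
        words.push("t" + topic + "w" + Math.floor(rand() * 20));
      }
      corpus.push({ id: corpus.length, topic, text: words.join(" ") });
    }
  }
  return corpus;
}
```

Calling `makeCorpus(42, 12, 12)` yields 144 memories over 12 topics, and the same seed always reproduces the same corpus, which is what makes a run reproducible without a network.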

Checks passed: 30/30
In-memory precision@1: 100.0%
pgvector precision@1: 100.0%
pgvector precision@5: 100.0%
In-mem vs pgvector top-1 agreement: 100.0%
pgvector vs JS cosine error: 2e-8
Synthetic memories: 144
Topics (answer key): 12
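For reference, precision@k and top-1 agreement against a topic answer key can be computed as in this sketch. It uses the standard definitions; the type and function names are hypothetical, and the exact way verify.mjs aggregates its figures is not shown in this page.

```typescript
interface RankedResult {
  id: string;
  topic: number; // topic from the answer key
}

// Fraction of the first k results whose topic matches the query's topic.
// Missing results (fewer than k returned) count as misses.
function precisionAtK(results: RankedResult[], queryTopic: number, k: number): number {
  if (k <= 0) return 0;
  const hits = results.slice(0, k).filter((r) => r.topic === queryTopic).length;
  return hits / k;
}

// Mean precision@k over a set of queries.
function meanPrecisionAtK(
  runs: { queryTopic: number; results: RankedResult[] }[],
  k: number,
): number {
  if (runs.length === 0) return 0;
  let sum = 0;
  for (const run of runs) sum += precisionAtK(run.results, run.queryTopic, k);
  return sum / runs.length;
}

// Share of queries for which two backends return the same top result id.
function top1Agreement(a: RankedResult[][], b: RankedResult[][]): number {
  const n = Math.min(a.length, b.length);
  if (n === 0) return 0;
  let same = 0;
  for (let i = 0; i < n; i++) {
    const x = (a[i] ?? [])[0];
    const y = (b[i] ?? [])[0];
    if (x !== undefined && y !== undefined && x.id === y.id) same++;
  }
  return same / n;
}
```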

Proof posture

Assigned by the Proof Layer; reproduced here from proof/PROOF_DECISION.json.

Field                 Value
Certification state   CERTIFIED
Evidence Grade        A
Trust Score           93 / 100
Verification          PASS (30 / 30 checks)
Reproduction          Reproducible (seed-deterministic, no network)
Disclosed seams       6

Disclosed seams

  • SYNTHETIC INPUT: the retrieval benchmark corpus is generated by a seeded PRNG with a known topic answer key (verify.mjs). Reported precision is against that synthetic key, NOT real production text; absolute accuracy on real agent traffic will differ. This is the blocking gap for an official benchmark / PRODUCTION_VALIDATED.
  • DISCLOSED_SEAM: the default embeddings are a local, deterministic hashing model (lexical, not learned-semantic). Cosine similarity tracks token overlap. A hosted semantic embedding model can be plugged in via RemoteEmbeddingProvider but is NOT exercised here.
  • DISCLOSED_SEAM: the default summarizer is a local extractive (frequency-based) summarizer. ClaudeSummarizer (hosted LLM) is provided behind the same interface but requires a network + ANTHROPIC_API_KEY and is NOT exercised here.
  • LIVE INFRASTRUCTURE (in-process): Postgres + pgvector run in-process via PGlite (WASM). The identical SQL/pgvector code path runs against an external Postgres server through node-postgres (pg); that wire-compatible path is a disclosed seam and is not exercised in this run.
  • DISCLOSED_SEAM: fleet sync is exercised over an in-process EventEmitter bus. A distributed broker (Redis/NATS/Kafka) implementing the same SyncBus interface is a disclosed seam, not exercised here.
  • DISCLOSED_SEAM: the cross-encoder reranker (LocalCrossEncoderReranker) requires the optional @xenova/transformers dependency + a one-time model download. verify.mjs exercises the reranker INTERFACE with a deterministic fake; the real MiniLM cross-encoder is measured separately in the official BEIR/SciFact benchmark (bench/beir-scifact.mjs), whose results are recorded in officialBenchmark.
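To make the second seam concrete, here is a sketch of a deterministic hashing embedding of the lexical kind described above. It is an illustration under assumptions: the FNV-1a hash, the tokenizer, and the function names are choices made for this example, not the library's actual embedding model.

```typescript
// FNV-1a 32-bit hash over the string's character codes (non-cryptographic).
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Hashing trick: each token increments one bucket; the vector is then L2-normalized.
function hashEmbed(text: string, dims: number): number[] {
  const v = new Array<number>(dims).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const t of tokens) {
    const i = fnv1a(t) % dims;
    v[i] = (v[i] ?? 0) + 1;
  }
  let norm = 0;
  for (const x of v) norm += x * x;
  norm = Math.sqrt(norm);
  return norm === 0 ? v : v.map((x) => x / norm);
}

// Dot product; equals cosine similarity for the unit-length vectors above.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) sum += (a[i] ?? 0) * (b[i] ?? 0);
  return sum;
}
```

Because buckets are keyed by token hashes, two texts score as similar only when they share tokens (or collide in a bucket). Word order and synonyms have no effect, which is why cosine similarity under such a model tracks token overlap and not meaning.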

Proof artifacts & documents

Every claim traces to evidence. The Proof Layer (IRS_AUDITOR) has final authority over the certification state.