Cognitive Memory

Memory benchmarks (full N=500, gpt-4o reader): 85.6% on LongMemEval-S at $0.0090 per correct, +1.4 points above Mastra Observational Memory (84.23%). 70.2% on LongMemEval-M on the 1.5M-token / 500-session haystack variant — the only open-source library on the public record above 65% on M with publicly reproducible methodology. Competitive with the strongest published M results in the LongMemEval paper (Wu et al., ICLR 2025: round Top-5 65.7%, session Top-5 71.4%, round Top-10 72.0%). Benchmarks · Run JSONs · SOTA writeup

Why memory should forget

"It's a poor sort of memory that only works backwards."

— Through the Looking-Glass, Lewis Carroll, 1871

Most agent memory systems forget nothing. They embed every message, store the vectors, and at retrieval time return whatever is closest in cosine space. This works for a few thousand turns. Past that, the system is doing something a human mind explicitly evolved not to do: treating every recorded experience as equally available, equally trustworthy, and equally relevant. The literature on biological memory is a long argument that forgetting is not a bug. It is the mechanism by which what mattered yesterday continues to matter today.

The cognitive memory system in AgentOS is built on that argument. Encoding strength is set per trace, modulated by the personality traits of the agent doing the encoding and by the emotional intensity of the moment (Brown & Kulik, 1977 on flashbulb memories; Yerkes & Dodson, 1908 on the inverted-U arousal curve). Strength then decays exponentially with time along Hermann Ebbinghaus's 1885 forgetting curve, S(t) = S₀ · e^(-Δt / stability), accelerated by interference from new similar memories and slowed by successful retrieval (the desirable-difficulty effect: harder retrievals grow stability more). Working memory is a slot-based buffer in the tradition of Baddeley's model, with capacity bounded by Miller's seven plus or minus two and modulated by traits. Retrieval combines six signals: vector similarity, current strength, recency, emotional congruence with the agent's mood, graph spreading activation in the ACT-R tradition (Anderson, 1983), and importance. The graph itself learns: co-retrieval of two traces tightens the edge between them via Hebbian weight updates ("neurons that fire together wire together").

The result is a memory that behaves more like a person remembering. The agent forgets the irrelevant. It holds onto what hit it hard. It pulls the thing that's adjacent in concept-space, not just the thing that's adjacent in vector-space. And — because every mechanism is HEXACO-modulated — the same input encodes differently depending on who is doing the remembering.

Eight cognitive mechanisms layered on top

On top of the encoding/decay/retrieval substrate, the runtime ships eight optional neuroscience-grounded mechanisms: reconsolidation, retrieval-induced forgetting, involuntary recall, metacognitive feeling-of-knowing, temporal gist, schema encoding, source-confidence decay, and emotion regulation. All eight are HEXACO-modulated and individually configurable via cognitiveMechanisms on CognitiveMemoryConfig. See the Cognitive Mechanisms Implementation Guide for hook points, APIs, and testing.

Overview

The Cognitive Memory System models memory as a dynamic, personality-modulated process rather than a flat key-value store:

Encoding is shaped by the agent's HEXACO personality traits and current emotional state (PAD model: valence, arousal, dominance)
Forgetting follows the Ebbinghaus exponential decay curve, with retrieval-induced reinforcement via spaced repetition
Retrieval combines six weighted signals (strength, embedding similarity, recency, emotional congruence, graph activation, importance) into a composite score
Working memory enforces slot-based capacity limits in the tradition of Baddeley's model (Miller's 7±2), modulated by traits
Consolidation runs periodically to prune weak traces, merge clusters into schemas, resolve contradictions, and feed observations back into long-term storage

The system is composable. Core encoding/decay/retrieval (Batch 1) runs without any LLM calls. Advanced features (Batch 2 — observer, reflector, graph, consolidation) activate automatically when their config is provided and degrade gracefully when absent. You can run the entire stack against a local SQLite + HNSW backend, or scale it to Postgres + Neo4j without changing any callsite.

Cognitive science foundations

Each model below has a one-to-one analogue in the source. The table is not meant to claim that the runtime "uses" these papers in a loose sense; the constants, formulas, and weights in the sections below come straight from this literature.

Model	Reference	Application in AgentOS
Multi-store memory	Atkinson & Shiffrin, 1968	Sensory input → working memory → long-term memory pipeline
Working memory model	Baddeley & Hitch, 1974; Baddeley 2003	Slot-based capacity limits (7±2) with activation levels
LTM taxonomy	Tulving, 1972	Episodic / semantic / procedural / prospective memory types
Forgetting curve	Ebbinghaus, 1885	`S(t) = S₀ · e^(-Δt / stability)` exponential decay
Arousal curve	Yerkes & Dodson, 1908	Encoding quality peaks at moderate arousal (inverted-U)
Flashbulb memories	Brown & Kulik, 1977	High-emotion events create vivid, persistent traces
Mood-congruent encoding	Bower, 1981	Content matching current mood valence encodes more strongly
Spreading activation	Anderson, 1983 (ACT-R)	BFS through associative graph with activation decay
Hebbian learning	Hebb, 1949	Co-retrieval strengthens graph edges
HEXACO personality	Ashton & Lee, 2007	Trait-driven encoding weights and memory capacity modulation
Source-monitoring framework	Johnson, Hashtroudi & Lindsay, 1993	Different memory sources decay at different rates (provenance-aware)
HyDE retrieval	Gao et al., 2022	Generate hypothetical answer, embed that, search for matches
GraphRAG	Microsoft Research, 2024	Entity-graph + community summaries for multi-hop retrieval
Generative agents	Park et al., 2023	Persona + memory + reflection as the long-running agent pattern
CoALA framework	Sumers et al., 2023	Cognitive architectures for language agents — episodic / semantic / procedural memory typology

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    CognitiveMemoryManager                          │
│                      (top-level orchestrator)                       │
└──┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬───────────────┘
   │      │      │      │      │      │      │      │
   ▼      ▼      ▼      ▼      ▼      ▼      ▼      ▼
┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────────┐
│Encod-││Decay ││Work- ││Memory││Prompt││Memory││Obser-││Consolida-│
│ ing  ││Model ││ ing  ││Store ││Assem-││Graph ││ver / ││tion Pipe-│
│Model ││      ││Memory││      ││bler  ││      ││Reflec││line      │
│      ││      ││      ││      ││      ││      ││tor   ││          │
└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───────┘
   │       │       │       │       │       │       │       │
   ▼       ▼       ▼       ▼       ▼       ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐┌──────────────┐┌──────┐┌──────┐┌──────────┐
│HEXACO││Ebbng-││Badde-││  IVectorStore││Spread││LLM   ││Prospec-  │
│Traits││haus  ││ley   ││  + IKnowledge││ing   ││Invok-││tive Mem- │
│+ PAD ││Curve ││Slots ││    Graph     ││Activ.││er    ││ory Mgr   │
└──────┘└──────┘└──────┘└──────────────┘└──────┘└──────┘└──────────┘

Per-turn data flow (GMI integration):

User Message arrives
encode()          — Create MemoryTrace from input (personality-modulated strength)
retrieve()        — Query vector store + score with 6-signal composite
assembleForPrompt — Token-budgeted context assembly → inject into system prompt
[LLM generates response]
observe()         — Feed response to observer buffer (Batch 2)
checkProspective  — Check time/event/context triggers (Batch 2)
runConsolidation   — Periodic background sweep (Batch 2, timer-based)

Memory Types

Based on Tulving's long-term memory taxonomy with extensions:

Type	Cognitive Model	AgentOS Usage	Example
`episodic`	Autobiographical events	Conversation events, interactions	"User asked about deployment on Tuesday"
`semantic`	General knowledge/facts	Learned facts, preferences, schemas	"User prefers TypeScript over Python"
`procedural`	Skills and how-to	Workflows, tool usage patterns	"To deploy, run the deployment pipeline"
`prospective`	Future intentions	Goals, reminders, planned actions	"Remind user about the PR review"

Memory Scopes

Each trace is scoped to control visibility and ownership:

Scope	Visibility	Persistence	Use Case
`thread`	Single conversation	Conversation lifetime	In-conversation working context
`user`	All conversations with a user	Long-term	User preferences, facts, history
`persona`	All users of a persona	Long-term	Persona's learned knowledge
`organization`	All agents in an org	Long-term	Shared organizational knowledge

Collections in the vector store are named {prefix}_{scope}_{scopeId} (default prefix: cogmem).

The MemoryTrace Envelope

Every memory is wrapped in a MemoryTrace — the universal envelope carrying content, provenance, emotional context, and decay parameters:

Field Group	Key Fields	Purpose
Identity	`id`, `type`, `scope`, `scopeId`	Classification and routing
Content	`content`, `structuredData`, `entities`, `tags`	The actual memory data
Provenance	`sourceType`, `sourceId`, `confidence`, `verificationCount`, `contradictedBy`	Source monitoring to prevent confabulation
Emotional Context	`valence`, `arousal`, `dominance`, `intensity`, `gmiMood`	PAD snapshot at encoding time
Decay Parameters	`encodingStrength` (S0), `stability` (tau), `retrievalCount`, `lastAccessedAt`	Ebbinghaus curve inputs
Spaced Repetition	`reinforcementInterval`, `nextReinforcementAt`	Interval doubling schedule
Graph	`associatedTraceIds`	Links to related traces
Lifecycle	`createdAt`, `updatedAt`, `consolidatedAt`, `isActive`	Timestamps and soft-delete flag

Source types: user_statement, agent_inference, tool_result, observation, reflection, external.

Encoding Model

Source: src/memory/core/encoding/EncodingModel.ts

Encoding determines how strongly a new input is committed to memory. The system combines four cognitive mechanisms:

1. HEXACO Personality -> Encoding Weights

Each HEXACO trait modulates attention to specific content features:

Trait	Attention Weight	Formula	Effect
Openness	`noveltyAttention`	`0.3 + O * 0.7`	High O notices novel, creative content
Conscientiousness	`proceduralAttention`	`0.3 + C * 0.7`	High C notices procedures, structure
Emotionality	`emotionalSensitivity`	`0.2 + E * 0.8`	High E amplifies emotional content
Extraversion	`socialAttention`	`0.2 + X * 0.8`	High X notices social dynamics
Agreeableness	`cooperativeAttention`	`0.2 + A * 0.8`	High A notices cooperation cues
Honesty	`ethicalAttention`	`0.2 + H * 0.8`	High H notices ethical/moral content

The composite attention multiplier starts at 0.5 and adds weighted bonuses for each detected content feature (0.10-0.15 each), plus a base 0.15 for contradictions and topic relevance.
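The trait-to-weight mapping in the table can be sketched as a pure function. This is a minimal illustration, assuming trait values in [0, 1]; `attentionWeights` and the local interfaces are hypothetical names, not the library's exports.

```typescript
// Illustrative only: local stand-in types, trait values assumed to lie in [0, 1].
interface HexacoTraits {
  honesty: number;
  emotionality: number;
  extraversion: number;
  agreeableness: number;
  conscientiousness: number;
  openness: number;
}

interface AttentionWeights {
  noveltyAttention: number;
  proceduralAttention: number;
  emotionalSensitivity: number;
  socialAttention: number;
  cooperativeAttention: number;
  ethicalAttention: number;
}

// Each weight is a floor plus a trait-scaled term, exactly as in the table.
function attentionWeights(t: HexacoTraits): AttentionWeights {
  return {
    noveltyAttention: 0.3 + t.openness * 0.7,
    proceduralAttention: 0.3 + t.conscientiousness * 0.7,
    emotionalSensitivity: 0.2 + t.emotionality * 0.8,
    socialAttention: 0.2 + t.extraversion * 0.8,
    cooperativeAttention: 0.2 + t.agreeableness * 0.8,
    ethicalAttention: 0.2 + t.honesty * 0.8,
  };
}

const curious = attentionWeights({
  honesty: 0.5, emotionality: 0.5, extraversion: 0.5,
  agreeableness: 0.5, conscientiousness: 0.2, openness: 0.9,
});
console.log(curious.noveltyAttention > curious.proceduralAttention); // true
```

Because every weight has a non-zero floor (0.2 or 0.3), even an agent scoring 0 on a trait still pays some attention to the matching content feature.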

2. Yerkes-Dodson Arousal Curve

Encoding quality peaks at moderate arousal (inverted U):

f(a) = 1 - 4 * (a - 0.5)^2

where a = arousal normalised to [0, 1]

The result is clamped to [0.3, 1.0] and peaks at a = 0.5; the unclamped quadratic would fall to 0 at a = 0 and a = 1. Very low arousal (bored) and very high arousal (panicked) both impair encoding.
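A minimal sketch of the arousal curve follows. The quadratic is taken from the formula above; the 0.3 floor is inferred from the stated output range, and `arousalMultiplier` is a hypothetical name rather than the library's export.

```typescript
// Inverted-U arousal curve: f(a) = 1 - 4 * (a - 0.5)^2, floored at 0.3.
// Assumption: the floor is applied as a simple clamp; arousal is clamped to [0, 1] first.
function arousalMultiplier(arousal: number, floor: number = 0.3): number {
  const a = Math.min(1, Math.max(0, arousal));
  return Math.max(floor, 1 - 4 * (a - 0.5) ** 2);
}

console.log(arousalMultiplier(0.5));  // 1    (peak at moderate arousal)
console.log(arousalMultiplier(0.25)); // 0.75
console.log(arousalMultiplier(0));    // 0.3  (floor)
```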

3. Mood-Congruent Encoding

Content whose emotional valence matches the current mood is encoded more strongly:

boost = 1 + max(0, currentValence * contentValence) * emotionalSensitivity * 0.3

Positive product means mood and content are congruent (both positive or both negative).
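The boost formula translates directly into code. In this sketch, `moodCongruenceBoost` is a hypothetical name, and valences are assumed to lie in [-1, 1].

```typescript
// boost = 1 + max(0, currentValence * contentValence) * emotionalSensitivity * 0.3
// Incongruent pairs (opposite signs) produce a non-positive product, so the boost stays at 1.
function moodCongruenceBoost(
  currentValence: number,
  contentValence: number,
  emotionalSensitivity: number,
): number {
  const congruence = Math.max(0, currentValence * contentValence);
  return 1 + congruence * emotionalSensitivity * 0.3;
}

console.log(moodCongruenceBoost(0.8, -0.6, 1.0));        // 1 (sad content, happy mood: no boost)
console.log(moodCongruenceBoost(-0.5, -0.5, 1.0) > 1);   // true (both negative: congruent)
```

With valences in [-1, 1] and sensitivity at most 1.0, the boost never exceeds 1.3, so mood congruence nudges encoding strength rather than dominating it.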

4. Flashbulb Memories

When emotional intensity exceeds the threshold (default: 0.8), the memory becomes a flashbulb memory:

Strength multiplier: 2.0x (default)
Stability multiplier: 5.0x (default)

These model the vivid, persistent memories formed during highly emotional events (Brown & Kulik, 1977).

Composite Encoding Strength

S₀ = min(1.0, base * arousalBoost * emotionalBoost * attentionMultiplier * congruenceBoost * flashbulbBoost)

Default base = 0.5. The stability (time constant for decay) is computed as:

stability = baseStabilityMs * (1 + S₀ * 6) * flashbulbStabilityMultiplier

Default baseStabilityMs = 3,600,000 (1 hour). Stronger memories are inherently more stable.
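A worked example makes the two formulas concrete. The sketch below uses the defaults listed in this section; `computeEncoding` and `EncodingFactors` are hypothetical names, and it assumes both flashbulb multipliers are 1 when the intensity threshold is not exceeded.

```typescript
const ENCODING_DEFAULTS = {
  baseStrength: 0.5,
  flashbulbThreshold: 0.8,
  flashbulbStrengthMultiplier: 2.0,
  flashbulbStabilityMultiplier: 5.0,
  baseStabilityMs: 3_600_000, // 1 hour
};

// Hypothetical input shape: each boost is assumed to be computed upstream.
interface EncodingFactors {
  arousalBoost: number;
  emotionalBoost: number;
  attentionMultiplier: number;
  congruenceBoost: number;
  emotionalIntensity: number; // 0-1
}

function computeEncoding(f: EncodingFactors, cfg = ENCODING_DEFAULTS) {
  const isFlashbulb = f.emotionalIntensity > cfg.flashbulbThreshold;
  // Assumption: multipliers fall back to 1 for non-flashbulb memories.
  const flashbulbBoost = isFlashbulb ? cfg.flashbulbStrengthMultiplier : 1;
  const stabilityMultiplier = isFlashbulb ? cfg.flashbulbStabilityMultiplier : 1;

  const strength = Math.min(
    1.0,
    cfg.baseStrength * f.arousalBoost * f.emotionalBoost *
      f.attentionMultiplier * f.congruenceBoost * flashbulbBoost,
  );
  const stabilityMs = cfg.baseStabilityMs * (1 + strength * 6) * stabilityMultiplier;
  return { strength, stabilityMs, isFlashbulb };
}

const neutralFactors = { arousalBoost: 1, emotionalBoost: 1, attentionMultiplier: 1, congruenceBoost: 1 };
const ordinary = computeEncoding({ ...neutralFactors, emotionalIntensity: 0.2 });
const flashbulb = computeEncoding({ ...neutralFactors, emotionalIntensity: 0.9 });

console.log(ordinary.strength, ordinary.stabilityMs / 3_600_000);   // 0.5 4   (4 hours)
console.log(flashbulb.strength, flashbulb.stabilityMs / 3_600_000); // 1 35    (35 hours)
```

With all other factors neutral, a flashbulb event doubles the strength (capped at 1.0) and stretches the decay time constant from 4 hours to 35 hours.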

Content Feature Detection

The encoding model needs to know what features the content contains. Three detection strategies are available:

Strategy	Speed	Quality	LLM Calls	Best For
`keyword`	Fast	Moderate	0	Default; low-latency agents
`llm`	Slow	High	1 per encode	High-fidelity agents with budget
`hybrid`	Medium	High	Periodic	Best balance; keyword first, LLM re-classification during consolidation

Detected features (ContentFeatures): hasNovelty, hasProcedure, hasEmotion, hasSocialContent, hasCooperation, hasEthicalContent, hasContradiction, topicRelevance.

Configure via featureDetectionStrategy in CognitiveMemoryConfig.

Forgetting & Decay

Source: src/memory/core/decay/DecayModel.ts

Ebbinghaus Forgetting Curve

Memory strength decays exponentially over time:

S(t) = S₀ * e^(-dt / stability)

where:
  S₀       = initial encoding strength
  dt       = time elapsed since last access (ms)
  stability = time constant (ms); grows with each retrieval
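The curve and its inverse can be sketched in a few lines. This ignores interference and retrieval updates, and `strengthAt` and `msUntilStrength` are hypothetical helper names.

```typescript
// S(t) = S0 * e^(-dt / stability)
function strengthAt(s0: number, elapsedMs: number, stabilityMs: number): number {
  return s0 * Math.exp(-elapsedMs / stabilityMs);
}

// Solving S(t) = threshold for t gives t = stability * ln(S0 / threshold).
function msUntilStrength(s0: number, threshold: number, stabilityMs: number): number {
  if (s0 <= threshold) return 0;
  return stabilityMs * Math.log(s0 / threshold);
}

const HOUR_MS = 3_600_000;
const exampleS0 = 0.5;
const exampleStability = 4 * HOUR_MS;

// After one time constant the trace is at S0 / e.
console.log(strengthAt(exampleS0, 4 * HOUR_MS, exampleStability).toFixed(3)); // "0.184"
// Time for an untouched trace to reach a strength of 0.05: 4 h * ln(10).
console.log((msUntilStrength(exampleS0, 0.05, exampleStability) / HOUR_MS).toFixed(1)); // "9.2"
```

Note that `stability` is a time constant, not a half-life: the strength halves after `stability * ln(2)`, roughly 0.69 of the time constant.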

Spaced Repetition

Each successful retrieval updates the trace via the desirable difficulty effect:

Difficulty bonus: max(0.1, 1 - currentStrength) — weaker memories get larger stability boosts
Diminishing returns: 1 / (1 + 0.1 * retrievalCount), a hyperbolic falloff in which each additional retrieval contributes less
Emotional bonus: 1 + intensity * 0.3 — emotional memories consolidate faster
Growth factor: (1.5 + difficultyBonus * 2.0) * diminish * emotionalBonus
Interval doubling: reinforcementInterval *= 2 after each retrieval
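The growth factor above can be computed as follows. This is a sketch of the listed terms only; how the factor is applied to `stability` is not spelled out here, so the example stops at the factor itself. `stabilityGrowthFactor` is a hypothetical name.

```typescript
function stabilityGrowthFactor(
  currentStrength: number,   // decayed strength at the moment of retrieval, 0-1
  retrievalCount: number,    // retrievals before this one
  emotionalIntensity: number // 0-1
): number {
  const difficultyBonus = Math.max(0.1, 1 - currentStrength); // weaker trace, bigger bonus
  const diminish = 1 / (1 + 0.1 * retrievalCount);            // less gain per repeat
  const emotionalBonus = 1 + emotionalIntensity * 0.3;
  return (1.5 + difficultyBonus * 2.0) * diminish * emotionalBonus;
}

// A nearly forgotten trace gains more from its first retrieval than a fresh one.
const hardRecall = stabilityGrowthFactor(0.2, 0, 0);  // (1.5 + 0.8 * 2) = 3.1
const easyRecall = stabilityGrowthFactor(0.9, 0, 0);  // (1.5 + 0.1 * 2) = 1.7
console.log(hardRecall > easyRecall); // true

// The tenth repeat of the same hard recall is worth half as much.
console.log(stabilityGrowthFactor(0.2, 10, 0) < hardRecall); // true
```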

Interference

When a new trace overlaps with existing traces (cosine similarity > threshold, default 0.7):

Retroactive interference: New trace weakens old similar traces (strength reduction ~0.15 at similarity 1.0)
Proactive interference: Old traces impair new encoding (capped at 0.3 total reduction)

Pruning

Traces with currentStrength < pruningThreshold (default: 0.05) are soft-deleted during consolidation, unless their emotional intensity exceeds 0.3 (emotional memories are protected from pruning).

Lifecycle note: these retention/decay sweeps are now operational on the built-in vector stores that implement scanByMetadata(). Adapters without metadata-scan support still need provider-specific work before they can participate fully in lifecycle enforcement.

Retrieval Priority Scoring

Source: src/memory/core/decay/RetrievalPriorityScorer.ts

Retrieval combines six signals into a composite score:

Signal	Weight	Range	Computation
`strength`	0.25	0-1	`S₀ * e^(-dt / stability)`
`similarity`	0.35	0-1	Cosine similarity from vector search
`recency`	0.10	0-1	`(e^(-elapsed / halfLife)) / 0.2` (normalised)
`emotionalCongruence`	0.15	0-1	`max(0, moodValence * traceValence) / 0.25` (normalised)
`graphActivation`	0.10	0-1	Spreading activation score (0 without graph)
`importance`	0.05	0-1	`confidence * 0.5 + 0.5`

Composite score:

score = clamp(0, 1,
  w_str * strengthScore +
  w_sim * similarityScore +
  w_rec * recencyNorm +
  w_emo * emotionalNorm +
  w_graph * graphActivation +
  w_imp * importanceScore
)

Setting neutralMood: true in retrieval options disables emotional congruence bias (useful for factual lookups).
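A minimal sketch of the composite, using the default weights from the table, is given here. It takes signals that are already normalised to [0, 1], and it assumes that `neutralMood` simply zeroes the emotional term; `compositeScore` and `RetrievalSignals` are hypothetical names.

```typescript
// All fields are assumed to be pre-normalised to [0, 1].
interface RetrievalSignals {
  strength: number;
  similarity: number;
  recency: number;
  emotionalCongruence: number;
  graphActivation: number; // 0 when no graph is configured
  importance: number;
}

const DEFAULT_WEIGHTS: RetrievalSignals = {
  strength: 0.25,
  similarity: 0.35,
  recency: 0.10,
  emotionalCongruence: 0.15,
  graphActivation: 0.10,
  importance: 0.05,
};

// importance = confidence * 0.5 + 0.5, per the table
function importanceFromConfidence(confidence: number): number {
  return confidence * 0.5 + 0.5;
}

function compositeScore(s: RetrievalSignals, neutralMood: boolean = false): number {
  const w = DEFAULT_WEIGHTS;
  // Assumption: neutralMood drops the emotional term rather than re-weighting the rest.
  const emotional = neutralMood ? 0 : s.emotionalCongruence;
  const raw =
    w.strength * s.strength +
    w.similarity * s.similarity +
    w.recency * s.recency +
    w.emotionalCongruence * emotional +
    w.graphActivation * s.graphActivation +
    w.importance * s.importance;
  return Math.min(1, Math.max(0, raw));
}

const sampleSignals: RetrievalSignals = {
  strength: 0.6,
  similarity: 0.8,
  recency: 0.5,
  emotionalCongruence: 0.2,
  graphActivation: 0,
  importance: importanceFromConfidence(0.5), // 0.75
};
// 0.15 + 0.28 + 0.05 + 0.03 + 0 + 0.0375 = 0.5475
console.log(compositeScore(sampleSignals).toFixed(4));       // "0.5475"
console.log(compositeScore(sampleSignals, true).toFixed(4)); // "0.5175"
```

The default weights sum to 1.0, so the clamp only matters when custom weights are supplied.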

Tip-of-the-Tongue Detection

Traces with high vector similarity (>0.6) but low strength (<0.3) or low confidence (<0.4) are returned as PartiallyRetrievedTrace — the agent "almost" remembers them. These include suggestedCues (tags) to help the user provide more context.
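The thresholds above can be read as one predicate. This sketch assumes the grouping "high similarity AND (low strength OR low confidence)"; `isTipOfTheTongue` is a hypothetical name.

```typescript
// Assumed reading: similarity must be high, and at least one of strength/confidence must be low.
function isTipOfTheTongue(similarity: number, strength: number, confidence: number): boolean {
  return similarity > 0.6 && (strength < 0.3 || confidence < 0.4);
}

console.log(isTipOfTheTongue(0.75, 0.1, 0.9)); // true  (relevant but faded)
console.log(isTipOfTheTongue(0.75, 0.8, 0.2)); // true  (relevant but poorly sourced)
console.log(isTipOfTheTongue(0.75, 0.8, 0.9)); // false (a normal, full retrieval)
console.log(isTipOfTheTongue(0.40, 0.1, 0.2)); // false (not similar enough to surface)
```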

Working Memory (Baddeley's Model)

Source: src/memory/core/working/CognitiveWorkingMemory.ts

Working memory is a slot-based, capacity-limited buffer that tracks what the agent is currently "thinking about."

Capacity

Base capacity follows Miller's number (7), modulated by personality:

High openness (>0.6): +1 slot (broader attention span)
High conscientiousness (>0.6): -1 slot (deeper focus per item)
Result clamped to [5, 9] (Miller's 7 plus/minus 2)
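The capacity rules can be sketched as follows; `slotCapacity` is a hypothetical helper name, and trait values are assumed to lie in [0, 1].

```typescript
function slotCapacity(openness: number, conscientiousness: number, base: number = 7): number {
  let capacity = base;
  if (openness > 0.6) capacity += 1;          // broader attention span
  if (conscientiousness > 0.6) capacity -= 1; // deeper focus per item
  return Math.min(9, Math.max(5, capacity));  // clamp to 7 plus/minus 2
}

console.log(slotCapacity(0.7, 0.5)); // 8
console.log(slotCapacity(0.5, 0.8)); // 6
console.log(slotCapacity(0.7, 0.8)); // 7 (the two adjustments cancel)
```

With the default base of 7 the result stays within 6 to 8; the [5, 9] clamp only comes into play when a different base capacity is configured.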

Slot Mechanics

Each WorkingMemorySlot tracks:

Field	Range	Purpose
`activationLevel`	0-1	How "in focus" this item is
`attentionWeight`	0-1	Proportional share of attention (normalised)
`rehearsalCount`	0+	Maintenance rehearsal bumps (+0.15 per rehearse)
`enteredAt`	Unix ms	When the trace entered working memory

Activation Lifecycle

Focus: New trace enters at initialActivation (default 0.8). If at capacity, lowest-activation slot is evicted first.
Rehearsal: rehearse(slotId) bumps activation by 0.15 (capped at 1.0).
Decay: Each turn, all activations decrease by activationDecayRate (default 0.1).
Eviction: Slots below minActivation (default 0.15) are evicted. The onEvict callback can encode evicted items back to long-term memory.
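The four lifecycle steps can be simulated with a few small functions. This is a sketch using the defaults quoted above, not the `CognitiveWorkingMemory` implementation; the function names and the simplified `Slot` shape are assumptions.

```typescript
interface Slot {
  slotId: string;
  activationLevel: number;
}

const WM_DEFAULTS = {
  capacity: 7,
  initialActivation: 0.8,
  rehearsalBump: 0.15,
  activationDecayRate: 0.1,
  minActivation: 0.15,
};

// Focus: add a trace, evicting the lowest-activation slot first when at capacity.
function focus(slots: Slot[], slotId: string, onEvict: (s: Slot) => void): Slot[] {
  const next = slots.slice().sort((a, b) => a.activationLevel - b.activationLevel);
  if (next.length >= WM_DEFAULTS.capacity) {
    const lowest = next.shift();
    if (lowest) onEvict(lowest);
  }
  next.push({ slotId, activationLevel: WM_DEFAULTS.initialActivation });
  return next;
}

// Rehearsal: bump one slot's activation, capped at 1.0.
function rehearse(slots: Slot[], slotId: string): Slot[] {
  return slots.map((s) =>
    s.slotId === slotId
      ? { ...s, activationLevel: Math.min(1, s.activationLevel + WM_DEFAULTS.rehearsalBump) }
      : s,
  );
}

// Decay + eviction: run once per turn.
function decayTurn(slots: Slot[], onEvict: (s: Slot) => void): Slot[] {
  const kept: Slot[] = [];
  for (const s of slots) {
    const decayed = { ...s, activationLevel: s.activationLevel - WM_DEFAULTS.activationDecayRate };
    if (decayed.activationLevel < WM_DEFAULTS.minActivation) onEvict(decayed);
    else kept.push(decayed);
  }
  return kept;
}

// An un-rehearsed trace entering at 0.8 drops below 0.15 on its 7th turn.
const evictedIds: string[] = [];
let wmSlots = focus([], 'mt_1234', (s) => evictedIds.push(s.slotId));
for (let turn = 0; turn < 7; turn++) {
  wmSlots = decayTurn(wmSlots, (s) => evictedIds.push(s.slotId));
}
console.log(wmSlots.length, evictedIds);
```

In the real system the `onEvict` callback is where an evicted item could be encoded back into long-term memory.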

Prompt Formatting

formatForPrompt() outputs slots sorted by activation:

- [ACTIVE] mt_1234 (activation: 0.85)
- [fading] mt_1235 (activation: 0.52)
- [weak]   mt_1236 (activation: 0.20)

Memory Store

Source: src/memory/retrieval/store/MemoryStore.ts

The MemoryStore wraps IVectorStore + IKnowledgeGraph into a unified persistence layer:

Store: Embeds content via IEmbeddingManager, upserts into vector store, records as episodic memory in knowledge graph
Query: Vector search -> decay-aware scoring -> tip-of-the-tongue detection
Access tracking: Updates spaced repetition parameters on each retrieval
Soft delete: Sets isActive = false without removing from store

Collection Naming

Collections follow the pattern {prefix}_{scope}_{scopeId}:

cogmem_user_agent-123
cogmem_thread_conv-456
cogmem_persona_helper-bot
cogmem_organization_acme-org
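The naming pattern is simple enough to express as a one-line helper. `collectionName` and the local `MemoryScope` type below are illustrative stand-ins, not confirmed exports.

```typescript
// Local stand-in for the four scopes described in this document.
type MemoryScope = 'thread' | 'user' | 'persona' | 'organization';

// Builds {prefix}_{scope}_{scopeId}; the prefix defaults to 'cogmem'.
function collectionName(scope: MemoryScope, scopeId: string, prefix: string = 'cogmem'): string {
  return `${prefix}_${scope}_${scopeId}`;
}

console.log(collectionName('thread', 'conv-456'));              // cogmem_thread_conv-456
console.log(collectionName('organization', 'acme-org', 'mem')); // mem_organization_acme-org
```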

Memory Graph

Source: src/memory/retrieval/graph/IMemoryGraph.ts

The IMemoryGraph interface abstracts over two backends:

Backend	Implementation	Use Case
`graphology`	`GraphologyMemoryGraph`	Dev/testing, in-memory, fast
`knowledge-graph`	`KnowledgeGraphMemoryGraph`	Production, wraps `IKnowledgeGraph`

Configure via graph.backend (default: 'knowledge-graph').

Edge Types

Edge Type	Meaning	Weight
`SHARED_ENTITY`	Traces mention the same entity	0.5
`TEMPORAL_SEQUENCE`	Traces created within 5 minutes	0.3
`SAME_TOPIC`	Traces share topic cluster	varies
`CONTRADICTS`	Traces contain conflicting information	varies
`SUPERSEDES`	One trace replaces another	varies
`CAUSED_BY`	Causal relationship	varies
`CO_ACTIVATED`	Traces retrieved together (Hebbian)	grows
`SCHEMA_INSTANCE`	Episodic trace is instance of semantic schema	0.6

Spreading Activation

Source: src/memory/retrieval/graph/SpreadingActivation.ts

Implements Anderson's ACT-R spreading activation model. Given seed nodes (top retrieval results), activation spreads through the graph to surface associated memories.

Algorithm (BFS)

Seed nodes start at activation = 1.0
Each hop: neighbor_activation = current * edge_weight * decayPerHop
Multi-path summation (capped at 1.0) — traces reachable by multiple paths get boosted
BFS with maxDepth (default 3) and activationThreshold (default 0.1) cutoffs
Results sorted by activation descending, capped at maxResults (default 20)

Configuration

Parameter	Default	Effect
`maxDepth`	3	Maximum hops from seed nodes
`decayPerHop`	0.5	Activation multiplier per hop
`activationThreshold`	0.1	Minimum activation to continue
`maxResults`	20	Maximum activated nodes returned
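The algorithm and defaults above can be sketched as a level-by-level BFS over a plain adjacency list. This is an illustration under stated assumptions, not the `SpreadingActivation.ts` source: the graph shape, the `spreadActivation` name, and the choice to exclude seed nodes from the output are all hypothetical.

```typescript
interface Edge { to: string; weight: number }
type AdjacencyList = Record<string, Edge[] | undefined>;
interface SpreadResult { id: string; activation: number }

interface SpreadConfig {
  maxDepth: number;
  decayPerHop: number;
  activationThreshold: number;
  maxResults: number;
}

const SPREAD_DEFAULTS: SpreadConfig = {
  maxDepth: 3,
  decayPerHop: 0.5,
  activationThreshold: 0.1,
  maxResults: 20,
};

function spreadActivation(
  graph: AdjacencyList,
  seeds: string[],
  cfg: SpreadConfig = SPREAD_DEFAULTS,
): SpreadResult[] {
  const isSeed: Record<string, boolean> = {};
  for (const s of seeds) isSeed[s] = true;

  const total: Record<string, number> = {};
  // Seed nodes start at activation 1.0.
  let frontier: SpreadResult[] = seeds.map((id) => ({ id, activation: 1.0 }));

  for (let depth = 0; depth < cfg.maxDepth && frontier.length > 0; depth++) {
    const arriving: Record<string, number> = {};
    for (const node of frontier) {
      for (const edge of graph[node.id] ?? []) {
        if (isSeed[edge.to]) continue; // assumption: seeds are not re-activated
        const passed = node.activation * edge.weight * cfg.decayPerHop;
        if (passed < cfg.activationThreshold) continue; // too weak to continue
        arriving[edge.to] = (arriving[edge.to] ?? 0) + passed;
      }
    }
    frontier = Object.keys(arriving).map((id) => ({
      id,
      activation: Math.min(1, arriving[id]),
    }));
    // Multi-path summation, capped at 1.0.
    for (const node of frontier) {
      total[node.id] = Math.min(1, (total[node.id] ?? 0) + node.activation);
    }
  }

  return Object.keys(total)
    .map((id) => ({ id, activation: total[id] }))
    .sort((a, b) => b.activation - a.activation)
    .slice(0, cfg.maxResults);
}

const demoGraph: AdjacencyList = {
  a: [{ to: 'b', weight: 0.8 }, { to: 'c', weight: 0.6 }],
  b: [{ to: 'c', weight: 0.8 }],
  c: [{ to: 'd', weight: 0.5 }],
};
// b: 1 * 0.8 * 0.5 = 0.4; c: 0.3 directly from a, plus 0.4 * 0.8 * 0.5 = 0.16 via b.
// d is never reached: every path into it falls below the 0.1 threshold.
console.log(spreadActivation(demoGraph, ['a']).map((n) => n.id)); // [ 'c', 'b' ]
```

In the example, `c` outranks `b` even though its direct edge from the seed is weaker, because it is reachable by two paths and the contributions add up.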

Hebbian Learning

After retrieval, co-retrieved memories are recorded via recordCoActivation(). This strengthens CO_ACTIVATED edges between memories that are frequently retrieved together, implementing the Hebbian rule: "neurons that fire together wire together."

The learning rate (default 0.1) controls how quickly edge weights grow.

Observer/Reflector System

Memory Observer

Source: src/memory/pipeline/observation/MemoryObserver.ts

The observer monitors accumulated conversation tokens via a buffer. When the threshold is reached (default: 30,000 tokens), it extracts concise observation notes via a persona-configured LLM.

Personality bias in observation:

High Trait	Observer Focus
Emotionality	Emotional shifts, tone changes, sentiment transitions
Conscientiousness	Commitments, deadlines, action items, structured plans
Openness	Creative tangents, novel ideas, exploratory topics
Agreeableness	User preferences, rapport cues, communication style
Honesty	Corrections, retractions, contradictions

Observation notes are typed: factual, emotional, commitment, preference, creative, correction.

Memory Reflector

Source: src/memory/pipeline/observation/MemoryReflector.ts

The reflector consolidates accumulated observation notes into long-term memory traces. It activates when the accumulated note tokens exceed a threshold (default: 40,000 tokens).

Pipeline:

Merge redundant observations
Elevate important facts to long-term traces
Detect conflicts against existing memories
Resolve conflicts based on personality:
- High honesty: prefer newer information, supersede old
- High agreeableness: keep both versions, note discrepancy
- Default: prefer higher confidence

Target compression: 5-40x (many observations -> few high-quality traces).

Personality also controls memory style:

High conscientiousness: structured, well-organized traces
High openness: rich, associative traces with connections
Default: concise, factual traces

Prospective Memory

Source: src/memory/retrieval/prospective/ProspectiveMemoryManager.ts

Prospective memory handles future intentions — "remember to do X when Y happens."

Trigger Types

Type	Fires When	Example
`time_based`	Current time >= `triggerAt`	"Remind me at 3pm"
`event_based`	Named event in `context.events`	"When user mentions deployment"
`context_based`	Query embedding similarity > threshold	"When we discuss pricing"

Registration

await manager.register({
  content: 'Remind user about the PR review',
  triggerType: 'time_based',
  triggerAt: Date.now() + 3_600_000, // 1 hour
  importance: 0.8,
  recurring: false,
});

Checking

Prospective items are checked each turn before prompt construction. Triggered items are injected into the "Reminders" section of the assembled memory context. Items can be recurring (they re-trigger) or one-shot (marked triggered after firing).

Context-based triggers use cosine similarity between the cue embedding and the current query embedding, with a configurable similarityThreshold (default 0.7).
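The three trigger checks can be sketched as a single predicate. The field names `eventName`, `cueEmbedding`, and `queryEmbedding`, along with the `shouldFire` function itself, are hypothetical; only `triggerAt`, `context.events`, and the 0.7 default come from the text above.

```typescript
type Trigger =
  | { triggerType: 'time_based'; triggerAt: number }
  | { triggerType: 'event_based'; eventName: string }
  | { triggerType: 'context_based'; cueEmbedding: number[]; similarityThreshold?: number };

interface TurnContext {
  now: number;               // Unix ms
  events: string[];          // named events observed this turn
  queryEmbedding?: number[]; // embedding of the current query, if available
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function shouldFire(trigger: Trigger, ctx: TurnContext): boolean {
  switch (trigger.triggerType) {
    case 'time_based':
      return ctx.now >= trigger.triggerAt;
    case 'event_based':
      return ctx.events.indexOf(trigger.eventName) !== -1;
    case 'context_based': {
      if (!ctx.queryEmbedding) return false;
      const threshold = trigger.similarityThreshold ?? 0.7;
      return cosineSimilarity(trigger.cueEmbedding, ctx.queryEmbedding) > threshold;
    }
  }
}

const turnCtx: TurnContext = { now: 2_000, events: ['deployment'], queryEmbedding: [1, 0, 0] };
console.log(shouldFire({ triggerType: 'time_based', triggerAt: 1_000 }, turnCtx));            // true
console.log(shouldFire({ triggerType: 'event_based', eventName: 'pricing' }, turnCtx));       // false
console.log(shouldFire({ triggerType: 'context_based', cueEmbedding: [0, 1, 0] }, turnCtx));  // false
```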

Consolidation Pipeline

Source: src/memory/pipeline/consolidation/ConsolidationPipeline.ts

Runs periodically (default: every hour) to maintain memory health. Five steps:

Step 1: Decay Sweep

Apply Ebbinghaus curve to all traces, soft-delete those below pruningThreshold (default 0.05). Emotional memories (intensity > 0.3) are protected.

Step 2: Co-Activation Replay

Process recent traces (last 24 hours) to create graph edges:

SHARED_ENTITY: Traces mentioning the same entity get connected (weight 0.5)
TEMPORAL_SEQUENCE: Traces created within 5 minutes get connected (weight 0.3)

Step 3: Schema Integration

Use detectClusters() on the memory graph (minimum cluster size: 5). For each cluster, invoke an LLM to summarize member traces into a single semantic knowledge node. Connect via SCHEMA_INSTANCE edges.

Step 4: Conflict Resolution

Scan CONTRADICTS edges and resolve based on personality:

High honesty (>0.6): Prefer newer information, soft-delete the older trace
Default: Prefer higher confidence (only if confidence difference >0.2)

Step 5: Spaced Repetition

Find traces past their nextReinforcementAt timestamp and boost them via recordAccess(), which increases stability and doubles the reinforcement interval.

Result

interface ConsolidationResult {
  prunedCount: number;        // Traces soft-deleted
  edgesCreated: number;       // Graph edges created
  schemasCreated: number;     // Semantic schemas from clusters
  conflictsResolved: number;  // Contradictions resolved
  reinforcedCount: number;    // Traces reinforced
  totalProcessed: number;     // Total traces examined
  durationMs: number;         // Cycle duration
}

Prompt Assembly

Source: src/memory/core/prompt/MemoryPromptAssembler.ts

Assembles memory context into a single formatted string within a token budget, split across six sections with overflow redistribution.

Default Budget Allocation

Section	Budget %	Content
Working Memory	15%	Active context from slot buffer
Semantic Recall	45%	Retrieved semantic/procedural traces
Recent Episodic	25%	Retrieved episodic traces
Prospective Alerts	5%	Triggered reminders (Batch 2)
Graph Associations	5%	Spreading activation context (Batch 2)
Observation Notes	5%	Recent observer notes (Batch 2)

Overflow Redistribution

If a section uses less than its budget, the overflow flows to Semantic Recall. If Batch 2 sections are empty (no observer, no graph, no prospective items), their budgets are also redistributed to Semantic Recall.
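A minimal sketch of the redistribution rule follows, assuming a single pass in which every unused token flows to Semantic Recall. `allocateBudget`, the section keys, and the `needed` input (tokens each section would use if unconstrained) are hypothetical.

```typescript
type Section =
  | 'workingMemory'
  | 'semanticRecall'
  | 'recentEpisodic'
  | 'prospectiveAlerts'
  | 'graphAssociations'
  | 'observationNotes';

// Default split in whole percent (sums to 100).
const DEFAULT_SPLIT: Record<Section, number> = {
  workingMemory: 15,
  semanticRecall: 45,
  recentEpisodic: 25,
  prospectiveAlerts: 5,
  graphAssociations: 5,
  observationNotes: 5,
};

const NON_SEMANTIC: Section[] = [
  'workingMemory',
  'recentEpisodic',
  'prospectiveAlerts',
  'graphAssociations',
  'observationNotes',
];

function allocateBudget(totalTokens: number, needed: Record<Section, number>): Record<Section, number> {
  const granted: Record<Section, number> = {
    workingMemory: 0, semanticRecall: 0, recentEpisodic: 0,
    prospectiveAlerts: 0, graphAssociations: 0, observationNotes: 0,
  };
  let spent = 0;
  for (const section of NON_SEMANTIC) {
    const cap = Math.floor((totalTokens * DEFAULT_SPLIT[section]) / 100);
    granted[section] = Math.min(cap, needed[section]); // empty sections take 0
    spent += granted[section];
  }
  // Semantic Recall gets its own share plus everything the other sections left unused.
  granted.semanticRecall = totalTokens - spent;
  return granted;
}

// Core-only agent: no observer, graph, or prospective items configured.
const coreOnly = allocateBudget(1000, {
  workingMemory: 100,   // under its 150-token cap
  semanticRecall: 2000,
  recentEpisodic: 400,  // over its 250-token cap
  prospectiveAlerts: 0,
  graphAssociations: 0,
  observationNotes: 0,
});
console.log(coreOnly.workingMemory, coreOnly.recentEpisodic, coreOnly.semanticRecall); // 100 250 650
```

In this example Semantic Recall grows from its nominal 450 tokens to 650: 50 unused by Working Memory plus 150 from the three empty Batch 2 sections.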

Personality -> Formatting Style

The assembler selects a formatting style based on the dominant HEXACO trait:

Dominant Trait	Style	Output
Conscientiousness	`structured`	Bullet points, categories
Openness	`narrative`	Flowing prose, connections
Emotionality	`emotional`	Emphasis on feelings, tone

Output Sections

## Active Context
- [ACTIVE] mt_1234 (activation: 0.85)

## Relevant Memories
- [semantic, score=0.82] User prefers TypeScript...

## Recent Experiences
- [episodic, score=0.71] Discussed deployment on Tuesday...

## Reminders
- [time_based] PR review is due

## Related Context
- [associated, activation=0.45] Related discussion about CI/CD...

## Observations
- User tends to ask follow-up questions about error handling

Token estimation uses a heuristic of roughly 4 characters per token.

Configuration

CognitiveMemoryConfig (Top-Level)

interface CognitiveMemoryConfig {
  // --- Required dependencies ---
  workingMemory: IWorkingMemory;      // Existing AgentOS working memory
  knowledgeGraph: IKnowledgeGraph;    // Existing AgentOS knowledge graph
  vectorStore: IVectorStore;          // Vector store for embeddings
  embeddingManager: IEmbeddingManager; // Embedding generation

  // --- Agent identity ---
  agentId: string;
  traits: HexacoTraits;              // { honesty, emotionality, extraversion, agreeableness, conscientiousness, openness }
  moodProvider: () => PADState;      // Callback to get current mood

  // --- Feature detection ---
  featureDetectionStrategy: 'keyword' | 'llm' | 'hybrid'; // Default: 'keyword'
  featureDetectionLlmInvoker?: (systemPrompt: string, userPrompt: string) => Promise<string>;

  // --- Tuning ---
  encoding?: Partial<EncodingConfig>;        // See defaults below
  decay?: Partial<DecayConfig>;              // See defaults below
  workingMemoryCapacity?: number;            // Default: 7 (Miller's number)
  tokenBudget?: Partial<MemoryBudgetAllocation>;
  collectionPrefix?: string;                 // Default: 'cogmem'

  // --- Batch 2 (optional, no-op when absent) ---
  observer?: Partial<ObserverConfig>;
  reflector?: Partial<ReflectorConfig>;
  graph?: Partial<MemoryGraphConfig>;
  consolidation?: Partial<ConsolidationConfig>;
}

Encoding Defaults

Parameter	Default	Description
`baseStrength`	0.5	Base encoding strength before modulation
`flashbulbThreshold`	0.8	Emotional intensity threshold for flashbulb
`flashbulbStrengthMultiplier`	2.0	Strength boost for flashbulb memories
`flashbulbStabilityMultiplier`	5.0	Stability boost for flashbulb memories
`baseStabilityMs`	3,600,000	Base stability (1 hour)

Decay Defaults

Parameter	Default	Description
`pruningThreshold`	0.05	Strength below which traces are pruned
`recencyHalfLifeMs`	86,400,000	Recency boost half-life (24 hours)
`interferenceThreshold`	0.7	Cosine similarity threshold for interference

Graph Defaults

Parameter	Default	Description
`backend`	`'knowledge-graph'`	Graph backend selection
`maxDepth`	3	Spreading activation max hops
`decayPerHop`	0.5	Activation decay per hop
`activationThreshold`	0.1	Minimum activation to continue
`hebbianLearningRate`	0.1	Co-activation edge strengthening rate

Consolidation Defaults

Parameter	Default	Description
`intervalMs`	3,600,000	Run interval (1 hour)
`maxTracesPerCycle`	500	Max traces per cycle
`mergeSimilarityThreshold`	0.92	Similarity threshold for merging
`minClusterSize`	5	Min cluster size for schema integration

Quick Start

Minimal setup with core features (no LLM calls, no Batch 2):

import { CognitiveMemoryManager } from '@framers/agentos/memory';

const memory = new CognitiveMemoryManager();

await memory.initialize({
  workingMemory: existingWorkingMemory,
  knowledgeGraph: existingKnowledgeGraph,
  vectorStore: existingVectorStore,
  embeddingManager: existingEmbeddingManager,
  agentId: 'my-agent',
  traits: { openness: 0.7, conscientiousness: 0.8, emotionality: 0.5 },
  moodProvider: () => ({ valence: 0, arousal: 0.3, dominance: 0 }),
  featureDetectionStrategy: 'keyword',
});

// Encode a user message
const mood = { valence: 0.2, arousal: 0.4, dominance: 0 };
const trace = await memory.encode(
  'I prefer deploying with Docker Compose',
  mood,
  'content',
  { type: 'semantic', scope: 'user', tags: ['deployment', 'docker'] },
);

// Retrieve relevant memories before prompt construction
const result = await memory.retrieve('How should I deploy?', mood, { topK: 5 });

// Assemble for prompt injection (1000 token budget)
const context = await memory.assembleForPrompt('How should I deploy?', 1000, mood);
console.log(context.contextText);    // Formatted memory context
console.log(context.tokensUsed);     // Actual tokens used

Full setup with all Batch 2 features:

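import OpenAI from 'openai';

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

// Shared invoker for the observer and reflector: (system, user) => completion text
const llmInvoker = async (system: string, user: string) => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'system', content: system }, { role: 'user', content: user }],
  });
  return response.choices[0].message.content ?? '';
};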

await memory.initialize({
  // ... core config as above ...
  observer: { activationThresholdTokens: 30_000, llmInvoker },
  reflector: { activationThresholdTokens: 40_000, llmInvoker },
  graph: { backend: 'knowledge-graph', maxDepth: 3, decayPerHop: 0.5 },
  consolidation: { intervalMs: 3_600_000, minClusterSize: 5 },
});

// Observer: feed each message
await memory.observe('user', 'I need to deploy by Friday', mood);
await memory.observe('assistant', 'I can help with that deployment.', mood);

// Prospective: register a reminder
const pm = memory.getProspective();
await pm.register({
  content: 'User needs deployment done by Friday',
  triggerType: 'time_based',
  triggerAt: fridayTimestamp,
  importance: 0.9,
  recurring: false,
});

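// Consolidation runs automatically on a timer, or manually.
// Named `consolidation` so it does not clash with `result` from the retrieve() call above.
const consolidation = await memory.runConsolidation();
console.log(`Pruned ${consolidation.prunedCount}, created ${consolidation.schemasCreated} schemas`);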

Integration with GMI

The Cognitive Memory System integrates into the GMI turn loop at three points:

After User Message (Encode)

// In the GMI turn handler, after receiving user input:
const mood = moodEngine.getCurrentState();
await cognitiveMemory.encode(userMessage, mood, gmiMood, {
  type: 'episodic',
  scope: 'user',
  scopeId: userId,
  sourceType: 'user_statement',
});

Before Prompt Construction (Retrieve + Assemble)

// Before building the system prompt:
const memoryContext = await cognitiveMemory.assembleForPrompt(
  userMessage,
  tokenBudget,
  mood,
);
// Inject memoryContext.contextText into the prompt via PromptBuilder

After Response (Observe)

// After the LLM generates a response:
await cognitiveMemory.observe('assistant', assistantResponse, mood);

// Also feed user messages to observer for conversation monitoring:
await cognitiveMemory.observe('user', userMessage, mood);

Comparison with Mastra

The Cognitive Memory System addresses 12 limitations in Mastra's memory architecture:

#	Mastra Limitation	AgentOS Improvement
1	Flat strength (all memories equal)	HEXACO-modulated encoding strength with Yerkes-Dodson arousal curve
2	No forgetting	Ebbinghaus exponential decay with configurable stability
3	No spaced repetition	Desirable difficulty effect with interval doubling
4	No working memory limits	Baddeley's model with personality-modulated capacity (5-9 slots)
5	No emotional context	PAD model snapshot at encoding, mood-congruent retrieval bias
6	Single retrieval signal (similarity)	6-signal composite scoring (strength, similarity, recency, emotion, graph, importance)
7	No memory graph	IMemoryGraph with 8 edge types and spreading activation
8	No interference modeling	Proactive and retroactive interference with configurable thresholds
9	No consolidation	5-step pipeline: decay sweep, replay, schema integration, conflict resolution, reinforcement
10	No prospective memory	Time, event, and context-based triggers with recurring support
11	No observer/reflector	Personality-biased observation + LLM-driven consolidation into traces
12	No provenance tracking	Full source monitoring with confidence, verification count, and contradiction detection
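
Rows 2 and 3 rest on the decay formula stated earlier on this page, S(t) = S₀ · e^(-Δt / stability). The sketch below works that formula through with concrete numbers. `decayedStrength` is a hypothetical helper written for this example and is not the `computeCurrentStrength` export; the second call simply shows what a doubled stability would do to the same trace, without claiming that AgentOS doubles stability on every retrieval.

```typescript
const HOUR_MS = 3_600_000;
const DAY_MS = 24 * HOUR_MS;

// S(t) = S0 * exp(-elapsed / stability), with both durations in milliseconds.
function decayedStrength(
  initialStrength: number,
  stabilityMs: number,
  elapsedMs: number,
): number {
  if (stabilityMs <= 0) throw new RangeError('stabilityMs must be positive');
  const dt = Math.max(0, elapsedMs); // clock skew should not boost strength
  return initialStrength * Math.exp(-dt / stabilityMs);
}

// A trace encoded at strength 0.8 with a one-day stability, checked a day later.
const afterOneDay = decayedStrength(0.8, DAY_MS, DAY_MS); // 0.8 / e, about 0.294

// The same trace if its stability had been doubled by earlier retrievals.
const withDoubledStability = decayedStrength(0.8, 2 * DAY_MS, DAY_MS); // about 0.485
```

The comparison shows why growth in stability matters more than the initial strength over long horizons: the exponent, not the multiplier, controls how quickly a trace approaches the pruning range.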

Source Files

All source lives in packages/agentos/src/memory/:

File	Export
`types.ts`	All types: `MemoryTrace`, `MemoryType`, `MemoryScope`, `ScoredMemoryTrace`, etc.
`config.ts`	`CognitiveMemoryConfig`, `EncodingConfig`, `DecayConfig`, defaults
`CognitiveMemoryManager.ts`	`CognitiveMemoryManager`, `ICognitiveMemoryManager`
`encoding/EncodingModel.ts`	`computeEncodingStrength`, `yerksDodson`, `buildEmotionalContext`
`encoding/ContentFeatureDetector.ts`	`createFeatureDetector`, `IContentFeatureDetector`
`decay/DecayModel.ts`	`computeCurrentStrength`, `updateOnRetrieval`, `computeInterference`
`decay/RetrievalPriorityScorer.ts`	`scoreAndRankTraces`, `detectPartiallyRetrieved`
`working/CognitiveWorkingMemory.ts`	`CognitiveWorkingMemory`
`store/MemoryStore.ts`	`MemoryStore`
`prompt/MemoryPromptAssembler.ts`	`assembleMemoryContext`
`prompt/MemoryFormatters.ts`	`formatMemoryTrace`, `FormattingStyle`
`graph/IMemoryGraph.ts`	`IMemoryGraph`, `MemoryEdgeType`, `ActivatedNode`
`graph/SpreadingActivation.ts`	`spreadActivation`
`graph/GraphologyMemoryGraph.ts`	`GraphologyMemoryGraph`
`graph/KnowledgeGraphMemoryGraph.ts`	`KnowledgeGraphMemoryGraph`
`observation/MemoryObserver.ts`	`MemoryObserver`, `ObservationNote`
`observation/MemoryReflector.ts`	`MemoryReflector`, `MemoryReflectionResult`
`observation/ObservationBuffer.ts`	`ObservationBuffer`
`prospective/ProspectiveMemoryManager.ts`	`ProspectiveMemoryManager`, `ProspectiveMemoryItem`
`consolidation/ConsolidationPipeline.ts`	`ConsolidationPipeline`, `ConsolidationResult`

Relationship to Persistent Working Memory

AgentOS provides two complementary working memory systems:

	Baddeley Cognitive Working Memory	Persistent Markdown Working Memory
Purpose	In-session attention modeling	Cross-session user context
Lifespan	Single session (in-memory)	Persists on disk (~/.agentos/agents/{id}/working-memory.md)
Updates	Automatic activation decay	Agent calls `update_working_memory` tool
Format	Capacity-limited slots (7±2)	Free-form markdown template
Budget	15% of prompt tokens	5% of prompt tokens

Both are injected into the system prompt simultaneously. The persistent memory appears as ## Persistent Memory before the cognitive slots. See Persistent Working Memory for details.
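
As a rough illustration of the budget row, the sketch below splits a prompt token budget using the 15% and 5% shares from the table. It assumes both percentages apply to the same total prompt budget and that fractional tokens are rounded down; `splitWorkingMemoryBudget` and `WorkingMemoryBudgets` are hypothetical names, not part of the AgentOS API.

```typescript
interface WorkingMemoryBudgets {
  cognitiveTokens: number;   // Baddeley cognitive working memory slots (15%)
  persistentTokens: number;  // persistent markdown working memory (5%)
  remainingTokens: number;   // everything else in the prompt
}

function splitWorkingMemoryBudget(promptTokens: number): WorkingMemoryBudgets {
  // Integer arithmetic avoids floating-point surprises from 0.15 and 0.05.
  const cognitiveTokens = Math.floor((promptTokens * 15) / 100);
  const persistentTokens = Math.floor((promptTokens * 5) / 100);
  return {
    cognitiveTokens,
    persistentTokens,
    remainingTokens: promptTokens - cognitiveTokens - persistentTokens,
  };
}

const budgets = splitWorkingMemoryBudget(8000);
// budgets.cognitiveTokens is 1200, budgets.persistentTokens is 400,
// and budgets.remainingTokens is 6400.
```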

References

The runtime constants, formulas, weights, and design decisions on this page are grounded in the cognitive-science and information-retrieval literature listed below. Citations appear inline throughout the page; this section consolidates them for review and audit.

Cognitive science foundations

Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). Academic Press. — Multi-store memory model. Wikipedia summary
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–89). Academic Press. — Working memory model with slot-based capacity. ScienceDirect
Baddeley, A. D. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience, 4(10), 829–839. — Updated synthesis. DOI
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). Academic Press. — Episodic / semantic distinction in long-term memory; the procedural category of the LTM taxonomy comes from later work. APA PsycNet
Ebbinghaus, H. (1885). Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie (English: Memory: A Contribution to Experimental Psychology, 1913 trans. Ruger & Bussenius). Duncker & Humblot. — Source of the forgetting curve, modeled on this page in the exponential form S(t) = S₀ · e^(-Δt / stability). Project Gutenberg (1913 trans.)
Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18(5), 459–482. — Inverted-U arousal curve. Wiley
Brown, R., & Kulik, J. (1977). Flashbulb memories. Cognition, 5(1), 73–99. — Flashbulb memory phenomenon. APA PsycNet
Bower, G. H. (1981). Mood and memory. American Psychologist, 36(2), 129–148. — Mood-congruent encoding. APA DOI
Anderson, J. R. (1983). A spreading activation theory of memory. Journal of Verbal Learning and Verbal Behavior, 22(3), 261–295. — ACT-R spreading activation. APA PsycNet · ACT-R home
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. Wiley. — Hebbian learning, commonly summarized as "cells that fire together wire together." Wikipedia summary
Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114(1), 3–28. — Source-monitoring framework underpinning the per-source decay multipliers. APA PsycNet

Personality structure

Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11(2), 150–166. — HEXACO six-factor model. SAGE Journals

Retrieval-augmented generation

Gao, L., Ma, X., Lin, J., & Callan, J. (2022). Precise zero-shot dense retrieval without relevance labels. arXiv preprint. — HyDE retrieval. arXiv:2212.10496
Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A., Truitt, S., & Larson, J. (2024). From local to global: A graph RAG approach to query-focused summarization. arXiv preprint. — Microsoft GraphRAG. arXiv:2404.16130

Cognitive architectures for language agents

Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. arXiv preprint. — Smallville generative agents — the canonical "persona + memory + reflection" demo. arXiv:2304.03442
Sumers, T. R., Yao, S., Narasimhan, K., & Griffiths, T. L. (2023). Cognitive architectures for language agents. arXiv preprint. — CoALA framework that AgentOS's memory taxonomy follows. arXiv:2309.02427

Benchmarks

Wu, D., Wang, H., Yu, W., Zhang, Y., Chang, K.-W., & Yu, D. (2024). LongMemEval: Benchmarking chat assistants on long-term interactive memory. ICLR 2025. — The benchmark agentos-bench reports against. arXiv:2410.10813

Implementation references

Source files cited inline:

packages/agentos/src/memory/CognitiveMemoryManager.ts — top-level orchestrator
packages/agentos/src/memory/core/decay/DecayModel.ts — Ebbinghaus formula + spaced repetition
packages/agentos/src/memory/mechanisms/defaults.ts — eight cognitive mechanism defaults
packages/agentos/src/memory/retrieval/hyde/MemoryHydeRetriever.ts — HyDE retriever
packages/agentos/src/memory/retrieval/graph/graphrag/GraphRAGEngine.ts — GraphRAG implementation
