Cognitive Memory

Memory benchmarks (full N=500, gpt-4o reader): 85.6% on LongMemEval-S at $0.0090 per correct answer, +1.4 points above Mastra Observational Memory (84.23%). 70.2% on the LongMemEval-M 1.5M-token / 500-session haystack variant — the only open-source library on the public record above 65% on M with publicly reproducible methodology. Competitive with the strongest published M results in the LongMemEval paper (Wu et al., ICLR 2025: round Top-5 65.7%, session Top-5 71.4%, round Top-10 72.0%). Benchmarks · Run JSONs · SOTA writeup

See also

For the practical guide with usage examples and configuration, see Cognitive Memory Guide.


Why memory should forget

"It's a poor sort of memory that only works backwards."

Through the Looking-Glass, Lewis Carroll, 1871

Most agent memory systems forget nothing. They embed every message, store the vectors, and at retrieval time return whatever is closest in cosine space. This works for a few thousand turns. Past that, the system is doing something a human mind explicitly evolved not to do: treating every recorded experience as equally available, equally trustworthy, and equally relevant. The literature on biological memory is a long argument that forgetting is not a bug. It is the mechanism by which what mattered yesterday continues to matter today.

The cognitive memory system in AgentOS is built on that argument. Encoding strength is set per-trace, modulated by the personality traits of the agent doing the encoding and by the emotional intensity of the moment (Brown & Kulik, 1977 on flashbulb memories; Yerkes & Dodson, 1908 on the inverted-U arousal curve). Strength then decays exponentially with time on Hermann Ebbinghaus's 1885 forgetting curve S(t) = S₀ · e^(-Δt / stability), accelerated by interference from new similar memories and slowed by successful retrieval (the desirable-difficulty effect — harder retrievals grow stability more). Working memory is bounded by Baddeley's slot model of seven-plus-or-minus-two, modulated by traits. Retrieval combines six signals into a composite score: vector similarity, current strength, recency, emotional congruence with the agent's mood, graph spreading-activation in the ACT-R tradition (Anderson, 1983), and importance. The graph itself learns: co-retrieval of two traces tightens the edge between them via Hebbian weight updates ("neurons that fire together wire together").

The result is a memory that behaves more like a person remembering. The agent forgets the irrelevant. It holds onto what hit it hard. It pulls the thing that's adjacent in concept-space, not just the thing that's adjacent in vector-space. And — because every mechanism is HEXACO-modulated — the same input encodes differently depending on who is doing the remembering.

Eight cognitive mechanisms layered on top

On top of the encoding/decay/retrieval substrate, the runtime ships eight optional neuroscience-grounded mechanisms — reconsolidation, retrieval-induced forgetting, involuntary recall, metacognitive feeling-of-knowing, temporal gist, schema encoding, source-confidence decay, and emotion regulation. All HEXACO-personality-modulated and individually configurable via cognitiveMechanisms on CognitiveMemoryConfig. See the Cognitive Mechanisms Implementation Guide for hook points, APIs, and testing.

Overview

The Cognitive Memory System models memory as a dynamic, personality-modulated process rather than a flat key-value store:

  • Encoding is shaped by the agent's HEXACO personality traits and current emotional state (PAD model: valence, arousal, dominance)
  • Forgetting follows the Ebbinghaus exponential decay curve, with retrieval-induced reinforcement via spaced repetition
  • Retrieval combines six weighted signals (strength, embedding similarity, recency, emotional congruence, graph activation, importance) into a composite score
  • Working memory enforces Baddeley's slot-based capacity limits (7±2), modulated by traits
  • Consolidation runs periodically to prune weak traces, merge clusters into schemas, resolve contradictions, and feed observations back into long-term storage

The system is composable. Core encoding/decay/retrieval (Batch 1) runs without any LLM calls. Advanced features (Batch 2 — observer, reflector, graph, consolidation) activate automatically when their config is provided and degrade gracefully when absent. You can run the entire stack against a local SQLite + HNSW backend, or scale it to Postgres + Neo4j without changing any callsite.

Cognitive science foundations

Each model below has a one-to-one analogue in the source. The point of the table is not to claim the runtime "uses" these papers in the loose sense — the point is that the constants, formulas, and weights you'll see in the sections below come straight from this literature.

| Model | Reference | Application in AgentOS |
| --- | --- | --- |
| Multi-store memory | Atkinson & Shiffrin, 1968 | Sensory input → working memory → long-term memory pipeline |
| Working memory model | Baddeley & Hitch, 1974; Baddeley, 2003 | Slot-based capacity limits (7±2) with activation levels |
| LTM taxonomy | Tulving, 1972 | Episodic / semantic / procedural / prospective memory types |
| Forgetting curve | Ebbinghaus, 1885 | S(t) = S₀ · e^(-Δt / stability) exponential decay |
| Arousal curve | Yerkes & Dodson, 1908 | Encoding quality peaks at moderate arousal (inverted U) |
| Flashbulb memories | Brown & Kulik, 1977 | High-emotion events create vivid, persistent traces |
| Mood-congruent encoding | Bower, 1981 | Content matching current mood valence encodes more strongly |
| Spreading activation | Anderson, 1983 (ACT-R) | BFS through associative graph with activation decay |
| Hebbian learning | Hebb, 1949 | Co-retrieval strengthens graph edges |
| HEXACO personality | Ashton & Lee, 2007 | Trait-driven encoding weights and memory capacity modulation |
| Source-monitoring framework | Johnson, Hashtroudi & Lindsay, 1993 | Different memory sources decay at different rates (provenance-aware) |
| HyDE retrieval | Gao et al., 2022 | Generate hypothetical answer, embed that, search for matches |
| GraphRAG | Microsoft Research, 2024 | Entity-graph + community summaries for multi-hop retrieval |
| Generative agents | Park et al., 2023 | Persona + memory + reflection as the long-running agent pattern |
| CoALA framework | Sumers et al., 2023 | Cognitive architectures for language agents — episodic / semantic / procedural memory typology |

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                      CognitiveMemoryManager                      │
│                     (top-level orchestrator)                     │
└──┬───────┬───────┬───────┬───────┬───────┬───────┬───────┬──────┘
   │       │       │       │       │       │       │       │
   ▼       ▼       ▼       ▼       ▼       ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────────┐
│Encod-││Decay ││Work- ││Memory││Prompt││Memory││Obser-││Consolida-│
│ ing  ││Model ││ ing  ││Store ││Assem-││Graph ││ver / ││tion Pipe-│
│Model ││      ││Memory││      ││bler  ││      ││Reflec││line      │
│      ││      ││      ││      ││      ││      ││tor   ││          │
└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───────┘
   │       │       │       │       │       │       │       │
   ▼       ▼       ▼       ▼       ▼       ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐┌──────────────┐┌──────┐┌──────┐┌──────────┐
│HEXACO││Ebbng-││Badde-││ IVectorStore ││Spread││LLM   ││Prospec-  │
│Traits││haus  ││ley   ││ + IKnowledge ││ing   ││Invok-││tive Mem- │
│+ PAD ││Curve ││Slots ││    Graph     ││Activ.││er    ││ory Mgr   │
└──────┘└──────┘└──────┘└──────────────┘└──────┘└──────┘└──────────┘

Per-turn data flow (GMI integration):

User Message arrives
1. encode() — Create MemoryTrace from input (personality-modulated strength)
2. retrieve() — Query vector store + score with 6-signal composite
3. assembleForPrompt — Token-budgeted context assembly → inject into system prompt
4. [LLM generates response]
5. observe() — Feed response to observer buffer (Batch 2)
6. checkProspective — Check time/event/context triggers (Batch 2)
7. runConsolidation — Periodic background sweep (Batch 2, timer-based)

Memory Types

Based on Tulving's long-term memory taxonomy with extensions:

| Type | Cognitive Model | AgentOS Usage | Example |
| --- | --- | --- | --- |
| episodic | Autobiographical events | Conversation events, interactions | "User asked about deployment on Tuesday" |
| semantic | General knowledge/facts | Learned facts, preferences, schemas | "User prefers TypeScript over Python" |
| procedural | Skills and how-to | Workflows, tool usage patterns | "To deploy, run the deployment pipeline" |
| prospective | Future intentions | Goals, reminders, planned actions | "Remind user about the PR review" |

Memory Scopes

Each trace is scoped to control visibility and ownership:

| Scope | Visibility | Persistence | Use Case |
| --- | --- | --- | --- |
| thread | Single conversation | Conversation lifetime | In-conversation working context |
| user | All conversations with a user | Long-term | User preferences, facts, history |
| persona | All users of a persona | Long-term | Persona's learned knowledge |
| organization | All agents in an org | Long-term | Shared organizational knowledge |

Collections in the vector store are named {prefix}_{scope}_{scopeId} (default prefix: cogmem).


The MemoryTrace Envelope

Every memory is wrapped in a MemoryTrace — the universal envelope carrying content, provenance, emotional context, and decay parameters:

| Field Group | Key Fields | Purpose |
| --- | --- | --- |
| Identity | id, type, scope, scopeId | Classification and routing |
| Content | content, structuredData, entities, tags | The actual memory data |
| Provenance | sourceType, sourceId, confidence, verificationCount, contradictedBy | Source monitoring to prevent confabulation |
| Emotional Context | valence, arousal, dominance, intensity, gmiMood | PAD snapshot at encoding time |
| Decay Parameters | encodingStrength (S₀), stability (τ), retrievalCount, lastAccessedAt | Ebbinghaus curve inputs |
| Spaced Repetition | reinforcementInterval, nextReinforcementAt | Interval-doubling schedule |
| Graph | associatedTraceIds | Links to related traces |
| Lifecycle | createdAt, updatedAt, consolidatedAt, isActive | Timestamps and soft-delete flag |

Source types: user_statement, agent_inference, tool_result, observation, reflection, external.
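
For orientation, here is the envelope rendered as a TypeScript interface, reconstructed from the table above. Field names follow the table; the exact types, optionality, and any extra fields in the real src/memory/types.ts are assumptions:

interface MemoryTrace {
  // Identity
  id: string;
  type: 'episodic' | 'semantic' | 'procedural' | 'prospective';
  scope: 'thread' | 'user' | 'persona' | 'organization';
  scopeId: string;
  // Content
  content: string;
  structuredData?: Record<string, unknown>;
  entities: string[];
  tags: string[];
  // Provenance (source monitoring)
  sourceType: 'user_statement' | 'agent_inference' | 'tool_result' | 'observation' | 'reflection' | 'external';
  sourceId?: string;
  confidence: number;            // 0-1
  verificationCount: number;
  contradictedBy: string[];      // ids of conflicting traces
  // Emotional context (PAD snapshot at encoding time)
  valence: number;               // -1..1
  arousal: number;               // 0..1
  dominance: number;             // -1..1
  intensity: number;             // 0..1
  gmiMood?: unknown;             // GMI mood snapshot (shape assumed)
  // Decay parameters (Ebbinghaus inputs)
  encodingStrength: number;      // S₀
  stability: number;             // τ, in ms
  retrievalCount: number;
  lastAccessedAt: number;        // Unix ms
  // Spaced repetition
  reinforcementInterval: number; // ms; doubles per retrieval
  nextReinforcementAt: number;   // Unix ms
  // Graph
  associatedTraceIds: string[];
  // Lifecycle
  createdAt: number;
  updatedAt: number;
  consolidatedAt?: number;
  isActive: boolean;             // soft-delete flag
}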


Encoding Model

Source: src/memory/core/encoding/EncodingModel.ts

Encoding determines how strongly a new input is committed to memory. The system combines four cognitive mechanisms:

1. HEXACO Personality -> Encoding Weights

Each HEXACO trait modulates attention to specific content features:

| Trait | Attention Weight | Formula | Effect |
| --- | --- | --- | --- |
| Openness | noveltyAttention | 0.3 + O * 0.7 | High O notices novel, creative content |
| Conscientiousness | proceduralAttention | 0.3 + C * 0.7 | High C notices procedures, structure |
| Emotionality | emotionalSensitivity | 0.2 + E * 0.8 | High E amplifies emotional content |
| Extraversion | socialAttention | 0.2 + X * 0.8 | High X notices social dynamics |
| Agreeableness | cooperativeAttention | 0.2 + A * 0.8 | High A notices cooperation cues |
| Honesty | ethicalAttention | 0.2 + H * 0.8 | High H notices ethical/moral content |

The composite attention multiplier starts at 0.5 and adds weighted bonuses for each detected content feature (0.10-0.15 each), plus a base 0.15 for contradictions and topic relevance.

2. Yerkes-Dodson Arousal Curve

Encoding quality peaks at moderate arousal (inverted U):

f(a) = 1 - 4 * (a - 0.5)^2

where a = arousal normalised to [0, 1]

The raw parabola spans [0, 1]; the implementation clamps the floor at 0.3, so the returned multiplier lies in [0.3, 1.0], peaking at a = 0.5. Very low arousal (bored) and very high arousal (panicked) both impair encoding.

3. Mood-Congruent Encoding

Content whose emotional valence matches the current mood is encoded more strongly:

boost = 1 + max(0, currentValence * contentValence) * emotionalSensitivity * 0.3

Positive product means mood and content are congruent (both positive or both negative).

4. Flashbulb Memories

When emotional intensity exceeds the threshold (default: 0.8), the memory becomes a flashbulb memory:

  • Strength multiplier: 2.0x (default)
  • Stability multiplier: 5.0x (default)

These model the vivid, persistent memories formed during highly emotional events (Brown & Kulik, 1977).

Composite Encoding Strength

S₀ = min(1.0, base * arousalBoost * emotionalBoost * attentionMultiplier * congruenceBoost * flashbulbBoost)

Default base = 0.5. The stability (time constant for decay) is computed as:

stability = baseStabilityMs * (1 + S₀ * 6) * flashbulbStabilityMultiplier

Default baseStabilityMs = 3,600,000 (1 hour). Stronger memories are inherently more stable.
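
Putting sections 1-4 together, a minimal sketch of the composite computation. Constants are the documented defaults; the function and parameter names are illustrative, not the actual exports of EncodingModel.ts:

const BASE_STRENGTH = 0.5;           // baseStrength default
const BASE_STABILITY_MS = 3_600_000; // baseStabilityMs default (1 hour)

// Inverted-U arousal multiplier, clamped to the documented 0.3 floor
function yerkesDodson(arousal: number): number {
  return Math.max(0.3, 1 - 4 * (arousal - 0.5) ** 2);
}

// Mood-congruent boost: positive product = mood and content agree in sign
function congruenceBoost(currentValence: number, contentValence: number, emotionalSensitivity: number): number {
  return 1 + Math.max(0, currentValence * contentValence) * emotionalSensitivity * 0.3;
}

function encodeStrength(opts: {
  arousal: number;              // a in [0, 1]
  currentValence: number;       // agent mood valence
  contentValence: number;       // content valence
  emotionalSensitivity: number; // from the Emotionality trait
  attentionMultiplier: number;  // composite HEXACO attention (starts at 0.5)
  emotionalBoost: number;       // intensity-driven boost computed by EncodingModel.ts
  intensity: number;            // emotional intensity in [0, 1]
}): { s0: number; stabilityMs: number } {
  const flashbulb = opts.intensity > 0.8; // flashbulbThreshold
  const s0 = Math.min(1.0,
    BASE_STRENGTH *
      yerkesDodson(opts.arousal) *
      opts.emotionalBoost *
      opts.attentionMultiplier *
      congruenceBoost(opts.currentValence, opts.contentValence, opts.emotionalSensitivity) *
      (flashbulb ? 2.0 : 1.0)); // flashbulbStrengthMultiplier
  // Stronger memories are inherently more stable; flashbulbs 5x more so
  const stabilityMs = BASE_STABILITY_MS * (1 + s0 * 6) * (flashbulb ? 5.0 : 1.0);
  return { s0, stabilityMs };
}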


Content Feature Detection

The encoding model needs to know what features the content contains. Three detection strategies are available:

| Strategy | Speed | Quality | LLM Calls | Best For |
| --- | --- | --- | --- | --- |
| keyword | Fast | Moderate | 0 | Default; low-latency agents |
| llm | Slow | High | 1 per encode | High-fidelity agents with budget |
| hybrid | Medium | High | Periodic | Best balance; keyword first, LLM re-classification during consolidation |

Detected features (ContentFeatures): hasNovelty, hasProcedure, hasEmotion, hasSocialContent, hasCooperation, hasEthicalContent, hasContradiction, topicRelevance.

Configure via featureDetectionStrategy in CognitiveMemoryConfig.


Forgetting & Decay

Source: src/memory/core/decay/DecayModel.ts

Ebbinghaus Forgetting Curve

Memory strength decays exponentially over time:

S(t) = S₀ * e^(-dt / stability)

where:
S₀ = initial encoding strength
dt = time elapsed since last access (ms)
stability = time constant (ms); grows with each retrieval
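
A sketch of the decay computation using the trace's decay fields; the real implementation is computeCurrentStrength in DecayModel.ts:

// Strength at read time: exponential decay since last access.
function currentStrength(
  trace: { encodingStrength: number; stability: number; lastAccessedAt: number },
  now: number = Date.now(),
): number {
  const dt = now - trace.lastAccessedAt; // ms elapsed
  return trace.encodingStrength * Math.exp(-dt / trace.stability);
}
// A trace exactly one stability-constant old has decayed to 1/e ≈ 37% of S₀.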

Spaced Repetition

Each successful retrieval updates the trace via the desirable difficulty effect:

  • Difficulty bonus: max(0.1, 1 - currentStrength) — weaker memories get larger stability boosts
  • Diminishing returns: 1 / (1 + 0.1 * retrievalCount) — logarithmic saturation
  • Emotional bonus: 1 + intensity * 0.3 — emotional memories consolidate faster
  • Growth factor: (1.5 + difficultyBonus * 2.0) * diminish * emotionalBonus
  • Interval doubling: reinforcementInterval *= 2 after each retrieval
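
The factors above combine multiplicatively. A sketch, assuming the growth factor scales stability directly (names are illustrative; updateOnRetrieval in DecayModel.ts is the real implementation):

function reinforce(
  trace: { stability: number; retrievalCount: number; reinforcementInterval: number },
  currentStrength: number,
  emotionalIntensity: number,
): void {
  const difficultyBonus = Math.max(0.1, 1 - currentStrength); // weaker memories gain more
  const diminish = 1 / (1 + 0.1 * trace.retrievalCount);      // saturates with repeated recalls
  const emotionalBonus = 1 + emotionalIntensity * 0.3;        // emotional memories consolidate faster
  const growth = (1.5 + difficultyBonus * 2.0) * diminish * emotionalBonus;
  trace.stability *= growth;
  trace.retrievalCount += 1;
  trace.reinforcementInterval *= 2; // interval doubling
}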

Interference

When a new trace overlaps with existing traces (cosine similarity > threshold, default 0.7):

  • Retroactive interference: New trace weakens old similar traces (strength reduction ~0.15 at similarity 1.0)
  • Proactive interference: Old traces impair new encoding (capped at 0.3 total reduction)
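
A sketch of both directions, assuming linear scaling between the similarity threshold and 1.0 (the documented endpoints); the exact curve lives in computeInterference:

const INTERFERENCE_THRESHOLD = 0.7;

// Retroactive: a new trace weakens old similar traces (~0.15 at similarity 1.0).
function retroactivePenalty(similarity: number): number {
  if (similarity <= INTERFERENCE_THRESHOLD) return 0;
  return 0.15 * (similarity - INTERFERENCE_THRESHOLD) / (1 - INTERFERENCE_THRESHOLD);
}

// Proactive: existing similar traces impair the new encoding, capped at 0.3 total.
function proactivePenalty(similarities: number[]): number {
  const total = similarities.reduce((sum, s) => sum + retroactivePenalty(s), 0);
  return Math.min(0.3, total);
}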

Pruning

Traces with currentStrength < pruningThreshold (default: 0.05) are soft-deleted during consolidation, unless their emotional intensity exceeds 0.3 (emotional memories are protected from pruning).

Lifecycle note: these retention/decay sweeps are now operational on the built-in vector stores that implement scanByMetadata(). Adapters without metadata-scan support still need provider-specific work before they can participate fully in lifecycle enforcement.


Retrieval Priority Scoring

Source: src/memory/core/decay/RetrievalPriorityScorer.ts

Retrieval combines six signals into a composite score:

| Signal | Weight | Range | Computation |
| --- | --- | --- | --- |
| strength | 0.25 | 0-1 | S₀ * e^(-dt / stability) |
| similarity | 0.35 | 0-1 | Cosine similarity from vector search |
| recency | 0.10 | 0-1 | e^(-elapsed / halfLife) / 0.2 (normalised) |
| emotionalCongruence | 0.15 | 0-1 | max(0, moodValence * traceValence) / 0.25 (normalised) |
| graphActivation | 0.10 | 0-1 | Spreading activation score (0 without graph) |
| importance | 0.05 | 0-1 | confidence * 0.5 + 0.5 |

Composite score:

score = clamp(0, 1,
  w_str   * strengthScore +
  w_sim   * similarityScore +
  w_rec   * recencyNorm +
  w_emo   * emotionalNorm +
  w_graph * graphActivation +
  w_imp   * importanceScore
)

Setting neutralMood: true in retrieval options disables emotional congruence bias (useful for factual lookups).
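
A sketch of the composite with the documented default weights (they sum to 1.0). Modeling neutralMood as zeroing the congruence signal is an assumption; scoreAndRankTraces is the real implementation:

const WEIGHTS = { strength: 0.25, similarity: 0.35, recency: 0.10, emotion: 0.15, graph: 0.10, importance: 0.05 };

interface Signals { strength: number; similarity: number; recency: number; emotion: number; graph: number; importance: number }

function compositeScore(s: Signals, neutralMood = false): number {
  const emotion = neutralMood ? 0 : s.emotion; // neutralMood disables congruence bias
  const raw =
    WEIGHTS.strength * s.strength + WEIGHTS.similarity * s.similarity +
    WEIGHTS.recency * s.recency + WEIGHTS.emotion * emotion +
    WEIGHTS.graph * s.graph + WEIGHTS.importance * s.importance;
  return Math.min(1, Math.max(0, raw)); // clamp(0, 1, ...)
}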

Tip-of-the-Tongue Detection

Traces with high vector similarity (>0.6) but low strength (<0.3) or low confidence (<0.4) are returned as PartiallyRetrievedTrace — the agent "almost" remembers them. These include suggestedCues (tags) to help the user provide more context.
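
The detection predicate, restated in code (thresholds are the documented defaults):

// "Almost remembered": strong cue match, weak or untrusted trace.
function isPartiallyRetrieved(similarity: number, strength: number, confidence: number): boolean {
  return similarity > 0.6 && (strength < 0.3 || confidence < 0.4);
}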


Working Memory (Baddeley's Model)

Source: src/memory/core/working/CognitiveWorkingMemory.ts

Working memory is a slot-based, capacity-limited buffer that tracks what the agent is currently "thinking about."

Capacity

Base capacity follows Miller's number (7), modulated by personality:

  • High openness (>0.6): +1 slot (broader attention span)
  • High conscientiousness (>0.6): -1 slot (deeper focus per item)
  • Result clamped to [5, 9] (Miller's 7 plus/minus 2)

Slot Mechanics

Each WorkingMemorySlot tracks:

| Field | Range | Purpose |
| --- | --- | --- |
| activationLevel | 0-1 | How "in focus" this item is |
| attentionWeight | 0-1 | Proportional share of attention (normalised) |
| rehearsalCount | 0+ | Maintenance rehearsal bumps (+0.15 per rehearse) |
| enteredAt | Unix ms | When the trace entered working memory |

Activation Lifecycle

  1. Focus: New trace enters at initialActivation (default 0.8). If at capacity, lowest-activation slot is evicted first.
  2. Rehearsal: rehearse(slotId) bumps activation by 0.15 (capped at 1.0).
  3. Decay: Each turn, all activations decrease by activationDecayRate (default 0.1).
  4. Eviction: Slots below minActivation (default 0.15) are evicted. The onEvict callback can encode evicted items back to long-term memory.
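
A minimal sketch of this lifecycle under the documented defaults; the class shape is illustrative, not CognitiveWorkingMemory's actual API:

interface Slot { traceId: string; activation: number }

class SlotBuffer {
  constructor(private capacity: number, private slots: Slot[] = []) {}

  focus(traceId: string): void {
    if (this.slots.length >= this.capacity) {
      // At capacity: evict the lowest-activation slot first
      this.slots.sort((a, b) => a.activation - b.activation);
      this.slots.shift();
    }
    this.slots.push({ traceId, activation: 0.8 }); // initialActivation
  }

  rehearse(traceId: string): void {
    const slot = this.slots.find((s) => s.traceId === traceId);
    if (slot) slot.activation = Math.min(1.0, slot.activation + 0.15);
  }

  tick(onEvict: (traceId: string) => void): void {
    for (const s of this.slots) s.activation -= 0.1;                 // activationDecayRate
    const evicted = this.slots.filter((s) => s.activation < 0.15);   // minActivation
    this.slots = this.slots.filter((s) => s.activation >= 0.15);
    evicted.forEach((s) => onEvict(s.traceId)); // re-encode to long-term memory
  }
}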

Prompt Formatting

formatForPrompt() outputs slots sorted by activation:

- [ACTIVE] mt_1234 (activation: 0.85)
- [fading] mt_1235 (activation: 0.52)
- [weak] mt_1236 (activation: 0.20)

Memory Store

Source: src/memory/retrieval/store/MemoryStore.ts

The MemoryStore wraps IVectorStore + IKnowledgeGraph into a unified persistence layer:

  • Store: Embeds content via IEmbeddingManager, upserts into vector store, records as episodic memory in knowledge graph
  • Query: Vector search -> decay-aware scoring -> tip-of-the-tongue detection
  • Access tracking: Updates spaced repetition parameters on each retrieval
  • Soft delete: Sets isActive = false without removing from store

Collection Naming

Collections follow the pattern {prefix}_{scope}_{scopeId}:

cogmem_user_agent-123
cogmem_thread_conv-456
cogmem_persona_helper-bot
cogmem_organization_acme-org

Memory Graph

Source: src/memory/retrieval/graph/IMemoryGraph.ts

The IMemoryGraph interface abstracts over two backends:

| Backend | Implementation | Use Case |
| --- | --- | --- |
| graphology | GraphologyMemoryGraph | Dev/testing, in-memory, fast |
| knowledge-graph | KnowledgeGraphMemoryGraph | Production, wraps IKnowledgeGraph |

Configure via graph.backend (default: 'knowledge-graph').

Edge Types

| Edge Type | Meaning | Weight |
| --- | --- | --- |
| SHARED_ENTITY | Traces mention the same entity | 0.5 |
| TEMPORAL_SEQUENCE | Traces created within 5 minutes | 0.3 |
| SAME_TOPIC | Traces share topic cluster | varies |
| CONTRADICTS | Traces contain conflicting information | varies |
| SUPERSEDES | One trace replaces another | varies |
| CAUSED_BY | Causal relationship | varies |
| CO_ACTIVATED | Traces retrieved together (Hebbian) | grows |
| SCHEMA_INSTANCE | Episodic trace is instance of semantic schema | 0.6 |

Spreading Activation

Source: src/memory/retrieval/graph/SpreadingActivation.ts

Implements Anderson's ACT-R spreading activation model. Given seed nodes (top retrieval results), activation spreads through the graph to surface associated memories.

Algorithm (BFS)

  1. Seed nodes start at activation = 1.0
  2. Each hop: neighbor_activation = current * edge_weight * decayPerHop
  3. Multi-path summation (capped at 1.0) — traces reachable by multiple paths get boosted
  4. BFS with maxDepth (default 3) and activationThreshold (default 0.1) cutoffs
  5. Results sorted by activation descending, capped at maxResults (default 20)
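
A self-contained sketch of the algorithm under the documented defaults; the real spreadActivation operates on IMemoryGraph rather than a neighbor callback:

interface Edge { to: string; weight: number }

function spread(
  seeds: string[],
  neighbors: (id: string) => Edge[],
  { maxDepth = 3, decayPerHop = 0.5, activationThreshold = 0.1, maxResults = 20 } = {},
): Array<[string, number]> {
  const activation = new Map<string, number>();
  for (const id of seeds) activation.set(id, 1.0); // seeds start at 1.0
  let frontier = [...seeds];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      const current = activation.get(id) ?? 0;
      for (const edge of neighbors(id)) {
        const incoming = current * edge.weight * decayPerHop;
        if (incoming < activationThreshold) continue; // cutoff
        const seen = activation.has(edge.to);
        // Multi-path summation, capped at 1.0
        activation.set(edge.to, Math.min(1.0, (activation.get(edge.to) ?? 0) + incoming));
        if (!seen) next.push(edge.to);
      }
    }
    frontier = next;
  }
  // Sort by activation descending, cap at maxResults
  return [...activation.entries()].sort((a, b) => b[1] - a[1]).slice(0, maxResults);
}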

Configuration

| Parameter | Default | Effect |
| --- | --- | --- |
| maxDepth | 3 | Maximum hops from seed nodes |
| decayPerHop | 0.5 | Activation multiplier per hop |
| activationThreshold | 0.1 | Minimum activation to continue |
| maxResults | 20 | Maximum activated nodes returned |

Hebbian Learning

After retrieval, co-retrieved memories are recorded via recordCoActivation(). This strengthens CO_ACTIVATED edges between memories that are frequently retrieved together, implementing the Hebbian rule: "neurons that fire together wire together."

The learning rate (default 0.1) controls how quickly edge weights grow.
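
One plausible form of the update — an assumption, since only the rate is documented — moves the edge weight asymptotically toward 1.0, so frequently co-retrieved pairs saturate rather than grow without bound:

// Hebbian edge strengthening: step toward 1.0 at the configured rate.
function hebbianUpdate(edgeWeight: number, learningRate = 0.1): number {
  return Math.min(1.0, edgeWeight + learningRate * (1 - edgeWeight));
}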


Observer/Reflector System

Memory Observer

Source: src/memory/pipeline/observation/MemoryObserver.ts

The observer monitors accumulated conversation tokens via a buffer. When the threshold is reached (default: 30,000 tokens), it extracts concise observation notes via a persona-configured LLM.

Personality bias in observation:

| High Trait | Observer Focus |
| --- | --- |
| Emotionality | Emotional shifts, tone changes, sentiment transitions |
| Conscientiousness | Commitments, deadlines, action items, structured plans |
| Openness | Creative tangents, novel ideas, exploratory topics |
| Agreeableness | User preferences, rapport cues, communication style |
| Honesty | Corrections, retractions, contradictions |

Observation notes are typed: factual, emotional, commitment, preference, creative, correction.

Memory Reflector

Source: src/memory/pipeline/observation/MemoryReflector.ts

The reflector consolidates accumulated observation notes into long-term memory traces. Activates when note tokens exceed threshold (default: 40,000 tokens).

Pipeline:

  1. Merge redundant observations
  2. Elevate important facts to long-term traces
  3. Detect conflicts against existing memories
  4. Resolve conflicts based on personality:
    • High honesty: prefer newer information, supersede old
    • High agreeableness: keep both versions, note discrepancy
    • Default: prefer higher confidence

Target compression: 5-40x (many observations -> few high-quality traces).

Personality also controls memory style:

  • High conscientiousness: structured, well-organized traces
  • High openness: rich, associative traces with connections
  • Default: concise, factual traces

Prospective Memory

Source: src/memory/retrieval/prospective/ProspectiveMemoryManager.ts

Prospective memory handles future intentions — "remember to do X when Y happens."

Trigger Types

| Type | Fires When | Example |
| --- | --- | --- |
| time_based | Current time >= triggerAt | "Remind me at 3pm" |
| event_based | Named event in context.events | "When user mentions deployment" |
| context_based | Query embedding similarity > threshold | "When we discuss pricing" |

Registration

await manager.register({
  content: 'Remind user about the PR review',
  triggerType: 'time_based',
  triggerAt: Date.now() + 3_600_000, // 1 hour
  importance: 0.8,
  recurring: false,
});

Checking

Checked each turn before prompt construction. Triggered items are injected into the "Reminders" section of the assembled memory context. Items can be recurring (re-trigger) or one-shot (marked triggered after firing).

Context-based triggers use cosine similarity between the cue embedding and the current query embedding, with a configurable similarityThreshold (default 0.7).
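
A sketch of that check (the cosine helper is ours; the manager computes this internally):

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function shouldFire(cueEmbedding: number[], queryEmbedding: number[], similarityThreshold = 0.7): boolean {
  return cosine(cueEmbedding, queryEmbedding) > similarityThreshold;
}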


Consolidation Pipeline

Source: src/memory/pipeline/consolidation/ConsolidationPipeline.ts

Runs periodically (default: every hour) to maintain memory health. Five steps:

Step 1: Decay Sweep

Apply Ebbinghaus curve to all traces, soft-delete those below pruningThreshold (default 0.05). Emotional memories (intensity > 0.3) are protected.

Step 2: Co-Activation Replay

Process recent traces (last 24 hours) to create graph edges:

  • SHARED_ENTITY: Traces mentioning the same entity get connected (weight 0.5)
  • TEMPORAL_SEQUENCE: Traces created within 5 minutes get connected (weight 0.3)

Step 3: Schema Integration

Use detectClusters() on the memory graph (minimum cluster size: 5). For each cluster, invoke an LLM to summarize member traces into a single semantic knowledge node. Connect via SCHEMA_INSTANCE edges.

Step 4: Conflict Resolution

Scan CONTRADICTS edges and resolve based on personality:

  • High honesty (>0.6): Prefer newer information, soft-delete the older trace
  • Default: Prefer higher confidence (only if confidence difference >0.2)

Step 5: Spaced Repetition

Find traces past their nextReinforcementAt timestamp and boost them via recordAccess(), which increases stability and doubles the reinforcement interval.

Result

interface ConsolidationResult {
  prunedCount: number;       // Traces soft-deleted
  edgesCreated: number;      // Graph edges created
  schemasCreated: number;    // Semantic schemas from clusters
  conflictsResolved: number; // Contradictions resolved
  reinforcedCount: number;   // Traces reinforced
  totalProcessed: number;    // Total traces examined
  durationMs: number;        // Cycle duration
}

Prompt Assembly

Source: src/memory/core/prompt/MemoryPromptAssembler.ts

Assembles memory context into a single formatted string within a token budget, split across six sections with overflow redistribution.

Default Budget Allocation

| Section | Budget % | Content |
| --- | --- | --- |
| Working Memory | 15% | Active context from slot buffer |
| Semantic Recall | 45% | Retrieved semantic/procedural traces |
| Recent Episodic | 25% | Retrieved episodic traces |
| Prospective Alerts | 5% | Triggered reminders (Batch 2) |
| Graph Associations | 5% | Spreading activation context (Batch 2) |
| Observation Notes | 5% | Recent observer notes (Batch 2) |

Overflow Redistribution

If a section uses less than its budget, the overflow flows to Semantic Recall. If Batch 2 sections are empty (no observer, no graph, no prospective items), their budgets are also redistributed to Semantic Recall.
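
A sketch of the redistribution policy; the section keys and arithmetic are illustrative, not the assembler's actual code:

type Section = 'working' | 'semantic' | 'episodic' | 'prospective' | 'graph' | 'observations';

// Default shares from the budget table above
const SHARES: Record<Section, number> = {
  working: 0.15, semantic: 0.45, episodic: 0.25,
  prospective: 0.05, graph: 0.05, observations: 0.05,
};

function allocate(totalBudget: number, usedTokens: Record<Section, number>): Record<Section, number> {
  const alloc = {} as Record<Section, number>;
  let overflow = 0;
  for (const s of Object.keys(SHARES) as Section[]) {
    const budget = Math.floor(totalBudget * SHARES[s]);
    if (s === 'semantic') { alloc[s] = budget; continue; } // topped up below
    alloc[s] = Math.min(budget, usedTokens[s]);            // an empty section takes 0
    overflow += budget - alloc[s];
  }
  alloc.semantic += overflow; // all unused budget flows to Semantic Recall
  return alloc;
}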

Personality -> Formatting Style

The assembler selects a formatting style based on the dominant HEXACO trait:

| Dominant Trait | Style | Output |
| --- | --- | --- |
| Conscientiousness | structured | Bullet points, categories |
| Openness | narrative | Flowing prose, connections |
| Emotionality | emotional | Emphasis on feelings, tone |

Output Sections

## Active Context
- [ACTIVE] mt_1234 (activation: 0.85)

## Relevant Memories
- [semantic, score=0.82] User prefers TypeScript...

## Recent Experiences
- [episodic, score=0.71] Discussed deployment on Tuesday...

## Reminders
- [time_based] PR review is due

## Related Context
- [associated, activation=0.45] Related discussion about CI/CD...

## Observations
- User tends to ask follow-up questions about error handling

Token estimation uses a heuristic of roughly 4 characters per token.


Configuration

CognitiveMemoryConfig (Top-Level)

interface CognitiveMemoryConfig {
  // --- Required dependencies ---
  workingMemory: IWorkingMemory;       // Existing AgentOS working memory
  knowledgeGraph: IKnowledgeGraph;     // Existing AgentOS knowledge graph
  vectorStore: IVectorStore;           // Vector store for embeddings
  embeddingManager: IEmbeddingManager; // Embedding generation

  // --- Agent identity ---
  agentId: string;
  traits: HexacoTraits; // { honesty, emotionality, extraversion, agreeableness, conscientiousness, openness }
  moodProvider: () => PADState; // Callback to get current mood

  // --- Feature detection ---
  featureDetectionStrategy: 'keyword' | 'llm' | 'hybrid'; // Default: 'keyword'
  featureDetectionLlmInvoker?: (systemPrompt: string, userPrompt: string) => Promise<string>;

  // --- Tuning ---
  encoding?: Partial<EncodingConfig>; // See defaults below
  decay?: Partial<DecayConfig>;       // See defaults below
  workingMemoryCapacity?: number;     // Default: 7 (Miller's number)
  tokenBudget?: Partial<MemoryBudgetAllocation>;
  collectionPrefix?: string;          // Default: 'cogmem'

  // --- Batch 2 (optional, no-op when absent) ---
  observer?: Partial<ObserverConfig>;
  reflector?: Partial<ReflectorConfig>;
  graph?: Partial<MemoryGraphConfig>;
  consolidation?: Partial<ConsolidationConfig>;
}

Encoding Defaults

| Parameter | Default | Description |
| --- | --- | --- |
| baseStrength | 0.5 | Base encoding strength before modulation |
| flashbulbThreshold | 0.8 | Emotional intensity threshold for flashbulb |
| flashbulbStrengthMultiplier | 2.0 | Strength boost for flashbulb memories |
| flashbulbStabilityMultiplier | 5.0 | Stability boost for flashbulb memories |
| baseStabilityMs | 3,600,000 | Base stability (1 hour) |

Decay Defaults

| Parameter | Default | Description |
| --- | --- | --- |
| pruningThreshold | 0.05 | Strength below which traces are pruned |
| recencyHalfLifeMs | 86,400,000 | Recency boost half-life (24 hours) |
| interferenceThreshold | 0.7 | Cosine similarity threshold for interference |

Graph Defaults

| Parameter | Default | Description |
| --- | --- | --- |
| backend | 'knowledge-graph' | Graph backend selection |
| maxDepth | 3 | Spreading activation max hops |
| decayPerHop | 0.5 | Activation decay per hop |
| activationThreshold | 0.1 | Minimum activation to continue |
| hebbianLearningRate | 0.1 | Co-activation edge strengthening rate |

Consolidation Defaults

| Parameter | Default | Description |
| --- | --- | --- |
| intervalMs | 3,600,000 | Run interval (1 hour) |
| maxTracesPerCycle | 500 | Max traces per cycle |
| mergeSimilarityThreshold | 0.92 | Similarity threshold for merging |
| minClusterSize | 5 | Min cluster size for schema integration |

Quick Start

Minimal setup with core features (no LLM calls, no Batch 2):

import { CognitiveMemoryManager } from '@framers/agentos/memory';

const memory = new CognitiveMemoryManager();

await memory.initialize({
  workingMemory: existingWorkingMemory,
  knowledgeGraph: existingKnowledgeGraph,
  vectorStore: existingVectorStore,
  embeddingManager: existingEmbeddingManager,
  agentId: 'my-agent',
  traits: { openness: 0.7, conscientiousness: 0.8, emotionality: 0.5 },
  moodProvider: () => ({ valence: 0, arousal: 0.3, dominance: 0 }),
  featureDetectionStrategy: 'keyword',
});

// Encode a user message
const mood = { valence: 0.2, arousal: 0.4, dominance: 0 };
const trace = await memory.encode(
  'I prefer deploying with Docker Compose',
  mood,
  'content',
  { type: 'semantic', scope: 'user', tags: ['deployment', 'docker'] },
);

// Retrieve relevant memories before prompt construction
const result = await memory.retrieve('How should I deploy?', mood, { topK: 5 });

// Assemble for prompt injection (1000 token budget)
const context = await memory.assembleForPrompt('How should I deploy?', 1000, mood);
console.log(context.contextText); // Formatted memory context
console.log(context.tokensUsed);  // Actual tokens used

Full setup with all Batch 2 features:

const llmInvoker = async (system: string, user: string) => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: system },
      { role: 'user', content: user },
    ],
  });
  return response.choices[0].message.content ?? '';
};

await memory.initialize({
  // ... core config as above ...
  observer: { activationThresholdTokens: 30_000, llmInvoker },
  reflector: { activationThresholdTokens: 40_000, llmInvoker },
  graph: { backend: 'knowledge-graph', maxDepth: 3, decayPerHop: 0.5 },
  consolidation: { intervalMs: 3_600_000, minClusterSize: 5 },
});

// Observer: feed each message
await memory.observe('user', 'I need to deploy by Friday', mood);
await memory.observe('assistant', 'I can help with that deployment.', mood);

// Prospective: register a reminder
const pm = memory.getProspective();
await pm.register({
  content: 'User needs deployment done by Friday',
  triggerType: 'time_based',
  triggerAt: fridayTimestamp,
  importance: 0.9,
  recurring: false,
});

// Consolidation runs automatically on timer, or manually:
const result = await memory.runConsolidation();
console.log(`Pruned ${result.prunedCount}, created ${result.schemasCreated} schemas`);

Integration with GMI

The Cognitive Memory System integrates into the GMI turn loop at three points:

After User Message (Encode)

// In the GMI turn handler, after receiving user input:
const mood = moodEngine.getCurrentState();
await cognitiveMemory.encode(userMessage, mood, gmiMood, {
  type: 'episodic',
  scope: 'user',
  scopeId: userId,
  sourceType: 'user_statement',
});

Before Prompt Construction (Retrieve + Assemble)

// Before building the system prompt:
const memoryContext = await cognitiveMemory.assembleForPrompt(
  userMessage,
  tokenBudget,
  mood,
);
// Inject memoryContext.contextText into the prompt via PromptBuilder

After Response (Observe)

// After the LLM generates a response:
await cognitiveMemory.observe('assistant', assistantResponse, mood);

// Also feed user messages to the observer for conversation monitoring:
await cognitiveMemory.observe('user', userMessage, mood);

Comparison with Mastra

The Cognitive Memory System addresses 12 limitations in Mastra's memory architecture:

| # | Mastra Limitation | AgentOS Improvement |
| --- | --- | --- |
| 1 | Flat strength (all memories equal) | HEXACO-modulated encoding strength with Yerkes-Dodson arousal curve |
| 2 | No forgetting | Ebbinghaus exponential decay with configurable stability |
| 3 | No spaced repetition | Desirable difficulty effect with interval doubling |
| 4 | No working memory limits | Baddeley's model with personality-modulated capacity (5-9 slots) |
| 5 | No emotional context | PAD model snapshot at encoding, mood-congruent retrieval bias |
| 6 | Single retrieval signal (similarity) | 6-signal composite scoring (strength, similarity, recency, emotion, graph, importance) |
| 7 | No memory graph | IMemoryGraph with 8 edge types and spreading activation |
| 8 | No interference modeling | Proactive and retroactive interference with configurable thresholds |
| 9 | No consolidation | 5-step pipeline: decay sweep, replay, schema integration, conflict resolution, reinforcement |
| 10 | No prospective memory | Time, event, and context-based triggers with recurring support |
| 11 | No observer/reflector | Personality-biased observation + LLM-driven consolidation into traces |
| 12 | No provenance tracking | Full source monitoring with confidence, verification count, and contradiction detection |

Source Files

All source lives in packages/agentos/src/memory/:

| File | Export |
| --- | --- |
| types.ts | All types: MemoryTrace, MemoryType, MemoryScope, ScoredMemoryTrace, etc. |
| config.ts | CognitiveMemoryConfig, EncodingConfig, DecayConfig, defaults |
| CognitiveMemoryManager.ts | CognitiveMemoryManager, ICognitiveMemoryManager |
| encoding/EncodingModel.ts | computeEncodingStrength, yerksDodson, buildEmotionalContext |
| encoding/ContentFeatureDetector.ts | createFeatureDetector, IContentFeatureDetector |
| decay/DecayModel.ts | computeCurrentStrength, updateOnRetrieval, computeInterference |
| decay/RetrievalPriorityScorer.ts | scoreAndRankTraces, detectPartiallyRetrieved |
| working/CognitiveWorkingMemory.ts | CognitiveWorkingMemory |
| store/MemoryStore.ts | MemoryStore |
| prompt/MemoryPromptAssembler.ts | assembleMemoryContext |
| prompt/MemoryFormatters.ts | formatMemoryTrace, FormattingStyle |
| graph/IMemoryGraph.ts | IMemoryGraph, MemoryEdgeType, ActivatedNode |
| graph/SpreadingActivation.ts | spreadActivation |
| graph/GraphologyMemoryGraph.ts | GraphologyMemoryGraph |
| graph/KnowledgeGraphMemoryGraph.ts | KnowledgeGraphMemoryGraph |
| observation/MemoryObserver.ts | MemoryObserver, ObservationNote |
| observation/MemoryReflector.ts | MemoryReflector, MemoryReflectionResult |
| observation/ObservationBuffer.ts | ObservationBuffer |
| prospective/ProspectiveMemoryManager.ts | ProspectiveMemoryManager, ProspectiveMemoryItem |
| consolidation/ConsolidationPipeline.ts | ConsolidationPipeline, ConsolidationResult |

Relationship to Persistent Working Memory

AgentOS provides two complementary working memory systems:

| | Baddeley Cognitive Working Memory | Persistent Markdown Working Memory |
| --- | --- | --- |
| Purpose | In-session attention modeling | Cross-session user context |
| Lifespan | Single session (in-memory) | Persists on disk (~/.agentos/agents/{id}/working-memory.md) |
| Updates | Automatic activation decay | Agent calls update_working_memory tool |
| Format | Capacity-limited slots (7±2) | Free-form markdown template |
| Budget | 15% of prompt tokens | 5% of prompt tokens |

Both are injected into the system prompt simultaneously. The persistent memory appears as ## Persistent Memory before the cognitive slots. See Persistent Working Memory for details.


References

The runtime constants, formulas, weights, and design decisions in this page are grounded in the cognitive-science and information-retrieval literature listed below. Citations are inline throughout the doc; this section consolidates them for review and audit.

Cognitive science foundations

  • Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). Academic Press. — Multi-store memory model. Wikipedia summary
  • Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–89). Academic Press. — Working memory model with slot-based capacity. ScienceDirect
  • Baddeley, A. D. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience, 4(10), 829–839. — Updated synthesis. DOI
  • Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). Academic Press. — LTM taxonomy (episodic / semantic / procedural). APA PsycNet
  • Ebbinghaus, H. (1885). Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie (English: Memory: A Contribution to Experimental Psychology, 1913 trans. Ruger & Bussenius). Duncker & Humblot. — The original forgetting curve S(t) = S₀ · e^(-Δt / stability). Project Gutenberg (1913 trans.)
  • Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18(5), 459–482. — Inverted-U arousal curve. Wiley
  • Brown, R., & Kulik, J. (1977). Flashbulb memories. Cognition, 5(1), 73–99. — Flashbulb memory phenomenon. APA PsycNet
  • Bower, G. H. (1981). Mood and memory. American Psychologist, 36(2), 129–148. — Mood-congruent encoding. APA DOI
  • Anderson, J. R. (1983). A spreading activation theory of memory. Journal of Verbal Learning and Verbal Behavior, 22(3), 261–295. — ACT-R spreading activation. APA PsycNet · ACT-R home
  • Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. Wiley. — "Cells that fire together, wire together." Wikipedia summary
  • Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114(1), 3–28. — Source-monitoring framework underpinning the per-source decay multipliers. APA PsycNet

Personality structure

  • Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11(2), 150–166. — HEXACO six-factor model. SAGE Journals

Retrieval-augmented generation

  • Gao, L., Ma, X., Lin, J., & Callan, J. (2022). Precise zero-shot dense retrieval without relevance labels. arXiv preprint. — HyDE retrieval. arXiv:2212.10496
  • Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A., Truitt, S., & Larson, J. (2024). From local to global: A graph RAG approach to query-focused summarization. arXiv preprint. — Microsoft GraphRAG. arXiv:2404.16130

Cognitive architectures for language agents

  • Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. arXiv preprint. — Smallville generative agents — the canonical "persona + memory + reflection" demo. arXiv:2304.03442
  • Sumers, T. R., Yao, S., Narasimhan, K., & Griffiths, T. L. (2023). Cognitive architectures for language agents. arXiv preprint. — CoALA framework that AgentOS's memory taxonomy follows. arXiv:2309.02427

Benchmarks

  • Wu, D., Wang, J., Hu, P., et al. (2024). LongMemEval: Benchmarking chat assistants on long-term interactive memory. ICLR 2025. — The benchmark agentos-bench reports against. arXiv:2410.10813

Implementation references

Source files cited inline:

  • packages/agentos/src/memory/CognitiveMemoryManager.ts — top-level orchestrator
  • packages/agentos/src/memory/core/decay/DecayModel.ts — Ebbinghaus formula + spaced repetition
  • packages/agentos/src/memory/mechanisms/defaults.ts — eight cognitive mechanism defaults
  • packages/agentos/src/memory/retrieval/hyde/MemoryHydeRetriever.ts — HyDE retriever
  • packages/agentos/src/memory/retrieval/graph/graphrag/GraphRAGEngine.ts — GraphRAG implementation