Generalized Mind Instances (GMIs)
Most agent SDKs hand you a function call. You pass a system prompt and a list of tools, you get a string back. Close the connection. The agent doesn't exist anymore. The next call starts a new one that happens to share a name.
A Generalized Mind Instance — GMI — is the thing that exists between those calls. It owns a persona, a working memory, a cognitive memory layer that decays the way human memory decays, a sentiment tracker that follows the user's mood across turns, a metaprompt executor that assembles the system prompt fresh each turn from the current state, and a reasoning trace that keeps the last several hundred decision steps. When you call `agent({...})` you are constructing one of these. When you call `.session(id).send(...)` you are addressing it.
This page is an honest tour of the abstraction. Most descriptions of GMIs you'll see — including the concentric-ring diagram on agentos.sh — are presentation. The presentation is useful but it isn't the architecture. The architecture is a delegation pattern: a coordinator class with a dozen specialized collaborators, each owning one concern. Below is what's actually in the source tree at packages/agentos/src/cognitive_substrate/GMI.ts.
The shortest useful example
```ts
import { agent } from '@framers/agentos';

const analyst = agent({
  provider: 'anthropic',
  instructions: 'You are a thorough research analyst.',
  personality: {
    conscientiousness: 0.95,
    openness: 0.85,
    agreeableness: 0.7,
  },
  memory: { enabled: true, consolidation: true },
  guardrails: ['pii-redaction', 'grounding-guard'],
});

const session = analyst.session('research-q1');
const reply = await session.send(
  'Analyze Q1 market trends in AI infrastructure.'
);
console.log(reply.text);
```
Three things to notice:
- `agent()` is the constructor. The same factory builds a single chat companion or a multi-agent orchestrator — the difference is configuration, not class hierarchy.
- `personality` is HEXACO-shaped but plainly implemented. The six fields (`honesty`, `emotionality`, `extraversion`, `agreeableness`, `conscientiousness`, `openness`) are 0-to-1 scalars. The runtime encodes them as a human-readable trait string and appends it to the system prompt. There is no separate "personality model" running underneath. The cognitive memory mechanisms (covered below) read three of those values directly and modulate their behavior.
- `session()` is where state lives. Multiple sessions on the same agent maintain independent histories. Sessions are how a GMI talks to two users at once without cross-contamination, as the sketch below shows.
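A minimal sketch of that last point, reusing the `analyst` agent from the example above (the session IDs and messages are arbitrary):

```ts
// Two sessions on the same GMI: each keeps its own turn buffer.
const alice = analyst.session('user-alice');
const bob = analyst.session('user-bob');

await alice.send('My Q1 budget cap is $2M. Keep that in mind.');
const answer = await bob.send('What budget constraints are we working with?');
// Bob's session never saw Alice's turn, so the GMI cannot leak
// her $2M figure into this reply.
```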
What a GMI is composed of
The class definition tells the cleanest story. From GMI.ts (trimmed):
```ts
export class GMI implements IGMI {
  public readonly gmiId: string;
  public readonly creationTimestamp: Date;

  // Injected dependencies
  private workingMemory!: IWorkingMemory;
  private promptEngine!: IPromptEngine;
  private retrievalAugmentor?: IRetrievalAugmentor;
  private toolOrchestrator!: IToolOrchestrator;
  private llmProviderManager!: AIModelProviderManager;
  private utilityAI!: IUtilityAI;
  private cognitiveMemory?: ICognitiveMemoryManager;

  // State
  private state: GMIPrimeState;
  private currentGmiMood: GMIMood;
  private currentUserContext!: UserContext;
  private currentTaskContext!: TaskContext;
  private reasoningTrace: ReasoningTrace;

  // Collaborators
  private conversationHistoryManager: ConversationHistoryManager;
  private memoryBridge: CognitiveMemoryBridge | null = null;
  private sentimentTracker!: SentimentTracker;
  private metapromptExecutor!: MetapromptExecutor;
  // ...
}
```
Each name is doing one specific thing:
| Collaborator | What it owns |
|---|---|
| ConversationHistoryManager | The turn buffer for the active session. Compacts old turns when the window fills. |
| CognitiveMemoryBridge | The connection to long-term cognitive memory: encoding new traces, fetching old ones, applying decay. |
| SentimentTracker | Analyzes user sentiment per turn and fires GMIEvents when patterns cross thresholds — those events trigger event-based metaprompt updates. |
| MetapromptExecutor | Assembles the system prompt every turn from persona, traits, mood, retrieved memories, and active skills. |
| IPromptEngine | Interpolates messages and tool schemas into the final wire-format payload for the LLM. |
| IToolOrchestrator | Decides which tools to expose this turn, runs them, returns results. |
| IRetrievalAugmentor | RAG retrieval over corpora that aren't memory (docs, web search, etc.). |
| AIModelProviderManager | Routes the call to the configured provider, with fallback to others on failure. |
| IUtilityAI | Smaller model jobs that don't need the main provider — JSON parsing, summarization, observations. |
| ICognitiveMemoryManager | The actual memory store with the eight cognitive mechanisms (next section). |
Lifecycle is owned by GMIManager — it constructs GMIs, hands them their persona and config, tracks active instances by ID, and routes session-to-GMI mappings. When you build an agency of multiple GMIs, the manager is the registry that knows which mind owns which session.
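To make the delegation concrete, here is a rough sketch of how one turn might thread through those collaborators. Every method name and signature below is an assumption for illustration; only the division of responsibilities comes from the table above:

```ts
// Illustrative only: not the real GMI method body.
interface TurnDeps {
  sentimentTracker: { analyze(msg: string): void };
  cognitiveMemory: { retrieve(query: string): Promise<string[]> };
  metapromptExecutor: { assemble(ctx: { memories: string[] }): string };
  promptEngine: { build(system: string, user: string): unknown };
  llmProviderManager: { call(payload: unknown): Promise<{ text: string }> };
  conversationHistoryManager: { append(user: string, reply: string): void };
}

async function handleTurn(deps: TurnDeps, userMessage: string): Promise<string> {
  deps.sentimentTracker.analyze(userMessage);                        // may fire GMIEvents
  const memories = await deps.cognitiveMemory.retrieve(userMessage); // decay-aware recall
  const system = deps.metapromptExecutor.assemble({ memories });     // prompt assembled fresh
  const payload = deps.promptEngine.build(system, userMessage);      // wire-format payload
  const reply = await deps.llmProviderManager.call(payload);         // provider + fallback
  deps.conversationHistoryManager.append(userMessage, reply.text);   // update turn buffer
  return reply.text;
}
```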
The eight cognitive memory mechanisms
This is where the runtime stops looking like a thin wrapper around a chat API. From src/memory/mechanisms/defaults.ts, the eight mechanisms that operate on memory traces:
| Mechanism | What it does |
|---|---|
| Reconsolidation | Memories drift slightly each time they are recalled. The drift rate (default 0.05, capped at 0.4 per trace) is bounded so high-importance traces stay anchored. |
| Retrieval-induced forgetting | When a memory surfaces during retrieval, related-but-not-recalled memories get suppressed (similarity threshold 0.7, suppression 0.12, max 5 per query). Models the well-known psychological effect. |
| Involuntary recall | A small probability (default 0.08) that an old, related memory surfaces unprompted during a turn. Requires the trace to be at least 14 days old and above a minimum strength. |
| Metacognitive feeling-of-knowing | Surfaces "tip-of-the-tongue" partial activations: the GMI knows there's something relevant in memory even when it can't fully retrieve it. |
| Temporal gist | Old traces (60+ days, 2+ retrievals) collapse into compressed gist representations. Entities and emotional context are preserved; specific wording is not. |
| Schema encoding | New observations cluster against existing schema. Novel observations get a 1.3× encoding boost; congruent ones get a 0.85× discount. The runtime spends more strength on what surprises it. |
| Source-confidence decay | Different memory sources decay at different rates: a user statement holds at 1.0×, agent inference at 0.8×, reflection at 0.75×. The GMI trusts its own confabulations less over time than what the user explicitly said. |
| Emotion regulation | Reappraisal (rate 0.15) and suppression (above arousal 0.8) of emotionally loaded memories. Capped at 10 regulations per cycle so the GMI doesn't smooth out everything in one pass. |
All eight default to enabled. Pass `cognitiveMechanisms: {}` for the defaults, or override per mechanism (a hedged sketch follows below). Three of the mechanisms are HEXACO-modulated, reading the emotionality, conscientiousness, and openness trait values directly: a more conscientious GMI consolidates more aggressively, a more open one weighs novelty harder, a more emotional one allows more involuntary recall.
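A sketch of a per-mechanism override. The nested key names (`reconsolidation`, `driftRate`, `involuntaryRecall`, `probability`) are assumptions patterned on the defaults table, not confirmed exports from defaults.ts:

```ts
import { agent } from '@framers/agentos';

// Hypothetical override shape; only the cognitiveMechanisms option
// name itself is confirmed, the nested keys are assumptions.
const archivist = agent({
  provider: 'anthropic',
  instructions: 'You are a meticulous archivist.',
  memory: { enabled: true, consolidation: true },
  cognitiveMechanisms: {
    reconsolidation: { driftRate: 0.02 },   // drift less than the 0.05 default
    involuntaryRecall: { probability: 0 },  // never surface old memories unprompted
  },
});
```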
The Ebbinghaus decay curve sits underneath all of this as the base decay model. The mechanisms above shape what gets stored, what gets forgotten preferentially, and how confident the GMI is in what it remembers. The decay rate is what determines when.
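For reference, the Ebbinghaus curve models retention as exponential decay over elapsed time. A minimal sketch; the time unit and how the runtime parameterizes per-trace stability are not shown on this page:

```ts
// Classic Ebbinghaus retention: R = e^(-t / S), where t is time since
// encoding and S is the trace's stability (higher S = slower forgetting).
function retention(hoursSinceEncoding: number, stabilityHours: number): number {
  return Math.exp(-hoursSinceEncoding / stabilityHours);
}

retention(24, 48);  // ≈ 0.61 one day after encoding
retention(168, 48); // ≈ 0.03 one week after encoding
```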
Retrieval is layered, not just embedding similarity
When a GMI needs to remember something, it doesn't run a single nearest-neighbor query. From CognitiveMemoryManager.retrieve():
- (Optional) HyDE hypothesis. When `options.hyde` is on (or the active policy says always), `MemoryHydeRetriever` prompts an LLM to generate a plausible memory the GMI would have stored about the query. The hypothesis embedding is then used as the search vector, because it sits closer to actual stored traces than a raw query like "that deployment thing last week". The source comments explicitly tie this to the generation effect in cognitive science.
- Composite-scored vector query. Each candidate gets a weighted score combining current strength, embedding similarity, recency, emotional congruence with the user's mood, and importance. The default weights live in `CognitiveMemoryManager` and can be overridden per policy.
- Spreading activation over the graph. When a Neo4j graph backend is configured, the top-5 results seed a spreading-activation pass through `GraphRAGEngine`. Connected memories get a boost; the result set is re-sorted; co-activation is recorded for Hebbian-style learning so frequently-co-recalled memories link tighter over time.
- (Optional) neural reranking. When a Cohere or LLM-judge reranker is plugged in, the cognitive composite is blended 0.7 cognitive / 0.3 neural — preserving decay, mood, and graph signals while letting a cross-encoder catch what the bi-encoder missed.
This is the layer cake the GMI sits on top of. The point isn't "GraphRAG as a fallback when semantic search fails" — that's a marketing simplification. The point is that each retrieval is a composite query whose score blends multiple cognitive signals, and the graph and reranker enrich that composite when they're available.
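A sketch of the composite score from step two and the reranker blend from step four. The five signals and the 0.7/0.3 split come from this page; the weight values and field names are illustrative assumptions, since the real defaults live in CognitiveMemoryManager:

```ts
// Hypothetical candidate shape; field names are assumptions.
interface Candidate {
  strength: number;       // current trace strength after decay
  similarity: number;     // embedding similarity to the search vector
  recency: number;        // normalized, newer = higher
  moodCongruence: number; // emotional congruence with the user's mood
  importance: number;
}

// Illustrative weights only; the real defaults are per-policy overridable.
const w = { strength: 0.25, similarity: 0.35, recency: 0.15, moodCongruence: 0.1, importance: 0.15 };

const compositeScore = (c: Candidate): number =>
  w.strength * c.strength +
  w.similarity * c.similarity +
  w.recency * c.recency +
  w.moodCongruence * c.moodCongruence +
  w.importance * c.importance;

// When a neural reranker is plugged in, blend 0.7 cognitive / 0.3 neural.
const blended = (cognitive: number, neural: number) => 0.7 * cognitive + 0.3 * neural;
```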
Personality, in practice
HEXACO sounds heavier than it is. The personality config is six numbers, encoded as a paragraph appended to the system prompt. That paragraph is what the LLM reads. There is no neural network "personality module" running in parallel.
What makes the trait values load-bearing is that the cognitive memory mechanisms read them directly. A high-emotionality GMI has higher involuntary-recall probability. A high-conscientiousness GMI consolidates more eagerly. A high-openness GMI gets a steeper novelty boost during schema encoding.
So the "personality" is two things stacked:
- Surface behavior — how the GMI talks. This comes from the trait string in the prompt and is mediated entirely by the LLM's interpretation.
- Memory shape — what the GMI remembers and forgets, and how confidently. This is enforced in code, independent of the LLM.
The first is interpretation. The second is mechanism. Both matter, but they're not the same thing, and conflating them is how you end up with prompt-engineered "personalities" that vanish on a model swap.
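As a sketch of the first half of that stack, here is how six scalars might flatten into a prompt paragraph. The phrasing and thresholds are invented for illustration; the runtime's actual trait-string wording is not reproduced on this page:

```ts
// Illustrative encoder only, not the runtime's actual wording.
type Hexaco = {
  honesty: number; emotionality: number; extraversion: number;
  agreeableness: number; conscientiousness: number; openness: number;
};

function traitString(p: Partial<Hexaco>): string {
  const level = (v: number) =>
    v >= 0.8 ? 'very high' : v >= 0.6 ? 'high' : v >= 0.4 ? 'moderate' : 'low';
  return Object.entries(p)
    .map(([trait, v]) => `${trait}: ${level(v as number)}`)
    .join('; ');
}

traitString({ conscientiousness: 0.95, openness: 0.85, agreeableness: 0.7 });
// => "conscientiousness: very high; openness: very high; agreeableness: high"
// This string shapes surface behavior; the memory mechanisms read the
// raw numbers directly, so the second half survives a model swap.
```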
Multi-GMI: agency
A single GMI is a mind. An agency is a set of GMIs collaborating on a goal. Agency is in src/agents/agency/:
- `AgencyRegistry` — tracks active agencies and the GMIs they contain.
- `AgencyMemoryManager` — shared memory across the agency's GMIs (separate from each GMI's private cognitive memory).
- `AgentCommunicationBus` — the message channel GMIs use to coordinate.
Each GMI in an agency keeps its own persona, traits, and cognitive memory. The agency adds a coordination layer on top. When you write agency({...agents}), the runtime spins up the registry, wires up the communication bus, and lets the orchestration strategy (sequential, parallel, debate, hierarchical, review-loop, graph) decide who runs when.
When the strategy is `'hierarchical'` and `emergent.enabled` is true, the manager GMI also gets a `spawn_specialist` tool: it can synthesize a new specialist GMI mid-run when the static roster doesn't cover a sub-task. The synthesized GMI joins the live roster and becomes invokable as `delegate_to_<role>` on the manager's next turn. See Emergent Agency System for the spec, runtime sequence, and tested rejection paths.
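Wiring that together might look like the sketch below. Only `agency()`, the strategy names, and `emergent.enabled` appear on this page; the rest of the config shape is an assumption:

```ts
import { agent, agency } from '@framers/agentos';

const agents = {
  manager: agent({ provider: 'anthropic', instructions: 'Coordinate the team and delegate.' }),
  analyst: agent({ provider: 'anthropic', instructions: 'You are a thorough research analyst.' }),
};

// Hypothetical config shape beyond agency(), strategy, and emergent.
const team = agency({
  ...agents,
  strategy: 'hierarchical',    // manager delegates to specialists
  emergent: { enabled: true }, // manager may spawn_specialist mid-run
});
```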
Streaming output
session.send() returns a final reply. session.stream() returns an async iterable of typed chunks. The chunk types from IGMI.ts:
```ts
export enum GMIOutputChunkType {
  TEXT_DELTA,
  TOOL_CALL_REQUEST,
  REASONING_STATE_UPDATE,
  FINAL_RESPONSE_MARKER,
  ERROR,
  SYSTEM_MESSAGE,
  USAGE_UPDATE,
  LATENCY_REPORT,
  UI_COMMAND,
}
```
GMIChunkTransformer maps these into the public AgentOSResponseChunkType. If you're building a UI on top of a GMI, you wire reactions to these types: stream the text deltas as they arrive, render tool calls as they fire, surface reasoning state if you're showing the GMI's thinking, finalize on the response marker. Memory formation events surface separately through the memory bridge.
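A consumption sketch, reusing the `session` from the first example. `session.stream()` is from this page, but the chunk object's field names and the stubbed UI hooks are assumptions; check the public AgentOSResponseChunkType for the real shape:

```ts
// Hypothetical UI hooks, stubbed so the sketch is self-contained.
const renderToolCall = (chunk: unknown) => { /* show tool invocation */ };
const updateThinking = (chunk: unknown) => { /* surface reasoning state */ };
const finalizeMessage = () => { /* close the assistant bubble */ };

// Assumed chunk shape: { type: string; delta?: string; ... }
for await (const chunk of session.stream('Summarize the Q1 findings.')) {
  switch (chunk.type) {
    case 'TEXT_DELTA':
      process.stdout.write(chunk.delta ?? ''); // stream text as it arrives
      break;
    case 'TOOL_CALL_REQUEST':
      renderToolCall(chunk);
      break;
    case 'REASONING_STATE_UPDATE':
      updateThinking(chunk);
      break;
    case 'FINAL_RESPONSE_MARKER':
      finalizeMessage();
      break;
  }
}
```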
What the homepage diagram is and isn't
The seven-ring diagram on agentos.sh is a visualization of capabilities, not architecture. The rings — channels, guardrails, tools, orchestration, memory, personality, LLM core — are useful as a mental model: the outer ones are surface area, the inner ones are cognitive substrate. They map roughly to actual collaborators in the source, but not one-to-one. The diagram exists to make a marketing point that lands in three seconds. The class structure exists to make the runtime maintainable. They are doing different work.
If you came here looking for the seven layers as load-bearing architecture, you won't find them in the source. What you'll find is a delegation hub (the GMI class), a lifecycle manager (GMIManager), and the dozen specialized collaborators in the table above. That's the real shape.
Where things live
Quick map for navigating the source:
- `src/cognitive_substrate/GMI.ts` — the class itself
- `src/cognitive_substrate/GMIManager.ts` — lifecycle
- `src/cognitive_substrate/personas/` — persona definitions and loaders
- `src/cognitive_substrate/persona_overlays/` — per-session persona overlays
- `src/memory/mechanisms/` — the eight cognitive mechanisms + persona drift
- `src/memory/retrieval/` — semantic, HyDE, GraphRAG retrieval
- `src/agents/agency/` — multi-GMI coordination
- `src/api/` — the public `agent()`, `agency()`, `generateText()`, `streamText()` helpers
Further reading
- System Architecture — full module layout and request lifecycle
- Cognitive Memory — encoding, decay, and retrieval mechanics in depth
- Skills vs Tools vs Extensions — when each capability system applies
- Emergent Agency System — multi-GMI coordination and goal decomposition
- Sandbox & Security — how guardrails actually intercept tool calls and generation
- LLM Providers — the eleven provider implementations and the OpenRouter fan-out