
Safety Primitives

Autonomous agents with LLM access can incur unbounded cost when a vendor API flakes, a retry policy misfires, or an output guardrail silently rejects every attempt. AgentOS ships six small, independent primitives that wrap every LLM and tool call to bound the failure modes that cause runaway spend, stuck loops, and zombie agents.

Each primitive is opt-in, has a safe default, and works standalone or composed. Composing all six via wrapLLMCallback() produces one guard chain that converts silent overspend into a paused agent with an audit-log entry naming the trip condition.

These are operational guards — they don't read message content. For content-level safety (toxicity, PII, prompt injection, folder-level filesystem permissions) see Guardrails.

The chain

All six layers are independent. Use any subset. Wire them all into one chain via wrapLLMCallback().

CircuitBreaker

Three-state (closed -> open -> half-open) pattern wrapping any async operation. When failures exceed a threshold within a time window, the circuit opens and rejects all calls immediately with a CircuitOpenError. After a cooldown period, it transitions to half-open and allows probe calls through. If probes succeed, it closes again.
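
To make the state transitions concrete, here is a minimal, self-contained sketch of a three-state breaker. It is not the AgentOS implementation: the `SketchBreaker` class, its option names, the injectable `now` clock, and the choice to reopen on a failed half-open probe are all assumptions made for illustration.

```typescript
// Illustrative sketch only; not the @framers/agentos implementation.
type BreakerState = 'closed' | 'open' | 'half-open';

interface SketchBreakerOptions {
  failureThreshold?: number;
  failureWindowMs?: number;
  cooldownMs?: number;
  halfOpenSuccessThreshold?: number;
  now?: () => number; // injectable clock (assumption, keeps the sketch testable)
}

class SketchBreaker {
  private state: BreakerState = 'closed';
  private failureTimes: number[] = [];
  private openedAt = 0;
  private halfOpenSuccesses = 0;
  private readonly opts: Required<SketchBreakerOptions>;

  constructor(opts: SketchBreakerOptions = {}) {
    this.opts = {
      failureThreshold: 5,
      failureWindowMs: 60_000,
      cooldownMs: 30_000,
      halfOpenSuccessThreshold: 2,
      now: () => Date.now(),
      ...opts,
    };
  }

  getState(): BreakerState {
    // Lazily move open -> half-open once the cooldown has elapsed.
    if (this.state === 'open' && this.opts.now() - this.openedAt >= this.opts.cooldownMs) {
      this.state = 'half-open';
      this.halfOpenSuccesses = 0;
    }
    return this.state;
  }

  recordSuccess(): void {
    if (this.getState() !== 'half-open') return;
    this.halfOpenSuccesses += 1;
    if (this.halfOpenSuccesses >= this.opts.halfOpenSuccessThreshold) {
      this.state = 'closed';
      this.failureTimes = [];
    }
  }

  recordFailure(): void {
    const t = this.opts.now();
    const state = this.getState();
    if (state === 'open') return;
    if (state === 'half-open') {
      this.trip(t); // assumption: a failed probe reopens the circuit
      return;
    }
    // Keep only failures inside the counting window, then add this one.
    this.failureTimes = this.failureTimes.filter((x) => t - x < this.opts.failureWindowMs);
    this.failureTimes.push(t);
    if (this.failureTimes.length >= this.opts.failureThreshold) this.trip(t);
  }

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.getState() === 'open') {
      const remaining = this.openedAt + this.opts.cooldownMs - this.opts.now();
      throw new Error(`Circuit open; retry in ${remaining}ms`);
    }
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }

  private trip(t: number): void {
    this.state = 'open';
    this.openedAt = t;
    this.failureTimes = [];
  }
}
```

The transition out of the open state is evaluated lazily on the next call rather than with a timer, which keeps the sketch free of background work; whether the real class does the same is not stated here.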

Config

| Option | Default | Description |
| --- | --- | --- |
| name | required | Breaker identifier (used in errors and callbacks) |
| failureThreshold | 5 | Failures before opening |
| failureWindowMs | 60,000 | Window in ms for counting failures |
| cooldownMs | 30,000 | Time in open state before probing |
| halfOpenSuccessThreshold | 2 | Successes needed in half-open to close |
| onStateChange | undefined | Callback: (from, to, name) => void |

Usage

import { CircuitBreaker, CircuitOpenError } from '@framers/agentos';

const breaker = new CircuitBreaker({
  name: 'openai-api',
  failureThreshold: 3,
  cooldownMs: 60_000,
  onStateChange: (from, to, name) => {
    console.log(`[${name}] ${from} -> ${to}`);
  },
});

try {
  const response = await breaker.execute(async () => {
    return await openai.chat.completions.create({ model: 'gpt-4o-mini', messages });
  });
} catch (err) {
  if (err instanceof CircuitOpenError) {
    console.log(`Circuit open. Retry after ${err.cooldownRemainingMs}ms`);
  }
}

// Inspect state
const stats = breaker.getStats();
// { name: 'openai-api', state: 'closed', failureCount: 0, totalTripped: 0, ... }

LLMProviderHealthRegistry

A status-aware, process-lifetime memory of LLM provider health, keyed by providerId. Wired into generateText and streamText so the next caller doesn't pay a full TLS round-trip to rediscover a provider that just returned 402 Insufficient Credits or 401 Invalid API key.

The plain CircuitBreaker above uses a single failure-threshold + cooldown pair per instance — fine for one-off operations. The router needs per-error-class behavior: open immediately on a payment or auth failure, but require a streak before tripping on a 429 or 5xx. This is what the registry adds.

| Error class | Threshold | Cooldown |
| --- | --- | --- |
| 402 insufficient credits | 1 fail | 5 min |
| 401, 403 auth/forbidden | 1 fail | 30 min |
| 429 rate limit | 3 fails | 30 s |
| 5xx + unclassifiable | 5 fails | 60 s |

The 5-minute cooldown on 402 reflects operational reality: credits might get topped up while a batch job is in flight. The 30-minute cooldown on 401/403 is longer because those failures usually require an env change plus redeploy. 429 cooldowns are intentionally short because rate limits typically reset within a single quota interval.
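
The policy table can be read as a small lookup structure plus a streak counter. The sketch below encodes the thresholds and cooldowns from the table; the `ErrorClass` names, the `SketchHealthRegistry` class, the single per-provider streak counter, and the injectable clock are hypothetical and only approximate what the registry might do internally.

```typescript
// Illustrative sketch only; names and internals are assumptions.
type ErrorClass = 'payment' | 'auth' | 'rate_limit' | 'transient';

interface ClassPolicy {
  threshold: number; // consecutive failures before the breaker opens
  cooldownMs: number; // how long the breaker stays open
}

const POLICY: Record<ErrorClass, ClassPolicy> = {
  payment: { threshold: 1, cooldownMs: 5 * 60_000 }, // 402
  auth: { threshold: 1, cooldownMs: 30 * 60_000 }, // 401, 403
  rate_limit: { threshold: 3, cooldownMs: 30_000 }, // 429
  transient: { threshold: 5, cooldownMs: 60_000 }, // 5xx + unclassifiable
};

function classifyStatus(status: number | undefined): ErrorClass {
  if (status === 402) return 'payment';
  if (status === 401 || status === 403) return 'auth';
  if (status === 429) return 'rate_limit';
  return 'transient';
}

class SketchHealthRegistry {
  private readonly streaks = new Map<string, number>();
  private readonly openUntil = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  isOpen(providerId: string): boolean {
    return (this.openUntil.get(providerId) ?? 0) > this.now();
  }

  recordFailure(providerId: string, status?: number): void {
    const policy = POLICY[classifyStatus(status)];
    const streak = (this.streaks.get(providerId) ?? 0) + 1;
    if (streak >= policy.threshold) {
      this.openUntil.set(providerId, this.now() + policy.cooldownMs);
      this.streaks.set(providerId, 0);
    } else {
      this.streaks.set(providerId, streak);
    }
  }

  recordSuccess(providerId: string): void {
    // Resets the streak only; an already-open cooldown is left untouched.
    this.streaks.set(providerId, 0);
  }
}
```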

How the router uses it

  1. Before the primary call, generateText consults globalLLMProviderHealth.isOpen(resolvedProviderId). If the breaker is open, it throws a synthetic LLMProviderCircuitOpenError with httpStatus: 503. The existing isRetryableError check recognizes that status and routes the call into the fallback chain. No network round-trip, no TLS handshake, no waste.
  2. On a real provider error (anything caught in the outer try/catch), recordFailure(providerId, error) classifies the error by HTTP status and either trips immediately (for 401/402/403) or increments the streak counter (for 429/5xx).
  3. On success, recordSuccess(providerId) resets the streak counter so a future transient failure starts fresh. A single success does NOT shorten an already-open cooldown: the breaker is open precisely because we want to stop probing for a window.
  4. In the fallback chain loop, every fallback entry is checked against isOpen() before its attempt. A dead chain entry is skipped instantly, so the loop reaches the first healthy provider after at most N constant-time in-memory checks rather than up to N failed network calls.
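
The four steps above can be sketched as a loop over the provider chain. This is a simplified illustration under stated assumptions: `ProviderHealthView` and `callWithFallback` are hypothetical names, and unlike the real router the sketch falls through to the next entry on any error instead of consulting isRetryableError first.

```typescript
// Illustrative sketch only; not the generateText router.
interface ProviderHealthView {
  isOpen(providerId: string): boolean;
  recordFailure(providerId: string, error: unknown): void;
  recordSuccess(providerId: string): void;
}

interface FallbackResult<T> {
  providerId: string; // provider that produced the value
  value: T;
  skipped: string[]; // providers skipped without any network call
}

async function callWithFallback<T>(
  chain: string[],
  health: ProviderHealthView,
  attempt: (providerId: string) => Promise<T>,
): Promise<FallbackResult<T>> {
  const skipped: string[] = [];
  let lastError: unknown = new Error('No healthy provider in chain');

  for (const providerId of chain) {
    // In-memory check: an open breaker means no request is sent at all.
    if (health.isOpen(providerId)) {
      skipped.push(providerId);
      continue;
    }
    try {
      const value = await attempt(providerId);
      health.recordSuccess(providerId);
      return { providerId, value, skipped };
    } catch (err) {
      health.recordFailure(providerId, err);
      lastError = err;
    }
  }
  throw lastError;
}
```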

Error classification

The registry reads HTTP status from three sources, in order:

  1. [NNN] ... prefix in error.message — the shape OpenRouterProvider decorates its errors with so downstream regex-based routing can find them.
  2. error.statusCode numeric property — OpenRouterProviderError sets this explicitly.
  3. error.status numeric property — the Anthropic and OpenAI SDK shape.

If none of those resolves, the error is treated as the conservative transient class (5-failure threshold, 60 s cooldown). Better to under-protect on a one-off network blip than lock out a healthy provider.
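
A minimal sketch of that lookup order, assuming the `[NNN]` marker sits at the very start of the message and is exactly three digits. The function name `extractHttpStatus` is hypothetical; the registry's actual helper may differ.

```typescript
// Illustrative sketch only; the real classifier may differ in detail.
function extractHttpStatus(error: unknown): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined;
  const e = error as { message?: unknown; statusCode?: unknown; status?: unknown };

  // 1. "[NNN] ..." prefix in error.message
  if (typeof e.message === 'string') {
    const match = /^\[(\d{3})\]/.exec(e.message);
    if (match) return Number(match[1]);
  }
  // 2. Numeric error.statusCode
  if (typeof e.statusCode === 'number') return e.statusCode;
  // 3. Numeric error.status
  if (typeof e.status === 'number') return e.status;

  // Unresolved: callers treat this as the conservative transient class.
  return undefined;
}
```

Returning `undefined` rather than guessing a status keeps the "unclassifiable" decision with the caller, which can then apply the 5-failure, 60 s policy.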

Config

The policy table above is currently hardcoded. A per-class config object could be exposed if a host needs to override it (e.g. a stricter 429 threshold for a low-quota account).

Usage

import { globalLLMProviderHealth, LLMProviderHealthRegistry } from '@framers/agentos';

// Read state for an admin / diagnostics endpoint
const stats = globalLLMProviderHealth.getStats('openrouter');
if (stats?.state === 'open') {
  console.log(
    `OpenRouter circuit open; ${stats.cooldownRemainingMs}ms until close. ` +
      `Last status: ${stats.lastStatusCode}, total trips: ${stats.totalTrips}`,
  );
}

// Manually reset after a credit top-up so the next call probes immediately
globalLLMProviderHealth.reset('openrouter');

// Construct a private registry for a test
const isolated = new LLMProviderHealthRegistry();
isolated.recordFailure('mock-provider', new Error('[402] Test'));
expect(isolated.isOpen('mock-provider')).toBe(true);

Why a singleton

Provider health is process-wide state. Two concurrent generateText calls inside the same Node process see the same OpenRouter: if one just discovered it's at 402, the other shouldn't redo the discovery. The globalLLMProviderHealth singleton is the natural granularity. Tests construct their own LLMProviderHealthRegistry instances to keep state isolated across cases.

The registry is ephemeral by design: it lives in memory and resets on server restart. Persistent provider-health tracking would add complexity (Redis, write-through cache invalidation) for a problem the in-process singleton already solves for the dominant case: a long-running batch job hammering a degraded provider.

ActionDeduplicator

Hash-based recent action tracking with a configurable time window and LRU eviction. The caller computes the key string -- this class is intentionally generic. Use it to prevent duplicate votes, duplicate posts, or any repeated action within a window.

Config

| Option | Default | Description |
| --- | --- | --- |
| windowMs | 3,600,000 (1 hr) | Time window for dedup tracking |
| maxEntries | 10,000 | Maximum tracked entries before LRU eviction |

Usage

import { ActionDeduplicator } from '@framers/agentos';

const dedup = new ActionDeduplicator({ windowMs: 900_000 }); // 15-minute window

const key = `vote:${agentId}:${postId}`;

if (dedup.isDuplicate(key)) {
  console.log('Already voted on this post recently');
  return;
}

dedup.record(key);
await castVote(agentId, postId);

// Or use the combined check-and-record method:
const { isDuplicate, entry } = dedup.checkAndRecord(`like:${agentId}:${postId}`);
if (isDuplicate) {
  console.log(`Seen ${entry.count} times since ${new Date(entry.firstSeenAt)}`);
}
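
For intuition about how a time window and LRU eviction can coexist, here is a small sketch built on the insertion-order guarantee of `Map`. `SketchDeduplicator` is a hypothetical class, not the AgentOS one; in particular, measuring the window from the last sighting (rather than the first) and the injectable clock are assumptions.

```typescript
// Illustrative sketch only; not the ActionDeduplicator implementation.
interface SeenEntry {
  count: number;
  firstSeenAt: number;
  lastSeenAt: number;
}

class SketchDeduplicator {
  // Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, SeenEntry>();
  private readonly windowMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(windowMs = 3_600_000, maxEntries = 10_000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  isDuplicate(key: string): boolean {
    const entry = this.entries.get(key);
    if (!entry) return false;
    if (this.now() - entry.lastSeenAt >= this.windowMs) {
      this.entries.delete(key); // expired: forget it
      return false;
    }
    return true;
  }

  record(key: string): SeenEntry {
    const t = this.now();
    const live = this.isDuplicate(key) ? this.entries.get(key) : undefined;
    const entry: SeenEntry = live
      ? { count: live.count + 1, firstSeenAt: live.firstSeenAt, lastSeenAt: t }
      : { count: 1, firstSeenAt: t, lastSeenAt: t };

    // Delete and re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);

    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return entry;
  }
}
```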

StuckDetector

Detects agents producing identical outputs or errors repeatedly. Uses fast djb2 hashing (no crypto overhead) to track output history per agent within a sliding window.

Detects three patterns:

  • repeated_output -- The same output appears N times in a row
  • repeated_error -- The same error message appears N times in a row
  • oscillating -- Agent alternates between two outputs (A, B, A, B pattern)
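
As a rough illustration of the hashing and pattern checks, the sketch below implements the classic djb2 string hash and tests a history of hashes for a repeated run or an A, B, A, B alternation. The `detectPattern` helper and its four-entry oscillation check are assumptions, and the sliding time window is left out for brevity.

```typescript
// Illustrative sketch only; not the StuckDetector implementation.

// djb2: hash = hash * 33 + charCode, kept within 32 bits.
function djb2(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return hash >>> 0; // present as an unsigned 32-bit value
}

type StuckReason = 'repeated_output' | 'oscillating';

function detectPattern(hashes: number[], repetitionThreshold = 3): StuckReason | null {
  const n = hashes.length;

  // Same hash N times in a row at the tail of the history.
  if (n >= repetitionThreshold) {
    const tail = hashes.slice(n - repetitionThreshold);
    if (tail.every((h) => h === tail[0])) return 'repeated_output';
  }

  // A, B, A, B over the last four entries, with A different from B.
  if (n >= 4) {
    const [a, b, c, d] = hashes.slice(n - 4);
    if (a === c && b === d && a !== b) return 'oscillating';
  }
  return null;
}
```

Hashing each output once means the history stores small integers rather than full strings, at the cost of a small collision risk that is usually acceptable for a heuristic guard.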

Config

| Option | Default | Description |
| --- | --- | --- |
| repetitionThreshold | 3 | Identical outputs before flagging stuck |
| errorRepetitionThreshold | 3 | Identical errors before flagging stuck |
| windowMs | 300,000 (5 min) | Sliding window for history |
| maxHistoryPerAgent | 50 | Max entries tracked per agent |

Usage

import { StuckDetector } from '@framers/agentos';

const detector = new StuckDetector({ repetitionThreshold: 3 });

// After each LLM call, check for stuck behavior
const check = detector.recordOutput('agent-1', response.content);

if (check.isStuck) {
  console.log(`Agent stuck: ${check.reason}`);
  // check.reason is 'repeated_output' | 'repeated_error' | 'oscillating'
  // check.details has a human-readable description
  // check.repetitionCount tells you how many repeats were detected
  pauseAgent('agent-1');
}

// Also track errors inside a retry loop (the bound of 5 attempts is illustrative)
for (let attempt = 0; attempt < 5; attempt++) {
  try {
    await callLLM();
    break; // Success -- stop retrying
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    const errCheck = detector.recordError('agent-1', message);
    if (errCheck.isStuck) {
      // Same error 3 times in a row -- stop retrying
      break;
    }
  }
}

// Clean up when an agent is removed
detector.clearAgent('agent-1');

CostGuard

Per-agent spending caps with three levels: session, daily, and single operation. Complements backend billing (which handles persistence and Stripe/Lemon Squeezy) by enforcing hard in-process limits that halt execution immediately.

Config

| Option | Default | Description |
| --- | --- | --- |
| maxSessionCostUsd | $1.00 | Maximum spend per agent session |
| maxDailyCostUsd | $5.00 | Maximum spend per agent per day |
| maxSingleOperationCostUsd | $0.50 | Maximum spend for a single operation |
| onCapReached | undefined | Callback: (agentId, capType, currentCost, limit) => void |

Usage

import { CostGuard } from '@framers/agentos';

const guard = new CostGuard({
  maxDailyCostUsd: 2.00,
  onCapReached: (agentId, capType, cost, limit) => {
    console.log(`${agentId} hit ${capType} cap: $${cost.toFixed(4)} / $${limit.toFixed(2)}`);
    safetyEngine.pauseAgent(agentId, `Cost cap '${capType}' reached`);
  },
});

// Before each operation, check affordability
const check = guard.canAfford('agent-1', 0.003); // estimated cost
if (!check.allowed) {
  throw new Error(check.reason); // e.g. "Daily cost $2.0031 would exceed limit $2.00"
}

// After the operation, record actual cost
guard.recordCost('agent-1', actualCostUsd, 'llm-call-123');

// Per-agent overrides
guard.setAgentLimits('expensive-agent', { maxDailyCostUsd: 10.00 });

// Inspect spending
const snapshot = guard.getSnapshot('agent-1');
// { sessionCostUsd: 0.42, dailyCostUsd: 1.87, isSessionCapReached: false, ... }

// Daily costs auto-reset at midnight. Manual reset:
guard.resetSession('agent-1');
guard.resetDailyAll();
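
The three cap levels can be pictured as three comparisons made before the call is issued. The sketch below is not CostGuard itself: `canAffordSketch`, the `capType` strings, and the order in which the caps are checked are assumptions. Only the default limits and the shape of the reason message are taken from this page.

```typescript
// Illustrative sketch only; not the CostGuard implementation.
interface CostLimits {
  maxSessionCostUsd: number;
  maxDailyCostUsd: number;
  maxSingleOperationCostUsd: number;
}

interface SpendSoFar {
  sessionCostUsd: number;
  dailyCostUsd: number;
}

type CapType = 'single_operation' | 'session' | 'daily'; // hypothetical labels

interface AffordCheck {
  allowed: boolean;
  capType?: CapType;
  reason?: string;
}

const DEFAULT_LIMITS: CostLimits = {
  maxSessionCostUsd: 1.0,
  maxDailyCostUsd: 5.0,
  maxSingleOperationCostUsd: 0.5,
};

function canAffordSketch(
  spend: SpendSoFar,
  estimatedCostUsd: number,
  limits: CostLimits = DEFAULT_LIMITS,
): AffordCheck {
  const deny = (capType: CapType, label: string, projected: number, limit: number): AffordCheck => ({
    allowed: false,
    capType,
    reason: `${label} cost $${projected.toFixed(4)} would exceed limit $${limit.toFixed(2)}`,
  });

  // Check the narrowest cap first, then the cumulative ones.
  if (estimatedCostUsd > limits.maxSingleOperationCostUsd) {
    return deny('single_operation', 'Operation', estimatedCostUsd, limits.maxSingleOperationCostUsd);
  }
  const projectedSession = spend.sessionCostUsd + estimatedCostUsd;
  if (projectedSession > limits.maxSessionCostUsd) {
    return deny('session', 'Session', projectedSession, limits.maxSessionCostUsd);
  }
  const projectedDaily = spend.dailyCostUsd + estimatedCostUsd;
  if (projectedDaily > limits.maxDailyCostUsd) {
    return deny('daily', 'Daily', projectedDaily, limits.maxDailyCostUsd);
  }
  return { allowed: true };
}
```

Checking the projected total (current spend plus estimate) rather than the current spend alone is what lets the guard refuse a call before the cap is crossed instead of after.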

ToolExecutionGuard

Wraps tool execution with a timeout and per-tool circuit breaker. Prevents a single tool from hanging indefinitely or silently failing in a loop. Each tool gets its own circuit breaker instance and health tracking.

Config

| Option | Default | Description |
| --- | --- | --- |
| defaultTimeoutMs | 30,000 | Default timeout per tool execution |
| toolTimeouts | undefined | Per-tool timeout overrides (Record<string, number>) |
| enableCircuitBreaker | true | Whether each tool gets its own circuit breaker |
| circuitBreakerConfig | undefined | Config applied to per-tool circuit breakers |

Usage

import { ToolExecutionGuard } from '@framers/agentos';

const guard = new ToolExecutionGuard({
  defaultTimeoutMs: 15_000,
  toolTimeouts: {
    'web-search': 45_000, // Search gets more time
    'calculator': 5_000, // Calculator should be fast
  },
});

const result = await guard.execute('web-search', async () => {
  return await searchTool.run(query);
});

if (result.success) {
  console.log(result.result); // The tool's return value
  console.log(result.durationMs); // How long it took
} else {
  console.log(result.error); // Error message
  console.log(result.timedOut); // true if it was a timeout
}

// Health monitoring
const health = guard.getToolHealth('web-search');
// { totalCalls: 47, failures: 2, timeouts: 1, avgDurationMs: 3200, circuitState: 'closed' }

// All tools at once
const allHealth = guard.getAllToolHealth();
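
One common way to bound a tool call is to race it against a timer. The sketch below shows that technique with `Promise.race`; `executeWithTimeout` and `TimedResult` are hypothetical names that mirror the result fields used above, and nothing here is confirmed to match how ToolExecutionGuard is built.

```typescript
// Illustrative sketch only; not the ToolExecutionGuard implementation.
interface TimedResult<T> {
  success: boolean;
  result?: T;
  error?: string;
  timedOut: boolean;
  durationMs: number;
}

async function executeWithTimeout<T>(
  fn: () => Promise<T>,
  timeoutMs: number,
): Promise<TimedResult<T>> {
  const start = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  let timedOut = false;

  // Rejects once the deadline passes; never resolves.
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      timedOut = true;
      reject(new Error(`Timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });

  try {
    const result = await Promise.race([fn(), deadline]);
    return { success: true, result, timedOut: false, durationMs: Date.now() - start };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { success: false, error, timedOut, durationMs: Date.now() - start };
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Racing against a timer stops the caller from waiting, but it does not cancel the underlying work; a tool that supports cancellation would also need an abort signal passed into it.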

How They Work Together

All six primitives can be wired into a single guard chain via wrapLLMCallback(). Every LLM call passes through all layers in sequence:

// Simplified from WonderlandNetwork.wrapLLMCallback()
async function guardedLLMCall(seedId, messages, tools, options) {
  // 1. SafetyEngine killswitch check
  const canAct = safetyEngine.canAct(seedId);
  if (!canAct.allowed) throw new Error(canAct.reason);

  // 2. CostGuard pre-check (estimated cost ~$0.001)
  const affordable = costGuard.canAfford(seedId, 0.001);
  if (!affordable.allowed) throw new Error(affordable.reason);

  // 3. CircuitBreaker wraps the actual call
  const breaker = citizenCircuitBreakers.get(seedId);
  const start = Date.now();
  const response = await breaker.execute(() => originalLLM(messages, tools, options));

  // 4. CostGuard records actual cost from token usage
  if (response.usage) {
    const cost = response.usage.prompt_tokens * 0.000003
      + response.usage.completion_tokens * 0.000006;
    costGuard.recordCost(seedId, cost);
  }

  // 5. StuckDetector checks for repetition
  if (response.content) {
    const stuck = stuckDetector.recordOutput(seedId, response.content);
    if (stuck.isStuck) {
      safetyEngine.pauseAgent(seedId, `Stuck: ${stuck.details}`);
    }
  }

  // 6. AuditLog records the event
  auditLog.log({
    seedId,
    action: 'llm_call',
    outcome: 'success',
    durationMs: Date.now() - start,
    metadata: { tokens: response.usage?.total_tokens },
  });

  return response;
}

Additionally, ActionDeduplicator, ToolExecutionGuard, and ContentSimilarityDedup are used in other parts of the network:

  • ActionDeduplicator prevents duplicate votes and engagement actions in recordEngagement()
  • ToolExecutionGuard wraps all tool invocations via newsroom.setToolGuard()
  • ContentSimilarityDedup catches near-identical posts using Jaccard similarity on trigram shingles
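
For readers unfamiliar with the technique named in the last bullet, here is a minimal sketch of Jaccard similarity over trigram shingles. It assumes character-level trigrams with lowercasing and whitespace normalization; ContentSimilarityDedup may shingle differently (for example by word), and the function names are hypothetical.

```typescript
// Illustrative sketch only; not the ContentSimilarityDedup implementation.

// Split text into overlapping 3-character shingles.
function trigramShingles(text: string): Set<string> {
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim();
  const shingles = new Set<string>();
  if (normalized.length === 0) return shingles;
  if (normalized.length < 3) {
    shingles.add(normalized); // too short for a trigram: use the whole string
    return shingles;
  }
  for (let i = 0; i <= normalized.length - 3; i++) {
    shingles.add(normalized.slice(i, i + 3));
  }
  return shingles;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((s) => {
    if (b.has(s)) intersection += 1;
  });
  return intersection / (a.size + b.size - intersection);
}

function textSimilarity(left: string, right: string): number {
  return jaccard(trigramShingles(left), trigramShingles(right));
}
```

A near-duplicate check would then compare `textSimilarity(a, b)` against a threshold chosen by the host; no particular threshold value is implied here.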

Defense Matrix

| Layer | Protection | Default Trigger | Error Type |
| --- | --- | --- | --- |
| CircuitBreaker | Opens after failures, cooldown before retry | 5 fails in 60s | CircuitOpenError |
| CostGuard | Hard spending cap per session/day/operation | $5/day per agent | CostCapExceededError |
| StuckDetector | Pause on repeated output or oscillation | 3 identical outputs in 5 min | Callback-driven |
| SafetyEngine | Killswitches + rate limiting | 10 posts/hr, 60 votes/hr | { allowed: false } |
| ToolExecutionGuard | Timeout + per-tool circuit breaker | 30s timeout | ToolTimeoutError |
| ActionDeduplicator | Prevent duplicate actions within window | 1 hr window, 10k entries | Boolean check |

Imports

All primitives are exported from the @framers/agentos package:

import {
  CircuitBreaker,
  CircuitOpenError,
  LLMProviderHealthRegistry,
  globalLLMProviderHealth,
  ActionDeduplicator,
  StuckDetector,
  CostGuard,
  CostCapExceededError,
  ToolExecutionGuard,
  ToolTimeoutError,
} from '@framers/agentos';

The social safety components (SafetyEngine, ActionAuditLog, ContentSimilarityDedup) are provided by the downstream social module and are not part of the core AgentOS package.

