
High-Level API

Everything is one import. Pick the function that fits your task:

import {
  generateText, streamText,     // Text generation
  generateObject, streamObject, // Structured output (Zod validated)
  generateImage, transferStyle, // Image generation & style transfer
  generateVideo, analyzeVideo,  // Video generation & analysis
  generateMusic, generateSFX,   // Audio generation
  performOCR,                   // Vision / OCR
  embedText,                    // Embeddings
  agent,                        // Multi-turn agent sessions
  agency,                       // Multi-agent teams
} from '@framers/agentos';

Quick Reference

| Function | What it does | Example |
| --- | --- | --- |
| generateText() | One-shot text generation | await generateText({ provider: 'openai', prompt: '...' }) |
| streamText() | Stream text in real-time | for await (const d of streamText({...}).textStream) {} |
| generateObject() | Extract structured JSON (Zod) | await generateObject({ schema: z.object({...}), prompt: '...' }) |
| generateImage() | Generate images (with character consistency) | await generateImage({ provider: 'openai', prompt: '...' }) |
| transferStyle() | Style transfer between images | await transferStyle({ image: src, styleReference: ref, prompt: '...' }) |
| generateVideo() | Generate video from text/image | await generateVideo({ prompt: '...' }) |
| generateMusic() | Generate music | await generateMusic({ prompt: '...' }) |
| performOCR() | Extract text from images | await performOCR({ imagePath: './doc.png' }) |
| embedText() | Generate embeddings | await embedText({ input: ['hello'] }) |
| agent() | Multi-turn sessions with memory | const a = agent({ provider: 'openai' }) |
| agency() | Multi-agent teams | const team = agency({ agents: {...}, strategy: 'parallel' }) |

All functions accept provider as a top-level key — no 'openai:gpt-4o' colon syntax needed (though it still works for backwards compatibility).

Provider Resolution

Calling Styles

AgentOS supports three styles for specifying provider and model. Provider-first is recommended:

// 1. Provider-first (recommended) — AgentOS picks the best default model
await generateText({ provider: 'openai', prompt: '...' });

// 2. Provider + explicit model — full control
await generateText({ provider: 'anthropic', model: 'claude-sonnet-4-5-20250929', prompt: '...' });

// 3. Legacy colon format — backwards compatible, still works
await generateText({ model: 'openai:gpt-4o', prompt: '...' });

Provider Defaults

When you supply provider without an explicit model, AgentOS resolves the default model for the requested task automatically:

| Provider | Type | Text default | Image default | Embedding default | Env var |
| --- | --- | --- | --- | --- | --- |
| openai | Cloud | gpt-4o | gpt-image-1 | text-embedding-3-small | OPENAI_API_KEY |
| anthropic | Cloud | claude-sonnet-4-5-20250929 | — | — | ANTHROPIC_API_KEY |
| gemini | Cloud | gemini-2.5-flash | — | — | GEMINI_API_KEY |
| openrouter | Cloud | openai/gpt-4o | — | — | OPENROUTER_API_KEY |
| claude-code-cli | Local | claude-sonnet-4-5-20250929 | — | — | which claude |
| gemini-cli | Local | gemini-2.5-flash | — | — | which gemini |
| stability | Cloud | — | stable-diffusion-xl-1024-v1-0 | — | STABILITY_API_KEY |
| replicate | Cloud | — | black-forest-labs/flux-1.1-pro | — | REPLICATE_API_TOKEN |
| ollama | Local | llama3.2 | stable-diffusion | nomic-embed-text | OLLAMA_BASE_URL |
| stable-diffusion-local | Local | — | v1-5-pruned-emaonly | — | STABLE_DIFFUSION_LOCAL_BASE_URL |

When neither provider nor model is given, AgentOS checks configured runtimes in order: OPENROUTER_API_KEY → OPENAI_API_KEY → ANTHROPIC_API_KEY → GEMINI_API_KEY → GROQ_API_KEY → TOGETHER_API_KEY → MISTRAL_API_KEY → XAI_API_KEY → which claude → which gemini → OLLAMA_BASE_URL. Alternatively, call setDefaultProvider({ provider, apiKey }) once at boot to skip env vars entirely; every subsequent function inherits that default, while an inline apiKey still wins when supplied.
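The first-match scan over configured runtimes can be sketched as below. This is an illustrative approximation, not the actual resolver: resolveRuntime is a hypothetical name, and the real implementation shells out for which claude / which gemini, which are modeled here as plain env-style flags.

```typescript
// Hypothetical sketch of the documented fallback order: the first
// configured runtime wins. CLAUDE_CLI_AVAILABLE / GEMINI_CLI_AVAILABLE
// stand in for the real `which claude` / `which gemini` checks.
const RUNTIME_ORDER: Array<[key: string, provider: string]> = [
  ['OPENROUTER_API_KEY', 'openrouter'],
  ['OPENAI_API_KEY', 'openai'],
  ['ANTHROPIC_API_KEY', 'anthropic'],
  ['GEMINI_API_KEY', 'gemini'],
  ['GROQ_API_KEY', 'groq'],
  ['TOGETHER_API_KEY', 'together'],
  ['MISTRAL_API_KEY', 'mistral'],
  ['XAI_API_KEY', 'xai'],
  ['CLAUDE_CLI_AVAILABLE', 'claude-code-cli'],
  ['GEMINI_CLI_AVAILABLE', 'gemini-cli'],
  ['OLLAMA_BASE_URL', 'ollama'],
];

function resolveRuntime(env: Record<string, string | undefined>): string | undefined {
  for (const [key, provider] of RUNTIME_ORDER) {
    if (env[key]) return provider;
  }
  return undefined; // no configured runtime found
}
```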

Inline API Keys

Every function accepts apiKey and baseUrl as top-level parameters, overriding the corresponding environment variable for that call:

// Pass apiKey directly — useful for multi-tenant apps, tests, or dynamic config
await generateText({
  provider: 'openai',
  apiKey: 'sk-my-specific-key', // overrides OPENAI_API_KEY
  prompt: 'Hello world',
});

// Works on agent() and agency() too
const bot = agent({
  provider: 'anthropic',
  apiKey: process.env.CUSTOMER_KEY, // per-customer key
  instructions: 'You are a helpful assistant.',
});

Local Providers

Local providers don't require API keys — just a baseUrl (or the corresponding env var):

// Ollama — runs any GGUF model locally
await generateText({
  provider: 'ollama',
  model: 'llama3.2',
  prompt: 'Explain quantum entanglement simply.',
  baseUrl: 'http://localhost:11434', // or set OLLAMA_BASE_URL
});

// Anthropic fallback: if ANTHROPIC_API_KEY is unset but OPENROUTER_API_KEY is set,
// AgentOS automatically routes anthropic requests through OpenRouter.

generateText()

import { generateText } from '@framers/agentos';

// Provider-first (recommended): AgentOS picks the default model for the provider.
const { text, usage } = await generateText({
  provider: 'openai',
  prompt: 'Summarize the TCP three-way handshake in 3 bullets.',
});

console.log(text);
console.log(usage.totalTokens);

// Legacy format — still supported:
// const { text } = await generateText({ model: 'openai:gpt-4.1-mini', prompt: '...' });

generateText({ tools }) and streamText({ tools }) now accept three useful forms:

  • A named high-level tool map
  • An ExternalToolRegistry (Record, Map, or iterable)
  • A prompt-only ToolDefinitionForLLM[]

External registries are exposed to the model and executed when called. Prompt-only ToolDefinitionForLLM[] are exposed to the model too, but if the model calls one without an executor attached, AgentOS returns an explicit tool error instead of silently no-oping.

The same tools forms now work on agent({ tools }) and agency({ tools }). When an agency-level tool set is combined with per-agent tools, AgentOS normalizes both sides first and then merges by tool name, with the per-agent tool winning on collisions.
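The normalize-then-merge rule can be sketched as follows. This is illustrative only: mergeTools, normalize, and the ToolLike shape are hypothetical names, not the actual AgentOS internals, which also handle richer tool metadata.

```typescript
// Illustrative sketch of "normalize both sides, then merge by tool name,
// per-agent tool wins on collisions". Accepts the three documented forms:
// Record, Map, or iterable of tool-like objects.
interface ToolLike {
  name: string;
  execute?: (input: unknown) => Promise<unknown>;
}

type ToolInput = Record<string, ToolLike> | Map<string, ToolLike> | Iterable<ToolLike>;

function normalize(tools: ToolInput): Map<string, ToolLike> {
  if (tools instanceof Map) return new Map(tools);
  if (Symbol.iterator in Object(tools)) {
    const out = new Map<string, ToolLike>();
    for (const tool of tools as Iterable<ToolLike>) out.set(tool.name, tool);
    return out;
  }
  return new Map(Object.entries(tools as Record<string, ToolLike>));
}

function mergeTools(agencyTools: ToolInput, agentTools: ToolInput): Map<string, ToolLike> {
  const merged = normalize(agencyTools);
  // Per-agent entries overwrite agency-level entries with the same name.
  for (const [name, tool] of normalize(agentTools)) merged.set(name, tool);
  return merged;
}
```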

Persist helper usage for later inspection:

import { generateText, getRecordedAgentOSUsage } from '@framers/agentos';

await generateText({
  provider: 'openai',
  prompt: 'Summarize QUIC in one sentence.',
  usageLedger: {
    enabled: true,
    sessionId: 'demo-session',
  },
});

const totals = await getRecordedAgentOSUsage({ enabled: true, sessionId: 'demo-session' });
console.log(totals.totalTokens);

streamText()

import { streamText } from '@framers/agentos';

const result = streamText({
  provider: 'openai',
  prompt: 'Stream a short explanation of how TLS differs from TCP.',
});

for await (const delta of result.textStream) {
  process.stdout.write(delta);
}

console.log(await result.text);

agency().stream()

streamText() is a single-call raw stream. agency().stream() separates raw live chunks from the finalized post-guardrail/post-HITL answer:

import { agency, type AgencyStreamResult } from '@framers/agentos';

const team = agency({
  provider: 'openai',
  strategy: 'sequential',
  agents: {
    researcher: { instructions: 'Collect the key facts.' },
    writer: { instructions: 'Turn the facts into a concise answer.' },
  },
  hitl: {
    approvals: { beforeReturn: true },
    handler: async () => ({
      approved: true,
      modifications: { output: 'Approved for delivery.' },
    }),
  },
});

const stream: AgencyStreamResult = team.stream('Summarize HTTP/3 rollout risks.');

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk); // raw live output
}
process.stdout.write('\n');

for await (const approved of stream.finalTextStream) {
  console.log('Approved answer:', approved);
}

console.log('Agent calls:', await stream.agentCalls);
console.log('Final text:', await stream.text);

Use:

  • textStream for low-latency token UX
  • finalTextStream or text for the finalized approved answer
  • fullStream when you also need structured events like final-output

See Agency API and Streaming Semantics for the full contract.

QueryRouter

Use QueryRouter when you want grounded answers over a local markdown corpus without booting the full AgentOS runtime.

import { QueryRouter } from '@framers/agentos';

const router = new QueryRouter({
  knowledgeCorpus: ['./docs', './packages/agentos/docs'],
  availableTools: ['web_search'],
});

await router.init();

console.log(router.getCorpusStats());

const result = await router.route('How does memory retrieval work?');
console.log(result.answer);
console.log(result.tiersUsed);
console.log(result.fallbacksUsed);

await router.close();

router.getCorpusStats() returns a QueryRouterCorpusStats snapshot that tells you what is actually live in the current host:

  • corpus size: configuredPathCount, chunkCount, topicCount, sourceCount
  • bundled platform knowledge: platformKnowledge.total plus per-category counts
  • retrieval path: vector+keyword-fallback or keyword-only
  • embedding health: embeddingStatus
  • runtime truth: graphRuntimeMode, rerankRuntimeMode, deepResearchRuntimeMode

Built-in status meanings:

  • embeddingStatus: 'active' means vector embeddings initialized successfully
  • embeddingStatus: 'disabled-no-key' means init stayed keyword-only because no embedding credential was available
  • embeddingStatus: 'failed-init' means vector init was attempted but fell back to keyword-only mode after an error
  • graphRuntimeMode: 'heuristic' means same-document / heading-overlap expansion
  • rerankRuntimeMode: 'heuristic' means the built-in lexical reranker
  • deepResearchRuntimeMode: 'heuristic' means the built-in local-corpus research synthesis path

Hosts can inject real graphExpand, rerank, and deepResearch hooks in the constructor; those modes then become active.
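As one concrete sense of what a host-injected hook does, a rerank function might score chunks by lexical overlap with the query. This is only a sketch: the Chunk shape and the rerank signature are assumptions here, and the real constructor hook contract is defined in the Query Router docs.

```typescript
// Hypothetical host rerank hook: score each chunk by the fraction of its
// terms that appear in the query, then sort descending. Injecting a real
// hook like this is what flips rerankRuntimeMode from 'heuristic' to active.
interface Chunk {
  id: string;
  text: string;
}

function rerank(query: string, chunks: Chunk[]): Chunk[] {
  const queryTerms = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const score = (chunk: Chunk): number => {
    const terms = chunk.text.toLowerCase().split(/\W+/).filter(Boolean);
    return terms.filter((t) => queryTerms.has(t)).length / (terms.length || 1);
  };
  return [...chunks].sort((a, b) => score(b) - score(a));
}
```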

See Query Router for the full contract and host-hook examples.

generateImage()

import { generateImage } from '@framers/agentos';

// Provider-first: resolves to gpt-image-1 by default for openai.
const result = await generateImage({
  provider: 'openai',
  prompt: 'A cinematic neon city skyline reflected in rain at night.',
  outputFormat: 'png',
});

console.log(result.provider);
console.log(result.images[0]?.mimeType);

generateVideo()

import { generateVideo } from '@framers/agentos';

const result = await generateVideo({
  prompt: 'A drone flying over a misty forest at sunrise.',
  timeoutMs: 180_000,
  onProgress: (event) => console.log(event.status, event.progress, event.message),
  providerPreferences: {
    preferred: ['runway', 'replicate'],
    blocked: ['fal'],
  },
});

console.log(result.provider);
console.log(result.videos[0]?.url);

analyzeVideo()

analyzeVideo() auto-creates a VisionPipeline, uses the structured VideoAnalyzer pipeline under the hood, and auto-wires STT when a supported speech provider credential is available (OPENAI_API_KEY, DEEPGRAM_API_KEY, ASSEMBLYAI_API_KEY, or Azure Speech env vars).

import { analyzeVideo } from '@framers/agentos';

const result = await analyzeVideo({
  videoUrl: 'https://example.com/demo.mp4',
  prompt: 'What is the product demo showing?',
  transcribeAudio: true,
  maxFrames: 12,
});

console.log(result.description);
console.log(result.fullTranscript);
console.log(result.scenes?.length);

Host requirement: ffmpeg and ffprobe must be installed and available on PATH.

generateMusic() and generateSFX()

import { generateMusic, generateSFX } from '@framers/agentos';

const music = await generateMusic({
  prompt: 'Warm analog synthwave with a slow build.',
  timeoutMs: 180_000,
  onProgress: (event) => console.log('music', event.status, event.message),
  providerPreferences: {
    preferred: ['suno', 'udio'],
  },
});

const sfx = await generateSFX({
  prompt: 'Heavy vault door closing with metallic reverb.',
  timeoutMs: 60_000,
  onProgress: (event) => console.log('sfx', event.status, event.message),
  providerPreferences: {
    preferred: ['elevenlabs-sfx', 'stable-audio'],
  },
});

console.log(music.audio[0]?.url);
console.log(sfx.audio[0]?.url);

Media Provider Preferences

Image, video, music, and SFX helpers accept providerPreferences so callers can reorder, block, or weight providers without hard-coding a single backend:

import type { MediaProviderPreference } from '@framers/agentos';

const preferredCloudOnly: MediaProviderPreference = {
  preferred: ['runway', 'replicate'],
  blocked: ['musicgen-local', 'audiogen-local'],
  weights: { runway: 3, replicate: 1 },
};

When weights are present, AgentOS chooses the primary provider from the resolved list using weighted selection and keeps the remaining providers in order as fallbacks.
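Weighted primary selection with ordered fallbacks can be sketched like this. pickPrimary is an illustrative name and the real selection logic may differ; the injectable random parameter exists only to make the sketch deterministic in tests.

```typescript
// Illustrative weighted pick: the primary provider is chosen with
// probability proportional to its weight (default weight 1); the remaining
// providers keep their original order as fallbacks.
function pickPrimary(
  providers: string[],
  weights: Record<string, number> = {},
  random: () => number = Math.random,
): { primary: string; fallbacks: string[] } {
  const w = providers.map((p) => weights[p] ?? 1);
  const total = w.reduce((a, b) => a + b, 0);
  let roll = random() * total;
  let primary = providers[providers.length - 1]; // safety default
  for (let i = 0; i < providers.length; i++) {
    roll -= w[i];
    if (roll < 0) {
      primary = providers[i];
      break;
    }
  }
  return { primary, fallbacks: providers.filter((p) => p !== primary) };
}
```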

Built-in Image Providers

| Provider | Type | Default model | API key env var |
| --- | --- | --- | --- |
| openai | Cloud API | gpt-image-1 | OPENAI_API_KEY |
| stability | Cloud API | stable-diffusion-xl-1024-v1-0 | STABILITY_API_KEY |
| replicate | Cloud API | black-forest-labs/flux-1.1-pro | REPLICATE_API_TOKEN |
| openrouter | Cloud API | — | OPENROUTER_API_KEY |
| ollama | Local | stable-diffusion | None (uses baseUrl) |
| stable-diffusion-local | Local | v1-5-pruned-emaonly | None (uses baseUrl) |

Provider-Specific Options

Use the common options for the simple path, then drop down to namespaced providerOptions when you need provider-native controls:

import { generateImage } from '@framers/agentos';

const poster = await generateImage({
  provider: 'stability',
  model: 'stable-image-core',
  prompt: 'An art deco travel poster for a moon colony',
  negativePrompt: 'text, watermark',
  providerOptions: {
    stability: {
      stylePreset: 'illustration',
      seed: 42,
      cfgScale: 8,
    },
  },
});

console.log(poster.images[0]?.mimeType);

Replicate and OpenRouter work the same way:

const replicateResult = await generateImage({
  provider: 'replicate',
  model: 'black-forest-labs/flux-schnell',
  prompt: 'A product photo of a titanium watch on black stone',
  aspectRatio: '16:9',
  providerOptions: {
    replicate: {
      outputQuality: 90,
      input: {
        go_fast: true,
      },
    },
  },
});

Local Image Generation

Run Stable Diffusion locally without any API key:

// Via Ollama (if your Ollama install has a stable-diffusion model)
const local = await generateImage({
  provider: 'ollama',
  model: 'stable-diffusion',
  prompt: 'A watercolor landscape of rolling hills',
  baseUrl: 'http://localhost:11434', // or set OLLAMA_BASE_URL
});

// Via local Stable Diffusion WebUI (Automatic1111 / ComfyUI)
const sdLocal = await generateImage({
  provider: 'stable-diffusion-local',
  model: 'v1-5-pruned-emaonly',
  prompt: 'A brutalist house in fog',
  baseUrl: 'http://localhost:7860', // or set STABLE_DIFFUSION_LOCAL_BASE_URL
});

Custom Image Provider

Register a provider factory for backends not covered by the built-ins:

import { generateImage, registerImageProviderFactory, type IImageProvider } from '@framers/agentos';

class ComfyUIProvider implements IImageProvider {
  providerId = 'comfyui';
  isInitialized = false;
  defaultModelId = 'sdxl';

  async initialize() {
    this.isInitialized = true;
  }

  async generateImage(request) {
    return {
      created: Math.floor(Date.now() / 1000),
      modelId: request.modelId,
      providerId: this.providerId,
      images: [{ url: 'https://example.invalid/image.png' }],
      usage: { totalImages: 1 },
    };
  }
}

registerImageProviderFactory('comfyui', () => new ComfyUIProvider());

await generateImage({
  provider: 'comfyui',
  model: 'sdxl',
  prompt: 'A brutalist house in fog',
});

agent()

import { agent } from '@framers/agentos';

const researcher = agent({
  provider: 'openai',
  instructions: 'You are a concise research assistant.',
  memory: {
    types: ['episodic', 'semantic'],
    working: { enabled: true },
  },
  maxSteps: 4,
});

const session = researcher.session('demo');

const first = await session.send('What is QUIC?');
console.log(first.text);

const second = await session.send('Compare it to TCP.');
console.log(second.text);

console.log(await session.usage());

agent({ tools }) accepts the same three forms as generateText({ tools }) and streamText({ tools }): named tool maps, ExternalToolRegistry (Record, Map, or iterable), and prompt-only ToolDefinitionForLLM[].

Runnable examples in the package source:

  • packages/agentos/examples/high-level-api.mjs
  • packages/agentos/examples/generate-image.mjs
  • packages/agentos/examples/agentos-config-tools.mjs

Full runtime: AgentOS

import { AgentOS, AgentOSResponseChunkType } from '@framers/agentos';
import { createTestAgentOSConfig } from '@framers/agentos';

const agent = new AgentOS();
await agent.initialize(
  await createTestAgentOSConfig({
    tools: {
      open_profile: {
        description: 'Load a saved profile record by ID.',
        inputSchema: {
          type: 'object',
          properties: { profileId: { type: 'string' } },
          required: ['profileId'],
        },
        execute: async ({ profileId }) => ({
          success: true,
          output: { profile: { id: profileId, preferredTheme: 'solarized' } },
        }),
      },
    },
  })
);

for await (const chunk of agent.processRequest({
  userId: 'user-1',
  sessionId: 'session-1',
  textInput: 'Explain how TCP handshakes work',
})) {
  if (chunk.type === AgentOSResponseChunkType.TEXT_DELTA) {
    process.stdout.write(chunk.textDelta);
  }
}

AgentOSConfig.tools now accepts the same three forms as the high-level helpers: named tool maps, ExternalToolRegistry (Record, Map, or iterable), and prompt-only ToolDefinitionForLLM[]. AgentOS normalizes those inputs during initialize(...) and registers them into the shared ToolOrchestrator, so direct processRequest() turns can plan against and execute them without helper wrappers. If a config-registered tool collides with an extension or pack tool name, the config tool wins at registration time.

If those external tool calls are AgentOS-registered tools, prefer processRequestWithRegisteredTools(...). It executes the registered tools with the correct live-turn ToolExecutionContext and resumes the stream for you:

import {
  AgentOS,
  AgentOSResponseChunkType,
  processRequestWithRegisteredTools,
} from '@framers/agentos';

const agent = await AgentOS.create();

for await (const chunk of processRequestWithRegisteredTools(agent, {
  userId: 'user-1',
  sessionId: 'session-1',
  textInput: 'Search memory for my preferences',
})) {
  if (chunk.type === AgentOSResponseChunkType.TEXT_DELTA) {
    process.stdout.write(chunk.textDelta);
  }
}

If a live turn can mix AgentOS-registered tools with a stable host-managed tool map, either configure externalTools once on AgentOS.initialize(...) or pass externalTools to processRequestWithRegisteredTools(...). It can be a record, Map, or iterable of tool-like executors, and only missing tool names will run through that host registry. Per-call externalTools override the configured registry by tool name.

Use externalTools for helper-level fallback execution; use AgentOSConfig.tools when the tool should be permanently registered and prompt-visible on direct runtime turns too. If an externalTools entry also provides description and inputSchema, the helper temporarily registers a proxy tool so the model can see and plan against it during the turn. Execution-only entries without prompt metadata still work for fallback execution, but they are not visible to the model up front.
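The resolution precedence just described can be sketched as a simple three-step lookup. resolveExecutor and the Executor shape are illustrative names for this sketch, not the real helper internals:

```typescript
// Illustrative precedence when a tool call arrives during a live turn:
// 1. an AgentOS-registered tool with that name executes natively;
// 2. otherwise a per-call externalTools entry wins by name;
// 3. otherwise the registry configured at initialize(...) handles it.
type Executor = (input: unknown) => Promise<unknown>;

function resolveExecutor(
  name: string,
  registered: Map<string, Executor>,
  configuredExternal: Map<string, Executor>,
  perCallExternal: Map<string, Executor>,
): Executor | undefined {
  return registered.get(name) ?? perCallExternal.get(name) ?? configuredExternal.get(name);
}
```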

If you need fully dynamic routing instead of a fixed tool map, keep using fallbackExternalToolHandler.

For custom host-managed tools, keep using processRequestWithExternalTools(...) and provide your own execution callback.

If you are building a lower-level/custom GMI path and only need prompt-visible host tool schemas, configure AgentOSConfig.externalTools and call agent.listExternalToolsForLLM(). That returns only the prompt-aware host tools. You can turn those into raw OpenAI-style function schemas with formatToolDefinitionsForOpenAI(...) or directly from the registry with formatExternalToolsForOpenAI(...).
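For orientation, the general shape of an OpenAI-style function schema built from a prompt-visible tool definition looks roughly like this. toOpenAIFunction and ToolDefinitionLike are hypothetical names for the sketch; the real formatters may emit additional fields.

```typescript
// Illustrative conversion from a ToolDefinitionForLLM-like shape to the
// OpenAI function-calling schema. Treat this as the general shape only,
// not the exact output of formatToolDefinitionsForOpenAI(...).
interface ToolDefinitionLike {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

function toOpenAIFunction(tool: ToolDefinitionLike) {
  return {
    type: 'function' as const,
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.inputSchema, // JSON Schema for the tool's arguments
    },
  };
}
```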

processRequestWithExternalTools(...) is the simplest path while the same AgentOS runtime stays alive. For restart-safe external tool execution, AgentOS also persists actionable external pauses into the conversation metadata. A fresh runtime can recover the pending request with getPendingExternalToolRequest(conversationId, userId) and continue on a new stream with resumeExternalToolRequest(...):

If the pending tool calls are AgentOS-registered tools, prefer resumeExternalToolRequestWithRegisteredTools(...). It executes the registered tools with the correct resume-time ToolExecutionContext and then resumes the stream for you.

import {
  AgentOS,
  AgentOSResponseChunkType,
  resumeExternalToolRequestWithRegisteredTools,
} from '@framers/agentos';

const agent = await AgentOS.create();
const pending = await agent.getPendingExternalToolRequest('conv-1', 'user-1');

if (pending) {
  for await (const chunk of resumeExternalToolRequestWithRegisteredTools(agent, pending, {
    organizationId: 'org-123',
  })) {
    if (chunk.type === AgentOSResponseChunkType.TEXT_DELTA) {
      process.stdout.write(chunk.textDelta);
    }
  }
}

If a persisted pause can mix AgentOS-registered tools with a stable host-managed tool map, either configure externalTools once on AgentOS.initialize(...) or pass externalTools to resumeExternalToolRequestWithRegisteredTools(...). The helper will execute the registered tool calls itself and only delegate missing tool names to that host registry before resuming the stream. Per-call externalTools override the configured registry by tool name. Prompt-aware entries with description and inputSchema are also registered temporarily during the resumed stream so follow-up model calls can plan against the same host tools.

If you need fully dynamic routing instead of a fixed tool map, keep using fallbackExternalToolHandler.

For custom host-managed tools that are not registered in AgentOS, keep using resumeExternalToolRequest(...) directly and supply your own tool results.

This recovery path assumes the conversation store is still available after the original process exits.

Guidance

  • Show high-level examples first in README and landing guides.
  • Keep low-level AgentOS examples in architecture, advanced usage, extensions, workflows, and runtime-control docs.
  • Document both layers explicitly. They are complementary, not competing.
  • Keep generateImage() provider-agnostic at the API boundary, but expose provider-specific knobs through providerOptions when needed.
  • Do not force downstream libraries to adopt agent() unless the helper reaches feature parity with their runtime needs.