LLM Providers

Configure and switch between 9 LLM providers with automatic detection, fallback, and cost-aware routing.


Table of Contents

  1. Overview
  2. Provider Matrix
  3. Quick Start
  4. Auto-Detection Order
  5. Provider Configuration
  6. Fallback Behavior
  7. Cost Tiers
  8. Provider Details
  9. Programmatic Configuration
  10. Adding a Custom Provider
  11. Provider Capabilities Detail
  12. Related Documentation

Overview

AgentOS abstracts LLM access behind a unified LLMProviderManager interface. You configure providers via environment variables, and AgentOS handles model selection, streaming, tool calling, retries, and fallback routing.

Key features:

  • 9 providers supported out of the box
  • Auto-detection: Set an API key and the provider is available
  • Fallback: Automatic retry with alternate providers on failure
  • Cost-aware routing: Route requests to cheaper models when quality allows
  • Streaming: All providers support streaming with a unified async iterator
  • Tool calling: Unified function/tool calling across providers that support it
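Because the async iterator is the same for every provider, the consuming code never changes when you switch backends. A minimal sketch of the pattern (the `{ type, content }` event shape echoes the custom-provider example later on this page and is illustrative, not the exact AgentOS types):

```typescript
// Sketch: consuming a unified streaming interface with for-await.
// The event shape is illustrative, not the exact AgentOS types.
type StreamEvent = { type: 'text_delta'; content: string };

// Stand-in for provider.stream(...): every provider yields the same shape.
async function* fakeStream(chunks: string[]): AsyncGenerator<StreamEvent> {
  for (const chunk of chunks) {
    yield { type: 'text_delta', content: chunk };
  }
}

// Accumulate text deltas into the full response.
async function collect(stream: AsyncIterable<StreamEvent>): Promise<string> {
  let text = '';
  for await (const event of stream) {
    if (event.type === 'text_delta') text += event.content;
  }
  return text;
}
```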

Provider Matrix

| Provider | Env Var | Default Model | Streaming | Tool Calling | Vision | Embedding | Cost Tier |
| --- | --- | --- | --- | --- | --- | --- | --- |
| OpenAI | OPENAI_API_KEY | gpt-4o | Yes | Yes | Yes | Yes | $$$ |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-20250514 | Yes | Yes | Yes | No | $$$ |
| Gemini | GEMINI_API_KEY | gemini-2.5-flash | Yes | Yes | Yes | Yes | $$ |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile | Yes | Yes | No | No | $ |
| Together | TOGETHER_API_KEY | meta-llama/Llama-3.3-70B-Instruct-Turbo | Yes | Yes | No | Yes | $ |
| Mistral | MISTRAL_API_KEY | mistral-large-latest | Yes | Yes | No | Yes | $$ |
| xAI | XAI_API_KEY | grok-2 | Yes | Yes | Yes | No | $$ |
| OpenRouter | OPENROUTER_API_KEY | openai/gpt-4o | Yes | Yes | Yes* | Yes* | Varies |
| Ollama | OLLAMA_BASE_URL | llama3.2 | Yes | Partial | Model-dep. | Yes | Free |

*OpenRouter capabilities depend on the underlying model selected.


Quick Start

Option 1: Environment Variable (Simplest)

Set one API key and start using AgentOS:

export OPENAI_API_KEY=sk-...

import { createAgent } from '@framers/agentos';

const agent = await createAgent(); // Auto-detects OpenAI
const response = await agent.chat('Hello, world!');

Option 2: CLI Configuration

# Interactive setup wizard
wunderland setup

# Or set directly
wunderland config set llmProvider anthropic
wunderland config set llmModel claude-sonnet-4-20250514

Option 3: Programmatic

import { createAgent } from '@framers/agentos';

const agent = await createAgent({
  llmProvider: 'anthropic',
  llmModel: 'claude-sonnet-4-20250514',
});

Auto-Detection Order

When no provider is explicitly configured, AgentOS checks for API keys in this order and uses the first one found:

  1. OPENAI_API_KEY → OpenAI
  2. ANTHROPIC_API_KEY → Anthropic
  3. GEMINI_API_KEY → Google Gemini
  4. GROQ_API_KEY → Groq
  5. TOGETHER_API_KEY → Together AI
  6. MISTRAL_API_KEY → Mistral
  7. XAI_API_KEY → xAI
  8. OPENROUTER_API_KEY → OpenRouter
  9. OLLAMA_BASE_URL → Ollama
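The detection logic amounts to a first-match scan over that list. A sketch of the idea (the env-var names and order come from the list above; the function itself is illustrative, not the AgentOS internals):

```typescript
// Sketch of auto-detection: return the first provider whose env var is set,
// following the documented order. Illustrative, not the AgentOS internals.
const DETECTION_ORDER: Array<[envVar: string, provider: string]> = [
  ['OPENAI_API_KEY', 'openai'],
  ['ANTHROPIC_API_KEY', 'anthropic'],
  ['GEMINI_API_KEY', 'gemini'],
  ['GROQ_API_KEY', 'groq'],
  ['TOGETHER_API_KEY', 'together'],
  ['MISTRAL_API_KEY', 'mistral'],
  ['XAI_API_KEY', 'xai'],
  ['OPENROUTER_API_KEY', 'openrouter'],
  ['OLLAMA_BASE_URL', 'ollama'],
];

function detectProvider(env: Record<string, string | undefined>): string | undefined {
  const match = DETECTION_ORDER.find(([envVar]) => Boolean(env[envVar]));
  return match?.[1];
}
```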

You can override auto-detection by setting llmProvider explicitly in your configuration or passing --provider <name> to CLI commands.


Provider Configuration

Each provider is configured via environment variables. You can set them in your shell, .env file, or ~/.wunderland/.env:

# ~/.wunderland/.env (auto-loaded by Wunderland CLI)

# Primary provider
OPENAI_API_KEY=sk-...

# Fallback provider
OPENROUTER_API_KEY=sk-or-...

# Local provider (no API key needed)
OLLAMA_BASE_URL=http://localhost:11434

Per-Agent Override

Individual agents can use different providers via agent.config.json:

{
  "llmProvider": "anthropic",
  "llmModel": "claude-sonnet-4-20250514",
  "llmAuthMethod": "api-key"
}

Fallback Behavior

AgentOS supports automatic fallback when a provider request fails:

Primary Provider (e.g., Anthropic)
↓ fails (rate limit, timeout, error)
OpenRouter Fallback (if OPENROUTER_API_KEY is set)
↓ fails
Ollama Local Fallback (if OLLAMA_BASE_URL is set)
↓ fails
Error returned to caller
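The walk down that chain is a simple try-in-order loop. A sketch, where `callProvider` is a hypothetical stand-in for an actual provider request:

```typescript
// Sketch of the fallback chain: try each provider in order and return the
// first successful response. `callProvider` is a hypothetical stand-in for
// a real provider request.
async function withFallback<T>(
  providers: string[],
  callProvider: (provider: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      return await callProvider(provider); // first success wins
    } catch (err) {
      lastError = err; // rate limit, timeout, or error: try the next provider
    }
  }
  throw lastError; // every provider failed: surface the last error to the caller
}
```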

Configuring Fallback

import { createAgent } from '@framers/agentos';

const agent = await createAgent({
  llmProvider: 'anthropic',
  llmFallback: ['openrouter', 'ollama'], // Ordered fallback chain
  llmFallbackModel: {
    openrouter: 'anthropic/claude-sonnet-4-20250514',
    ollama: 'llama3.2',
  },
});

OpenRouter as Universal Fallback

Setting OPENROUTER_API_KEY automatically enables it as a fallback for any primary provider. OpenRouter routes to 200+ models across all major providers.

# Primary: Anthropic. Fallback: OpenRouter (automatic)
export ANTHROPIC_API_KEY=sk-ant-...
export OPENROUTER_API_KEY=sk-or-...

Cost Tiers

AgentOS tracks token usage and cost across all providers:

| Tier | Providers | Approximate Cost (per 1M tokens) |
| --- | --- | --- |
| $ (Budget) | Groq, Together, Ollama (free) | $0.00–$0.60 |
| $$ (Standard) | Gemini, Mistral, xAI, OpenRouter (varies) | $0.50–$3.00 |
| $$$ (Premium) | OpenAI, Anthropic | $3.00–$15.00 |
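Since the tiers quote rates per 1M tokens, a single request costs tokens × rate ÷ 1,000,000. A worked sketch (the $3/1M rate below is an illustrative placeholder, not a quoted price, and real providers price prompt and completion tokens separately):

```typescript
// Illustrative cost arithmetic: rates are quoted per 1M tokens, so a request
// costs (tokens * rate) / 1_000_000. A single blended rate is a simplification;
// real providers price prompt and completion tokens at different rates.
function requestCostUSD(
  promptTokens: number,
  completionTokens: number,
  ratePerMillionUSD: number,
): number {
  return ((promptTokens + completionTokens) * ratePerMillionUSD) / 1_000_000;
}
```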

Cost-Aware Routing

import { createAgent } from '@framers/agentos';

const agent = await createAgent({
  costOptimization: {
    enabled: true,
    maxCostPerTurn: 0.05, // USD budget per turn
    preferCheaperModels: true, // Route simple queries to cheaper models
    premiumModelThreshold: 0.7, // Complexity score threshold for premium models
  },
});
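The `premiumModelThreshold` idea reduces to a comparison against a complexity score. How that score is computed is out of scope here; this sketch only shows the routing decision it drives:

```typescript
// Sketch of threshold-based routing: a complexity score in [0, 1] at or
// above the threshold routes to a premium model, otherwise a budget one.
// The scoring itself is a separate concern, not shown here.
function pickTier(complexity: number, threshold = 0.7): 'premium' | 'budget' {
  return complexity >= threshold ? 'premium' : 'budget';
}
```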

See Cost Optimization for the full guide.


Provider Details

OpenAI

export OPENAI_API_KEY=sk-...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| gpt-4o | 128K | Yes | Yes | Best all-around |
| gpt-4o-mini | 128K | Yes | Yes | Fast, cheap |
| o1 | 200K | Yes | Yes | Reasoning model |
| o3-mini | 200K | No | Yes | Fast reasoning |
| gpt-image-1 | — | — | — | Image generation only |

OAuth support: Use your ChatGPT subscription instead of an API key:

wunderland login   # Device code flow, same as Codex CLI

Anthropic

export ANTHROPIC_API_KEY=sk-ant-...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| claude-opus-4-20250514 | 200K | Yes | Yes | Most capable |
| claude-sonnet-4-20250514 | 200K | Yes | Yes | Best value |
| claude-haiku-3-5-20241022 | 200K | Yes | Yes | Fastest |

Google Gemini

export GEMINI_API_KEY=AIza...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| gemini-2.5-pro | 1M | Yes | Yes | Largest context |
| gemini-2.5-flash | 1M | Yes | Yes | Fast, large context |
| gemini-2.0-flash | 1M | Yes | Yes | Previous gen |

Groq

export GROQ_API_KEY=gsk_...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| llama-3.3-70b-versatile | 128K | No | Yes | Best Groq model |
| llama-3.1-8b-instant | 128K | No | Yes | Ultra-fast |
| mixtral-8x7b-32768 | 32K | No | Yes | Mixtral on Groq |

Groq provides extremely fast inference (~500 tok/s) via custom LPU hardware.

Together AI

export TOGETHER_API_KEY=...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | 128K | No | Yes | Default |
| meta-llama/Llama-3.1-405B-Instruct-Turbo | 128K | No | Yes | Largest open model |
| mistralai/Mixtral-8x22B-Instruct-v0.1 | 64K | No | Yes | Mixtral |

Mistral AI

export MISTRAL_API_KEY=...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| mistral-large-latest | 128K | No | Yes | Best Mistral model |
| codestral-latest | 32K | No | Yes | Code-optimized |
| mistral-small-latest | 32K | No | Yes | Fast, cheap |

xAI (Grok)

export XAI_API_KEY=xai-...
| Model | Context | Vision | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| grok-2 | 128K | Yes | Yes | Default |
| grok-2-mini | 128K | No | Yes | Faster |

OpenRouter

export OPENROUTER_API_KEY=sk-or-...

OpenRouter is a multi-provider proxy that routes to 200+ models. Specify the model using the provider/model format:

const agent = await createAgent({
  llmProvider: 'openrouter',
  llmModel: 'anthropic/claude-sonnet-4-20250514',
});
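If you need to inspect an OpenRouter model id in your own code, the format splits cleanly at the first slash. A hypothetical helper (not part of the AgentOS API):

```typescript
// Hypothetical helper: split an OpenRouter model id ('vendor/model') at the
// first slash. Everything after the first slash is kept as the model name.
function parseOpenRouterModel(id: string): { vendor: string; model: string } {
  const slash = id.indexOf('/');
  return { vendor: id.slice(0, slash), model: id.slice(slash + 1) };
}
```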

Popular OpenRouter models:

  • openai/gpt-4o
  • anthropic/claude-sonnet-4-20250514
  • google/gemini-2.5-flash
  • meta-llama/llama-3.3-70b-instruct

Ollama

export OLLAMA_BASE_URL=http://localhost:11434

Run any open model locally. No API key, no cost, full privacy.

# Auto-detect hardware, pull recommended models
wunderland ollama-setup

# Or pull models manually
ollama pull llama3.2
ollama pull codellama
ollama pull dolphin-mixtral

| Model | Parameters | Context | Tool Calling | Notes |
| --- | --- | --- | --- | --- |
| llama3.2 | 3B/8B | 128K | Partial | General-purpose |
| codellama | 7B/13B/34B | 16K | No | Code-optimized |
| dolphin-mixtral | 8x7B | 32K | No | Uncensored |
| mistral | 7B | 32K | Partial | Fast |
| phi3 | 3.8B | 128K | No | Small, fast |

Programmatic Configuration

LLMProviderConfig

import { createAgent, type LLMProviderConfig } from '@framers/agentos';

const providerConfig: LLMProviderConfig = {
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: 'claude-sonnet-4-20250514',
  baseUrl: undefined, // Custom base URL (optional)
  extraHeaders: { // Additional headers (optional)
    'X-Custom-Header': 'value',
  },
};

const agent = await createAgent({
  llmProvider: 'anthropic',
  llmProviderConfig: providerConfig,
});

Switching Providers at Runtime

import { createAgent } from '@framers/agentos';

const agent = await createAgent({
  llmProvider: 'openai',
});

// Switch to Anthropic for a specific request
const response = await agent.chat('Complex analysis task', {
  provider: 'anthropic',
  model: 'claude-opus-4-20250514',
});

Adding a Custom Provider

Implement the ILLMProvider interface to add a custom LLM provider:

import { registerLLMProvider, type ILLMProvider } from '@framers/agentos';

const myProvider: ILLMProvider = {
  id: 'my-provider',
  name: 'My Custom LLM',

  models: [
    { id: 'my-model-v1', contextWindow: 32768, supportsTool: true, supportsVision: false },
  ],

  async chat(messages, options) {
    const response = await fetch('https://my-llm.com/v1/chat', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.MY_LLM_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ messages, model: options.model }),
    });
    const data = await response.json();
    return {
      content: data.choices[0].message.content,
      usage: { promptTokens: data.usage.prompt_tokens, completionTokens: data.usage.completion_tokens },
    };
  },

  async *stream(messages, options) {
    // Implement SSE streaming
    // yield { type: 'text_delta', content: '...' }
  },
};

registerLLMProvider(myProvider);

Provider Capabilities Detail

Tool Calling Support

| Provider | Parallel Tools | Structured Output | Tool Choice | Notes |
| --- | --- | --- | --- | --- |
| OpenAI | Yes | Yes (strict mode) | auto/none/required/specific | Gold standard |
| Anthropic | Yes | Yes | auto/any/specific | Strong tool use |
| Gemini | Yes | Yes | auto/none/any | Good support |
| Groq | Yes | Partial | auto/none | Fast but basic |
| Together | Yes | No | auto/none | Model-dependent |
| Mistral | Yes | No | auto/none/any | Good support |
| xAI | Yes | No | auto/none | Basic tool use |
| OpenRouter | Model-dependent | Model-dependent | Model-dependent | Pass-through |
| Ollama | Partial | No | auto/none | Model-dependent |
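Because each provider speaks a different tool-choice vocabulary, a unified interface has to translate. A sketch of mapping a unified "force a tool call" setting onto per-provider values, following the table above (this is illustrative, not the AgentOS implementation):

```typescript
// Sketch: translate a unified "force a tool call" setting into per-provider
// vocabulary, per the tool-choice column above. OpenAI accepts 'required';
// Anthropic, Gemini, and Mistral use 'any'; providers limited to auto/none
// can only fall back to 'auto'. Illustrative, not the AgentOS implementation.
function mapRequiredToolChoice(provider: string): string {
  switch (provider) {
    case 'openai':
      return 'required';
    case 'anthropic':
    case 'gemini':
    case 'mistral':
      return 'any';
    default:
      return 'auto'; // groq, together, xai, ollama: no forced-call option
  }
}
```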

Embedding Support

| Provider | Models | Dimensions | Batch Size |
| --- | --- | --- | --- |
| OpenAI | text-embedding-3-small, text-embedding-3-large | 256–3072 | 2048 |
| Gemini | text-embedding-004 | 768 | 2048 |
| Together | togethercomputer/m2-bert-80M-* | 768 | 512 |
| Mistral | mistral-embed | 1024 | 512 |
| Ollama | nomic-embed-text, mxbai-embed-large | 768–1024 | 512 |
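When embedding large corpora, inputs must be chunked to respect the batch-size limit in the table above (e.g. 2048 for OpenAI, 512 for Mistral). A small self-contained sketch of that chunking:

```typescript
// Sketch: chunk an input array into batches no larger than a provider's
// batch-size limit, so each batch fits in one embedding request.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```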