Interface: PerformOCROptions

Defined in: packages/agentos/src/api/performOCR.ts:49

Options accepted by performOCR.

Properties

apiKey?

optional apiKey: string

Defined in: packages/agentos/src/api/performOCR.ts:100

API key for the cloud provider. When omitted, the key is read from the provider's standard environment variable.


confidenceThreshold?

optional confidenceThreshold: number

Defined in: packages/agentos/src/api/performOCR.ts:81

Minimum confidence threshold (0-1) to accept an OCR result from a local tier without escalating to the next tier.

Only meaningful for the 'progressive' strategy.

Default

0.7
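
The acceptance rule described above can be sketched as a single comparison. This is a minimal illustration, not the library's implementation: the `TierResult` shape and the `shouldEscalate` helper are hypothetical names introduced here, and only the 0.7 default comes from this page.

```typescript
// Hypothetical shape of one tier's output; not part of the documented API.
interface TierResult {
  text: string;
  confidence: number; // expected to lie in the range 0-1
}

// Documented default for confidenceThreshold.
const DEFAULT_CONFIDENCE_THRESHOLD = 0.7;

// Returns true when a local result falls below the threshold and the next
// tier should be tried. A result exactly at the threshold is accepted.
function shouldEscalate(
  result: TierResult,
  confidenceThreshold: number = DEFAULT_CONFIDENCE_THRESHOLD,
): boolean {
  return result.confidence < confidenceThreshold;
}
```

Raising the threshold makes escalation more likely (better quality, higher cost); lowering it keeps more results local.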

image

image: string | Buffer

Defined in: packages/agentos/src/api/performOCR.ts:58

Image source. Accepts any of:

  • File path — absolute or relative filesystem path (e.g. /tmp/scan.png).
  • URL — HTTP(S) URL to fetch the image from.
  • Base64 string — raw base64-encoded image data (with or without a data:image/...;base64, prefix).
  • Buffer — in-memory image bytes.
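
Because three of the four accepted forms are strings, a caller may want to check how a given value is likely to be interpreted. The sketch below is one plausible heuristic and an assumption throughout: this page does not say how `performOCR` tells the forms apart, and `classifyImageSource` is a hypothetical helper. `Uint8Array` stands in for `Buffer` (Node's `Buffer` is a `Uint8Array` subclass) so the block has no Node type dependency.

```typescript
type ImageSourceKind = "buffer" | "url" | "data-uri" | "base64" | "path";

// Hypothetical classifier; the real detection logic is not documented here.
function classifyImageSource(image: string | Uint8Array): ImageSourceKind {
  if (typeof image !== "string") return "buffer";
  if (/^https?:\/\//i.test(image)) return "url";
  if (/^data:image\/[a-z0-9.+-]+;base64,/i.test(image)) return "data-uri";
  // Heuristic: a long string made only of base64 characters, with a length
  // that is a multiple of 4, is treated as raw base64. An unusual path could
  // be misread by this rule.
  const looksLikeBase64 =
    image.length >= 64 &&
    image.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(image);
  return looksLikeBase64 ? "base64" : "path";
}
```

When the source is ambiguous, passing a `Buffer` or a string with the explicit `data:image/...;base64,` prefix avoids relying on any heuristic.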

model?

optional model: string

Defined in: packages/agentos/src/api/performOCR.ts:94

Cloud LLM model override. When omitted, the provider's default vision model is used (e.g. gpt-4o for OpenAI).


provider?

optional provider: string

Defined in: packages/agentos/src/api/performOCR.ts:88

Cloud LLM provider for tier-3 fallback (e.g. 'openai', 'anthropic', 'google'). When omitted, the provider is auto-detected from environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
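
The interplay between `provider`, `apiKey`, and the environment can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `resolveCloudCredentials` is a hypothetical helper, the lookup table lists only the two variable names this page spells out, and the precedence between providers is a guess. The environment is passed in as a plain object so the block does not depend on Node typings.

```typescript
type Env = Record<string, string | undefined>;

// Only the two variable names named in the documentation. The full list and
// the order in which providers are tried are assumptions of this sketch.
const PROVIDER_ENV_VARS: ReadonlyArray<readonly [string, string]> = [
  ["openai", "OPENAI_API_KEY"],
  ["anthropic", "ANTHROPIC_API_KEY"],
];

interface CloudCredentials {
  provider: string;
  apiKey: string;
}

// Hypothetical helper: explicit options win, the environment fills the gaps.
function resolveCloudCredentials(
  options: { provider?: string; apiKey?: string },
  env: Env,
): CloudCredentials | undefined {
  // Both values given explicitly: nothing to look up.
  if (options.provider && options.apiKey) {
    return { provider: options.provider, apiKey: options.apiKey };
  }
  for (const [provider, envVar] of PROVIDER_ENV_VARS) {
    // Skip table entries that do not match an explicitly requested provider.
    if (options.provider && options.provider !== provider) continue;
    const apiKey = options.apiKey ?? env[envVar];
    if (apiKey) return { provider, apiKey };
  }
  return undefined; // no usable cloud credentials found
}
```

In Node, `process.env` could be passed as the second argument. Setting `provider` explicitly is the safer choice when several keys are present in the environment, since the auto-detection order is not documented on this page.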


strategy?

optional strategy: "progressive" | "local-only" | "cloud-only"

Defined in: packages/agentos/src/api/performOCR.ts:71

Vision strategy controlling which tiers are used.

  • 'progressive' — start local, escalate to cloud only when confidence is below confidenceThreshold. Best cost/quality balance.
  • 'local-only' — never call cloud APIs. For air-gapped / privacy use.
  • 'cloud-only' — skip local processing, send straight to a cloud LLM. Highest quality but highest cost.

Default

'progressive'
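
To show how the properties combine, here is a sketch with one options object per strategy. The interface is mirrored locally from the properties listed on this page so the block type-checks on its own; in real code it would be imported from the package, whose import path is not shown here. `withDefaults` is a hypothetical helper that applies the two documented defaults, and `Uint8Array` stands in for `Buffer`.

```typescript
// Local mirror of the documented interface (assumption: kept in sync by hand).
interface PerformOCROptions {
  image: string | Uint8Array; // documented as string | Buffer
  strategy?: "progressive" | "local-only" | "cloud-only";
  confidenceThreshold?: number;
  provider?: string;
  model?: string;
  apiKey?: string;
}

// Hypothetical helper applying the documented defaults.
function withDefaults(options: PerformOCROptions) {
  return {
    ...options,
    strategy: options.strategy ?? "progressive",
    confidenceThreshold: options.confidenceThreshold ?? 0.7,
  };
}

// Progressive (the default), with a stricter threshold than 0.7.
const progressive: PerformOCROptions = {
  image: "/tmp/scan.png",
  confidenceThreshold: 0.85,
};

// Never leaves the machine; cloud-related options are not needed.
const airGapped: PerformOCROptions = {
  image: "/tmp/scan.png",
  strategy: "local-only",
};

// Straight to a cloud model; apiKey is omitted so it comes from the environment.
const cloud: PerformOCROptions = {
  image: "https://example.com/scan.png",
  strategy: "cloud-only",
  provider: "openai",
  model: "gpt-4o",
};
```

Note that `confidenceThreshold` has no effect on the `airGapped` and `cloud` objects, since it is only meaningful for the 'progressive' strategy.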