
Class: PipelineVisionProvider

Defined in: packages/agentos/src/core/vision/providers/PipelineVisionProvider.ts:70

Adapts the full VisionPipeline to the narrow IVisionProvider interface used by the multimodal indexer.

The pipeline's process() method runs all configured tiers and returns a rich VisionResult. This adapter extracts just the text field that the indexer needs for embedding generation.

For callers that need the full pipeline result (embeddings, layout, confidence, regions), use processWithFullResult() instead.

Example

const provider = new PipelineVisionProvider(pipeline);

// Simple: just the description text
const text = await provider.describeImage(imageUrl);

// Advanced: full pipeline result
const result = await provider.processWithFullResult(imageBuffer);
console.log(result.embedding); // CLIP vector
console.log(result.layout); // Florence-2 layout
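
The adapter relationship described above can be sketched with stand-in types. This is a minimal illustration, not the library's source: every name ending in `Sketch`, the simplified result shape, and the stub pipeline are assumptions introduced here so the snippet runs without any models.

```typescript
// Hypothetical, simplified shapes; the real AgentOS types carry more fields.
interface VisionResultSketch {
  text: string;
  confidence?: number;
  embedding?: number[];
}

interface PipelineSketch {
  process(image: string | Uint8Array): Promise<VisionResultSketch>;
}

interface VisionProviderSketch {
  describeImage(image: string): Promise<string>;
}

// Adapter: exposes the narrow provider interface on top of the rich pipeline.
class PipelineProviderSketch implements VisionProviderSketch {
  private readonly pipeline: PipelineSketch;

  constructor(pipeline: PipelineSketch) {
    this.pipeline = pipeline;
  }

  // Narrow path: run the pipeline and keep only the text field.
  async describeImage(image: string): Promise<string> {
    const result = await this.pipeline.process(image);
    return result.text;
  }

  // Rich path: return the whole result untouched.
  processWithFullResult(image: string | Uint8Array): Promise<VisionResultSketch> {
    return this.pipeline.process(image);
  }
}

// Stub pipeline (assumption) that always returns the same result.
const stubPipeline: PipelineSketch = {
  async process() {
    return { text: 'a red bicycle', confidence: 0.9, embedding: [0.1, 0.2] };
  },
};

const sketchProvider = new PipelineProviderSketch(stubPipeline);
```

Both methods delegate to the same `process()` call; the only difference is how much of the result the caller gets back.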

Implements

IVisionProvider

Constructors

Constructor

new PipelineVisionProvider(pipeline): PipelineVisionProvider

Defined in: packages/agentos/src/core/vision/providers/PipelineVisionProvider.ts:93

Create a new pipeline vision provider.

Parameters

pipeline

VisionPipeline

An initialized VisionPipeline instance. The caller retains ownership and is responsible for calling pipeline.dispose() when done.

Returns

PipelineVisionProvider

Throws

If pipeline is null or undefined.

Example

const pipeline = new VisionPipeline({ strategy: 'progressive' });
const provider = new PipelineVisionProvider(pipeline);
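
The ownership rule and the null check described above might look like the following in practice. This is a sketch under stated assumptions: `DisposablePipelineSketch`, `GuardedProviderSketch`, and `withOwnedPipeline` are hypothetical names, the error message is invented, and `dispose()` is modeled as synchronous, which the real pipeline may not be.

```typescript
// Hypothetical stand-in: only models the dispose() lifecycle mentioned above.
class DisposablePipelineSketch {
  disposed = false;

  dispose(): void {
    this.disposed = true;
  }
}

class GuardedProviderSketch {
  private readonly pipeline: DisposablePipelineSketch;

  constructor(pipeline: DisposablePipelineSketch) {
    // Loose equality catches both null and undefined.
    if (pipeline == null) {
      throw new Error('provider sketch: pipeline is required');
    }
    this.pipeline = pipeline;
  }

  getPipeline(): DisposablePipelineSketch {
    return this.pipeline;
  }
}

// The caller creates the pipeline, lends it to the provider, and disposes it.
async function withOwnedPipeline<T>(
  work: (provider: GuardedProviderSketch) => Promise<T>,
): Promise<{ value: T; pipeline: DisposablePipelineSketch }> {
  const pipeline = new DisposablePipelineSketch();
  try {
    const value = await work(new GuardedProviderSketch(pipeline));
    return { value, pipeline };
  } finally {
    pipeline.dispose(); // runs whether work() resolved or threw
  }
}
```

Putting the `dispose()` call in `finally` keeps cleanup with the code that created the pipeline, so the provider never has to know when its lifetime ends.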

Methods

describeImage()

describeImage(image): Promise<string>

Defined in: packages/agentos/src/core/vision/providers/PipelineVisionProvider.ts:122

Generate a text description of the provided image by running it through the full vision pipeline.

This satisfies the IVisionProvider contract. The image passes through all configured tiers (OCR, handwriting, document-ai, cloud) and the best extracted text is returned.

Parameters

image

string

Image as a URL string (https://... or data:image/...).

Returns

Promise<string>

Text description or extracted content from the image.

Throws

If all pipeline tiers fail to produce output.

Throws

If the pipeline has been disposed.

Example

const description = await provider.describeImage(imageUrl);
console.log(description);

Implementation of

IVisionProvider.describeImage
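
Because describeImage() rejects when every tier fails or the pipeline has been disposed, a batch indexer may prefer to record the failure and continue. The wrapper below is one possible approach, not part of the library: `describeOrFallback`, `DescribesImages`, and `DescribeOutcome` are hypothetical names, and the interface only mirrors the describeImage(image): Promise<string> signature documented above.

```typescript
// Mirrors the documented describeImage signature; the interface name is hypothetical.
interface DescribesImages {
  describeImage(image: string): Promise<string>;
}

interface DescribeOutcome {
  ok: boolean;
  text: string;
  reason?: string;
}

// Converts a rejection into a value so one bad image does not abort a batch.
async function describeOrFallback(
  provider: DescribesImages,
  image: string,
  fallback = '',
): Promise<DescribeOutcome> {
  try {
    return { ok: true, text: await provider.describeImage(image) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, text: fallback, reason };
  }
}
```

Callers can then check `ok` to decide whether to index the text or log `reason` and skip the image.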


getPipeline()

getPipeline(): VisionPipeline

Defined in: packages/agentos/src/core/vision/providers/PipelineVisionProvider.ts:174

Get a reference to the underlying pipeline for direct access.

Useful when the caller needs to invoke pipeline-specific methods like extractText(), embed(), or analyzeLayout() that aren't exposed through the IVisionProvider interface.

Returns

VisionPipeline

The underlying VisionPipeline instance.

Example

const layout = await provider.getPipeline().analyzeLayout(image);

processWithFullResult()

processWithFullResult(image): Promise<VisionResult>

Defined in: packages/agentos/src/core/vision/providers/PipelineVisionProvider.ts:156

Process an image through the full pipeline and return the complete VisionResult, including embeddings, layout, confidence scores, and per-tier breakdowns.

Use this when you need more than just the text description (e.g. to store the CLIP embedding alongside the text embedding in the vector store).

Parameters

image

string | Buffer

Image data as a Buffer or URL string.

Returns

Promise<VisionResult>

Full vision pipeline result.

Throws

If all pipeline tiers fail.

Throws

If the pipeline has been disposed.

Example

const result = await provider.processWithFullResult(imageBuffer);

// Use both text embedding (via indexer) and image embedding (via CLIP)
if (result.embedding) {
  await imageVectorStore.upsert('images', [{
    id: docId,
    embedding: result.embedding,
    metadata: { text: result.text },
  }]);
}
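
The embedding check in the example above can be pulled into a small pure function, which makes the "no embedding" path explicit and easy to test. This is a sketch only: `toImageRecord` and both interfaces are hypothetical, and the record shape simply copies the fields used in the upsert call above.

```typescript
// Hypothetical subset of the result: only the fields the example above reads.
interface ResultSketch {
  text: string;
  embedding?: number[];
}

// Record shape copied from the upsert call in the example.
interface ImageRecordSketch {
  id: string;
  embedding: number[];
  metadata: { text: string };
}

// Returns null when the pipeline produced no usable image embedding.
function toImageRecord(id: string, result: ResultSketch): ImageRecordSketch | null {
  if (!result.embedding || result.embedding.length === 0) {
    return null;
  }
  return { id, embedding: result.embedding, metadata: { text: result.text } };
}
```

A caller would upsert only the non-null records and let the text-only path handle the rest.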