Interface: IngestionConfig
Defined in: packages/agentos/src/memory/facade/types.ts:134
Controls how documents are split into chunks before being stored and indexed.
Properties
chunkOverlap?
optional chunkOverlap: number
Defined in: packages/agentos/src/memory/facade/types.ts:156
Overlap between consecutive chunks in tokens/characters. Prevents context loss at chunk boundaries.
Default
64
chunkSize?
optional chunkSize: number
Defined in: packages/agentos/src/memory/facade/types.ts:149
Target token/character count for each chunk.
Default
512
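To see how chunkSize and chunkOverlap interact, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only, not the library's implementation: the function name chunkWithOverlap is hypothetical, and it operates on a generic array of units because the unit (tokens or characters) is left open above. Each chunk starts chunkSize - chunkOverlap units after the previous one, so with the documented defaults the stride would be 512 - 64 = 448.

```typescript
// Hypothetical helper for illustration; not part of the documented API.
function chunkWithOverlap<T>(items: T[], chunkSize = 512, chunkOverlap = 64): T[][] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be in the range [0, chunkSize)");
  }

  // Each new chunk begins this many units after the previous one.
  const stride = chunkSize - chunkOverlap;
  const chunks: T[][] = [];

  for (let start = 0; start < items.length; start += stride) {
    chunks.push(items.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the input.
    if (start + chunkSize >= items.length) break;
  }
  return chunks;
}

// Ten units, chunks of four, one unit shared between neighbours.
const units = Array.from({ length: 10 }, (_, i) => i);
const chunks = chunkWithOverlap(units, 4, 1);
// chunks: [0,1,2,3], [3,4,5,6], [6,7,8,9]
```

The last unit of each chunk reappears as the first unit of the next, which is the boundary context that chunkOverlap is meant to preserve.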
chunkStrategy?
optional chunkStrategy: "fixed" | "semantic" | "hierarchical" | "layout"
Defined in: packages/agentos/src/memory/facade/types.ts:143
Strategy for splitting a document into indexable chunks.
'fixed' – split at a fixed token/character count.
'semantic' – split at semantic boundaries (paragraphs, sections).
'hierarchical' – build a tree of coarse → fine chunks (good for Q&A).
'layout' – preserve the visual layout of the source (PDF columns etc.).
Default
'semantic'
doclingEnabled?
optional doclingEnabled: boolean
Defined in: packages/agentos/src/memory/facade/types.ts:177
Whether to use the Docling library for high-fidelity PDF/DOCX parsing.
When false, a simpler text-extraction path is used.
Default
false
extractImages?
optional extractImages: boolean
Defined in: packages/agentos/src/memory/facade/types.ts:163
Whether to extract embedded images from documents (PDF, DOCX, etc.).
Extracted images are stored as ExtractedImage objects.
Default
false
ocrEnabled?
optional ocrEnabled: boolean
Defined in: packages/agentos/src/memory/facade/types.ts:170
Whether to run Optical Character Recognition on extracted images.
Requires extractImages: true.
Default
false
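Because ocrEnabled depends on extractImages, it can help to check the pair before passing a config along. The sketch below is an assumption-laden illustration: findImageOptionConflicts and ImageOptions are hypothetical names, and this page does not say whether the library itself throws, warns, or silently ignores ocrEnabled when extractImages is off.

```typescript
// Hypothetical subset of the config; only the two fields relevant here.
interface ImageOptions {
  extractImages?: boolean;
  ocrEnabled?: boolean;
}

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function findImageOptionConflicts(options: ImageOptions): string[] {
  const problems: string[] = [];
  // OCR runs on extracted images, so it has nothing to work on without extraction.
  if (options.ocrEnabled === true && options.extractImages !== true) {
    problems.push("ocrEnabled requires extractImages: true");
  }
  return problems;
}

const conflicts = findImageOptionConflicts({ ocrEnabled: true });
const noConflicts = findImageOptionConflicts({ extractImages: true, ocrEnabled: true });
```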
visionLlm?
optional visionLlm: string
Defined in: packages/agentos/src/memory/facade/types.ts:184
Vision-capable LLM model identifier used to caption extracted images.
Only consulted when extractImages: true.
Example
'gpt-4o'
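Putting the properties together, the following sketch shows one way a caller might fill in the documented defaults around a partial config. The interface is redeclared locally so the snippet is self-contained; in a real project it would be imported from the package. DOCUMENTED_DEFAULTS and resolveConfig are hypothetical names, and how the library actually applies defaults is not described on this page.

```typescript
// Local mirror of the documented interface, redeclared for a self-contained example.
interface IngestionConfig {
  chunkStrategy?: "fixed" | "semantic" | "hierarchical" | "layout";
  chunkSize?: number;
  chunkOverlap?: number;
  extractImages?: boolean;
  ocrEnabled?: boolean;
  doclingEnabled?: boolean;
  visionLlm?: string;
}

// Defaults as listed on this page. visionLlm has no documented default, so it is omitted.
const DOCUMENTED_DEFAULTS: Required<Omit<IngestionConfig, "visionLlm">> = {
  chunkStrategy: "semantic",
  chunkSize: 512,
  chunkOverlap: 64,
  extractImages: false,
  ocrEnabled: false,
  doclingEnabled: false,
};

// Hypothetical helper: caller-supplied fields override the documented defaults.
function resolveConfig(config: IngestionConfig = {}): IngestionConfig {
  return { ...DOCUMENTED_DEFAULTS, ...config };
}

// Example: a PDF-oriented config that extracts images, runs OCR, and captions them.
const pdfConfig = resolveConfig({
  chunkStrategy: "layout",
  extractImages: true,
  ocrEnabled: true,
  visionLlm: "gpt-4o",
});
```

Here pdfConfig keeps the default chunkSize and chunkOverlap while overriding the strategy and the image-related options.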