Stable identifier for the document (chunk IDs will derive from this).
Raw text that will be chunked and embedded.
Optional data: Overrides which data source / collection this document is pushed into.
Optional source: Original source pointer (URL, file path, API, etc.).
Optional metadata: Arbitrary metadata stored alongside chunks; values must be vector-store friendly.
Optional language: ISO language tag for the content.
Optional timestamp: ISO timestamp describing when this content was produced/updated.
Optional embedding: Pre-computed embedding vector.
Optional embedding: Identifier of the embedding model used when embedding is supplied.
Represents raw document content provided for ingestion.
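
The property descriptions above could map onto a TypeScript shape along the following lines. This is only a sketch under stated assumptions: the names `RawDocumentSketch`, `MetadataValue`, `chunkId`, and `checkDocument`, as well as every field name, are hypothetical and not taken from the library. Reading "vector-store friendly" as string, number, or boolean values is an assumption, and so is the `#`-separated chunk ID format.

```typescript
// Hypothetical sketch: all names below are assumptions, not the library's actual API.

// Assumed reading of "vector-store friendly": flat primitive values only.
type MetadataValue = string | number | boolean;

interface RawDocumentSketch {
  id: string;                               // stable identifier; chunk IDs derive from it
  text: string;                             // raw text to be chunked and embedded
  dataSourceId?: string;                    // assumed name for the data source / collection override
  source?: string;                          // original source pointer (URL, file path, API, ...)
  metadata?: Record<string, MetadataValue>; // stored alongside chunks
  language?: string;                        // language tag for the content
  timestamp?: string;                       // when the content was produced or updated
  embedding?: number[];                     // pre-computed embedding vector
  embeddingModel?: string;                  // assumed name for the embedding model identifier
}

// Hypothetical derivation of a chunk ID from the document ID and a chunk index.
function chunkId(doc: RawDocumentSketch, index: number): string {
  return `${doc.id}#${index}`;
}

// Hypothetical sanity checks; returns a list of human-readable problems.
function checkDocument(doc: RawDocumentSketch): string[] {
  const problems: string[] = [];
  if (doc.id.trim() === "") {
    problems.push("id is empty");
  }
  if (doc.embedding !== undefined && doc.embeddingModel === undefined) {
    problems.push("embedding supplied without a model identifier");
  }
  if (doc.timestamp !== undefined && Number.isNaN(Date.parse(doc.timestamp))) {
    problems.push("timestamp is not parseable");
  }
  return problems;
}

const example: RawDocumentSketch = {
  id: "doc-001",
  text: "Raw text that will be chunked and embedded.",
  source: "https://example.com/guide",
  metadata: { section: "intro", revision: 3 },
  language: "en",
  timestamp: "2024-01-15T10:30:00Z",
};
```

Calling `checkDocument(example)` returns an empty list here, while a document that supplies `embedding` without a model identifier would be flagged. Whether the real ingestion path enforces that pairing is not stated in the reference above.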