Class: TopicExtractor

Defined in: packages/agentos/src/query-router/TopicExtractor.ts:58

Extracts a compact, deduplicated topic list from a set of corpus chunks.

Designed to feed into the QueryClassifier's system prompt so the LLM knows which documentation topics exist without receiving the full corpus.

Example

const extractor = new TopicExtractor();
const topics = extractor.extract(corpusChunks, { maxTopics: 30 });
const promptBlock = extractor.formatForPrompt(topics);
// "Authentication (docs/auth.md)\nDatabase (docs/database.md)\n..."

Constructors

Constructor

new TopicExtractor(): TopicExtractor

Returns

TopicExtractor

Methods

extract()

extract(chunks, options?): TopicEntry[]

Defined in: packages/agentos/src/query-router/TopicExtractor.ts:70

Extract a deduplicated, sorted, and capped topic list from corpus chunks.

Deduplication key: heading::sourcePath. Two chunks with the same heading from the same source file are collapsed into a single entry.
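The deduplication rule above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `ChunkLike` shape and the `dedupeByHeadingAndSource` helper are hypothetical, and the field names `heading` and `sourcePath` are assumed from the key format. Keeping the first occurrence is also an assumption.

```typescript
// Hypothetical stand-in for CorpusChunk; the real type is not shown on this page.
interface ChunkLike {
  heading: string;
  sourcePath: string;
}

// Collapse chunks that share both heading and source file.
// Assumption: the first chunk seen for a given key is the one kept.
function dedupeByHeadingAndSource(chunks: ChunkLike[]): ChunkLike[] {
  const seen = new Map<string, ChunkLike>();
  for (const chunk of chunks) {
    const key = `${chunk.heading}::${chunk.sourcePath}`;
    if (!seen.has(key)) {
      seen.set(key, chunk);
    }
  }
  return Array.from(seen.values());
}
```

Note that the same heading in two different files yields two entries, because the source path is part of the key.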

Parameters

chunks

CorpusChunk[]

Corpus chunks to scan for topics.

options?

TopicExtractorOptions

Optional extraction parameters, such as maxTopics (the cap on the number of returned entries).

Returns

TopicEntry[]

Alphabetically sorted array of unique TopicEntry items, limited to maxTopics entries.
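The sort-then-cap behavior described in the return value might look like the sketch below. The `SortableTopic` shape, its `name` and `source` fields, and the `sortAndCap` helper are hypothetical. Sorting by name with `localeCompare` is an assumption, since this page does not say which field or comparison the real method uses.

```typescript
// Hypothetical stand-in for TopicEntry; field names are assumed for illustration.
interface SortableTopic {
  name: string;
  source: string;
}

// Sort alphabetically by name, then keep at most maxTopics entries.
// The input array is copied first so the caller's array is not mutated.
function sortAndCap(topics: SortableTopic[], maxTopics: number): SortableTopic[] {
  return topics
    .slice()
    .sort((a, b) => a.name.localeCompare(b.name))
    .slice(0, maxTopics);
}
```

Capping after sorting means that when the limit is reached, entries late in the alphabet are dropped first.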


formatForPrompt()

formatForPrompt(topics): string

Defined in: packages/agentos/src/query-router/TopicExtractor.ts:112

Format a topic list into a compact multi-line string suitable for injection into a classifier system prompt.

Each line follows the pattern: TopicName (source/path.md)

Parameters

topics

TopicEntry[]

Array of topic entries to format.

Returns

string

Newline-separated string with one topic per line.
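A formatter that produces the TopicName (source/path.md) pattern could be sketched as below. The `FormattableTopic` shape and the `formatTopics` helper are hypothetical names chosen for illustration; only the output pattern comes from this page.

```typescript
// Hypothetical stand-in for TopicEntry; field names are assumed for illustration.
interface FormattableTopic {
  name: string;
  source: string;
}

// Render one "Name (path)" line per topic and join the lines with newlines.
function formatTopics(topics: FormattableTopic[]): string {
  return topics.map((topic) => `${topic.name} (${topic.source})`).join("\n");
}
```

An empty topic list yields an empty string under this sketch, so a caller may want to skip the prompt section entirely in that case.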