Class: Evaluator
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:206
Agent evaluation framework implementation.
Constructors
Constructor
new Evaluator(): Evaluator
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:210
Returns
Evaluator
Methods
compareRuns()
compareRuns(runId1, runId2): Promise<EvalComparison>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:392
Compares two evaluation runs.
Parameters
runId1
string
First run ID
runId2
string
Second run ID
Returns
Promise<EvalComparison>
Comparison results
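A minimal sketch of calling compareRuns() defensively, using only the signatures documented on this page. RunComparer is a hypothetical structural type standing in for Evaluator, and the run and comparison types are left generic because their fields are not shown here. Since getRun() can resolve to undefined (presumably for an unknown ID), the helper checks both IDs first so that a mistyped ID produces a clear error.

```typescript
// Hypothetical structural stand-in for the two Evaluator methods used below.
interface RunComparer<TRun, TComparison> {
  getRun(runId: string): Promise<TRun | undefined>;
  compareRuns(runId1: string, runId2: string): Promise<TComparison>;
}

// Compare two runs, failing early with a clear message if either ID is unknown.
async function compareExistingRuns<TRun, TComparison>(
  evaluator: RunComparer<TRun, TComparison>,
  runId1: string,
  runId2: string,
): Promise<TComparison> {
  const [first, second] = await Promise.all([
    evaluator.getRun(runId1),
    evaluator.getRun(runId2),
  ]);
  if (first === undefined) throw new Error(`Unknown run: ${runId1}`);
  if (second === undefined) throw new Error(`Unknown run: ${runId2}`);
  return evaluator.compareRuns(runId1, runId2);
}
```

An Evaluator instance should satisfy RunComparer structurally, so it could be passed as the first argument without an adapter.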
evaluateTestCase()
evaluateTestCase(testCase, actualOutput, config?): Promise<EvalTestResult>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:306
Evaluates a single test case.
Parameters
testCase
The test case
actualOutput
string
The agent's actual output
config?
Evaluation configuration
Returns
Promise<EvalTestResult>
Test result
generateReport()
generateReport(runId, format): Promise<string>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:433
Generates a report for a run.
Parameters
runId
string
Run ID
format
"json" | "markdown" | "html"
Report format
Returns
Promise<string>
Report content
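A short sketch of rendering one run in all three documented formats. ReportGenerator is a hypothetical structural stand-in for Evaluator, and the "<runId>.<ext>" naming scheme is an illustration only, not something the library prescribes.

```typescript
type ReportFormat = "json" | "markdown" | "html";

// Hypothetical structural stand-in for the one Evaluator method used below.
interface ReportGenerator {
  generateReport(runId: string, format: ReportFormat): Promise<string>;
}

// Conventional file extension for each documented format.
const EXTENSIONS: Record<ReportFormat, string> = {
  json: "json",
  markdown: "md",
  html: "html",
};

// Illustrative naming scheme: "<runId>.<ext>".
function reportFileName(runId: string, format: ReportFormat): string {
  return `${runId}.${EXTENSIONS[format]}`;
}

// Generate the report in every format and pair each result with a file name.
async function renderAllFormats(
  evaluator: ReportGenerator,
  runId: string,
): Promise<Array<{ fileName: string; content: string }>> {
  const formats: ReportFormat[] = ["json", "markdown", "html"];
  return Promise.all(
    formats.map(async (format) => ({
      fileName: reportFileName(runId, format),
      content: await evaluator.generateReport(runId, format),
    })),
  );
}
```

Because generateReport() resolves to a string, the caller decides where the content goes (a file, an HTTP response, a CI artifact).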
getRun()
getRun(runId): Promise<EvalRun | undefined>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:382
Gets an evaluation run by ID.
Parameters
runId
string
Run ID
Returns
Promise<EvalRun | undefined>
The evaluation run, or undefined if no run matches the ID
listRuns()
listRuns(limit?): Promise<EvalRun[]>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:386
Lists recent evaluation runs.
Parameters
limit?
number = 50
Maximum runs to return
Returns
Promise<EvalRun[]>
Array of runs
registerScorer()
registerScorer(name, fn): void
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:378
Registers a custom scorer.
Parameters
name
string
Scorer name
fn
Scoring function
Returns
void
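A sketch of what a custom scorer might look like. This page describes fn only as a "Scoring function", so the ScorerFn shape below is an assumption modelled on the parameters of score(); ScorerRegistry, keywordCoverage, and the "keyword_coverage" name are likewise hypothetical.

```typescript
// Assumed scorer shape: the page only says "Scoring function".
type ScorerFn = (
  actual: string,
  expected?: string,
  references?: string[],
) => number | Promise<number>;

// Hypothetical structural stand-in for the one Evaluator method used below.
interface ScorerRegistry {
  registerScorer(name: string, fn: ScorerFn): void;
}

// Fraction of whitespace-separated expected tokens found in the actual output,
// case-insensitively. Returns 0 when there is nothing to look for.
function keywordCoverage(actual: string, expected?: string): number {
  const keywords = (expected ?? "").toLowerCase().split(/\s+/).filter(Boolean);
  if (keywords.length === 0) return 0;
  const haystack = actual.toLowerCase();
  const hits = keywords.filter((k) => haystack.includes(k)).length;
  return hits / keywords.length;
}

// Register the scorer under a name that later score() calls could refer to.
function installCustomScorers(evaluator: ScorerRegistry): void {
  evaluator.registerScorer("keyword_coverage", keywordCoverage);
}
```

The scorer returns a value between 0 and 1 to match the range documented for score().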
runEvaluation()
runEvaluation(name, testCases, agentFn, config?): Promise<EvalRun>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:220
Runs an evaluation suite against an agent.
Parameters
name
string
Name for this evaluation run
testCases
Test cases to evaluate
agentFn
(input, context?) => Promise<string>
Function that takes input and returns agent output
config?
Evaluation configuration
Returns
Promise<EvalRun>
The completed evaluation run
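A minimal sketch of preparing an agentFn for runEvaluation(). The page gives the callback only as (input, context?) => Promise<string>, so the string parameter types are assumptions, as is treating testCases as an array. EvaluationRunner is a hypothetical structural stand-in for Evaluator. Whether Evaluator itself catches errors thrown by the agent is not stated here, so the wrapper below is a defensive choice rather than a requirement.

```typescript
// Assumed parameter types: the page gives only (input, context?) => Promise<string>.
type AgentFn = (input: string, context?: string) => Promise<string>;

// Hypothetical structural stand-in for Evaluator.runEvaluation. Test-case, config,
// and run types stay generic because their fields are not shown on this page.
interface EvaluationRunner<TCase, TConfig, TRun> {
  runEvaluation(
    name: string,
    testCases: TCase[],
    agentFn: AgentFn,
    config?: TConfig,
  ): Promise<TRun>;
}

// Wrap an agent so a thrown error becomes a string output instead of a rejection.
function withErrorCapture(agent: AgentFn): AgentFn {
  return async (input, context) => {
    try {
      return await agent(input, context);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      return `[agent error] ${reason}`;
    }
  };
}

// Run a named suite with the wrapped agent and the default configuration.
async function runSuite<TCase, TConfig, TRun>(
  evaluator: EvaluationRunner<TCase, TConfig, TRun>,
  name: string,
  testCases: TCase[],
  agent: AgentFn,
): Promise<TRun> {
  return evaluator.runEvaluation(name, testCases, withErrorCapture(agent));
}
```

With this wrapper, a failing call yields an output string that the scorers can grade, which keeps one bad test case from ending the whole run.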
score()
score(scorer, actual, expected?, references?): Promise<number>
Defined in: packages/agentos/src/core/evaluation/Evaluator.ts:365
Scores output using a specific scorer.
Parameters
scorer
string
Scorer name
actual
string
Actual output
expected?
string
Expected output
references?
string[]
Reference outputs
Returns
Promise<number>
Score (0-1)
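A sketch of combining several scorers for one output, based on the score() signature above. Scoring is a hypothetical structural stand-in for Evaluator, and the scorer names are supplied by the caller because this page does not list the built-in ones.

```typescript
// Hypothetical structural stand-in for Evaluator.score.
interface Scoring {
  score(
    scorer: string,
    actual: string,
    expected?: string,
    references?: string[],
  ): Promise<number>;
}

// Run each named scorer on one output and return the individual scores plus their mean.
async function scoreWithAll(
  evaluator: Scoring,
  scorers: string[],
  actual: string,
  expected?: string,
): Promise<{ scores: Record<string, number>; mean: number }> {
  if (scorers.length === 0) throw new Error("At least one scorer name is required");
  const scores: Record<string, number> = {};
  let total = 0;
  for (const name of scorers) {
    const value = await evaluator.score(name, actual, expected);
    scores[name] = value;
    total += value;
  }
  return { scores, mean: total / scorers.length };
}
```

Since every score is documented as lying between 0 and 1, the unweighted mean stays in the same range.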