Interface EvalTestResult

Result of a single test case evaluation.

interface EvalTestResult {
    testCaseId: string;
    testCaseName: string;
    passed: boolean;
    score: number;
    metrics: MetricValue[];
    actualOutput: string;
    expectedOutput?: string;
    latencyMs: number;
    tokenUsage?: {
        promptTokens: number;
        completionTokens: number;
        totalTokens: number;
    };
    costUsd?: number;
    error?: string;
    timestamp: string;
}
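
As a rough illustration of the shape declared above, the sketch below builds one result object. `MetricValue` is not documented in this section, so the two-field version here is a hypothetical stand-in. The sample values and the ISO 8601 timestamp format are also assumptions.

```typescript
// Hypothetical stand-in: the real MetricValue type is defined elsewhere.
interface MetricValue {
    name: string;
    value: number;
}

// Copied from the declaration above so the sketch compiles on its own.
interface EvalTestResult {
    testCaseId: string;
    testCaseName: string;
    passed: boolean;
    score: number;
    metrics: MetricValue[];
    actualOutput: string;
    expectedOutput?: string;
    latencyMs: number;
    tokenUsage?: {
        promptTokens: number;
        completionTokens: number;
        totalTokens: number;
    };
    costUsd?: number;
    error?: string;
    timestamp: string;
}

// Illustrative values only; costUsd and error are omitted because they are optional.
const sample: EvalTestResult = {
    testCaseId: "tc-001",
    testCaseName: "greets the user",
    passed: true,
    score: 0.9,
    metrics: [{ name: "exactMatch", value: 1 }],
    actualOutput: "Hello, Ada!",
    expectedOutput: "Hello, Ada!",
    latencyMs: 420,
    tokenUsage: { promptTokens: 12, completionTokens: 5, totalTokens: 17 },
    timestamp: new Date(0).toISOString(), // ISO 8601 is an assumption, not documented
};
```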

Properties

testCaseId: string

Identifier of the evaluated test case

testCaseName: string

Test case name

passed: boolean

Whether the test passed

score: number

Overall score (0-1)

metrics: MetricValue[]

Individual metric scores

actualOutput: string

Actual agent output

expectedOutput?: string

Expected output

latencyMs: number

Latency in milliseconds

tokenUsage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
}

Token usage

Type declaration

  • promptTokens: number
  • completionTokens: number
  • totalTokens: number

costUsd?: number

Estimated cost in USD
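
Because `tokenUsage` and `costUsd` are both optional, code that totals them has to decide what an absent value means. The sketch below treats a missing field as zero, which is one possible choice rather than documented behaviour. `UsageFields` and `sumUsage` are hypothetical names introduced for illustration.

```typescript
// Only the fields this sketch reads; EvalTestResult objects satisfy this shape.
type UsageFields = {
    tokenUsage?: {
        promptTokens: number;
        completionTokens: number;
        totalTokens: number;
    };
    costUsd?: number;
};

// Hypothetical helper: sums usage, counting absent optional fields as zero.
function sumUsage(results: UsageFields[]): { totalTokens: number; costUsd: number } {
    let totalTokens = 0;
    let costUsd = 0;
    for (const r of results) {
        totalTokens += r.tokenUsage?.totalTokens ?? 0;
        costUsd += r.costUsd ?? 0;
    }
    return { totalTokens, costUsd };
}

const usage = sumUsage([
    { tokenUsage: { promptTokens: 10, completionTokens: 5, totalTokens: 15 }, costUsd: 0.25 },
    { costUsd: 0.5 }, // no token usage reported
    {},               // neither field reported
]);
```

If a missing value should be distinguishable from zero, the helper could instead return `undefined` when no result reported the field.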

error?: string

Error message, if any

timestamp: string

Timestamp of this result
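
A collection of results is typically reduced to a few summary figures. The sketch below shows one way to do that using only properties listed on this page. `SummaryFields` and `summarize` are hypothetical names, and returning zero for an empty input is an assumption.

```typescript
// Only the fields this sketch reads; EvalTestResult objects satisfy this shape.
type SummaryFields = {
    testCaseId: string;
    passed: boolean;
    score: number;
    latencyMs: number;
    error?: string;
};

// Hypothetical helper: pass rate, mean score, mean latency, and IDs of errored cases.
function summarize(results: SummaryFields[]) {
    const total = results.length;
    const passedCount = results.filter((r) => r.passed).length;
    const scoreSum = results.reduce((sum, r) => sum + r.score, 0);
    const latencySum = results.reduce((sum, r) => sum + r.latencyMs, 0);
    return {
        total,
        // Guard against division by zero on an empty result set.
        passRate: total === 0 ? 0 : passedCount / total,
        meanScore: total === 0 ? 0 : scoreSum / total,
        meanLatencyMs: total === 0 ? 0 : latencySum / total,
        erroredIds: results.filter((r) => r.error !== undefined).map((r) => r.testCaseId),
    };
}

const summary = summarize([
    { testCaseId: "a", passed: true, score: 1, latencyMs: 100 },
    { testCaseId: "b", passed: false, score: 0.5, latencyMs: 300, error: "timeout" },
]);
```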