Methods
judge
- judge(input, actualOutput, expectedOutput?, criteria?): Promise<JudgmentResult>
Parameters
- input: string
- actualOutput: string
- Optional expectedOutput: string
- Optional criteria: JudgeCriteria[]
Returns Promise<JudgmentResult>
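A minimal sketch of calling `judge` with the signature listed above. The fields of `JudgeCriteria` and `JudgmentResult` are not shown in this reference, so the shapes below are assumptions for illustration only. `stubJudge` is a hypothetical stand-in that scores by exact match so the example runs without an LLM.

```typescript
// Hypothetical shapes: the reference names these types but does not list their fields.
interface JudgeCriteria {
  name: string;
  description: string;
}
interface JudgmentResult {
  score: number;
  reasoning: string;
}

// Signature as documented above.
interface Judge {
  judge(
    input: string,
    actualOutput: string,
    expectedOutput?: string,
    criteria?: JudgeCriteria[],
  ): Promise<JudgmentResult>;
}

// Stand-in implementation (not the real LLM judge): exact match scores 1, anything else 0.
const stubJudge: Judge = {
  async judge(_input, actualOutput, expectedOutput) {
    const match =
      expectedOutput !== undefined &&
      actualOutput.trim() === expectedOutput.trim();
    return {
      score: match ? 1 : 0,
      reasoning: match ? "matches expected output" : "no exact match",
    };
  },
};

// expectedOutput and criteria are optional, so both call forms are valid.
async function demoJudge(): Promise<JudgmentResult[]> {
  const withReference = await stubJudge.judge("What is 2 + 2?", "4", "4");
  const withoutReference = await stubJudge.judge("What is 2 + 2?", "4");
  return [withReference, withoutReference];
}
```

With a real judge, the second call form would presumably rely on the criteria alone, since no reference answer is supplied.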
compare
- compare(input, outputA, outputB, criteria?): Promise<{
winner: "A" | "B" | "tie";
scoreA: number;
scoreB: number;
reasoning: string;
}>
Parameters
- input: string
- outputA: string
- outputB: string
- Optional criteria: JudgeCriteria[]
Returns Promise<{
winner: "A" | "B" | "tie";
scoreA: number;
scoreB: number;
reasoning: string;
}>
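The following sketch shows one way to consume the result of `compare`. The result shape is taken from the signature above. `JudgeCriteria`'s fields, the `stubComparer` object, and the `describeWinner` helper are hypothetical and exist only to make the example runnable without an LLM.

```typescript
// Hypothetical shape: the reference does not list the fields of JudgeCriteria.
interface JudgeCriteria {
  name: string;
  description: string;
}

// Result shape as documented above.
interface ComparisonResult {
  winner: "A" | "B" | "tie";
  scoreA: number;
  scoreB: number;
  reasoning: string;
}

interface Comparer {
  compare(
    input: string,
    outputA: string,
    outputB: string,
    criteria?: JudgeCriteria[],
  ): Promise<ComparisonResult>;
}

// Stand-in implementation (not the real LLM judge): a non-empty answer scores 1, an empty one 0.
const stubComparer: Comparer = {
  async compare(_input, outputA, outputB) {
    const scoreA = outputA.trim().length > 0 ? 1 : 0;
    const scoreB = outputB.trim().length > 0 ? 1 : 0;
    const winner = scoreA > scoreB ? "A" : scoreB > scoreA ? "B" : "tie";
    return { winner, scoreA, scoreB, reasoning: "toy scoring: non-empty beats empty" };
  },
};

// Hypothetical helper: turn a comparison into a one-line summary.
function describeWinner(result: ComparisonResult): string {
  if (result.winner === "tie") {
    return `Tie (${result.scoreA} vs ${result.scoreB})`;
  }
  return `Output ${result.winner} preferred (${result.scoreA} vs ${result.scoreB})`;
}

async function demoCompare(): Promise<ComparisonResult> {
  return stubComparer.compare("What is the capital of France?", "Paris", "");
}
```

Because `winner` is a union of string literals, a `switch` or conditional over it can be checked exhaustively by the compiler.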
batchJudge
- batchJudge(evaluations, criteria?, concurrency?): Promise<JudgmentResult[]>
Parameters
- evaluations: {
input: string;
actualOutput: string;
expectedOutput?: string;
}[]
- Optional criteria: JudgeCriteria[]
- concurrency: number = 3
Returns Promise<JudgmentResult[]>
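The reference does not say how `batchJudge` applies its `concurrency` limit. As an illustration only, the sketch below shows one common way to cap in-flight evaluations with a small worker pool while keeping results in input order. `batchJudgeSketch`, `JudgeFn`, `stubJudgeOne`, and the `JudgmentResult` fields are all assumptions, not the library's implementation.

```typescript
// Evaluation shape as documented above.
interface Evaluation {
  input: string;
  actualOutput: string;
  expectedOutput?: string;
}

// Hypothetical shape: the reference does not list the fields of JudgmentResult.
interface JudgmentResult {
  score: number;
  reasoning: string;
}

type JudgeFn = (evaluation: Evaluation) => Promise<JudgmentResult>;

// Hypothetical sketch: run at most `concurrency` evaluations at a time.
async function batchJudgeSketch(
  evaluations: Evaluation[],
  judgeOne: JudgeFn,
  concurrency = 3,
): Promise<JudgmentResult[]> {
  const results: JudgmentResult[] = new Array(evaluations.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < evaluations.length) {
      const index = next++;
      results[index] = await judgeOne(evaluations[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, evaluations.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stand-in judge that records how many calls are in flight at once.
let inFlight = 0;
let peakInFlight = 0;
const stubJudgeOne: JudgeFn = async (evaluation) => {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await Promise.resolve(); // yield once so other workers get a chance to start
  inFlight--;
  const match =
    evaluation.expectedOutput !== undefined &&
    evaluation.actualOutput === evaluation.expectedOutput;
  return { score: match ? 1 : 0, reasoning: match ? "exact match" : "no exact match" };
};

const sampleEvaluations: Evaluation[] = [0, 1, 2, 3, 4].map((i) => ({
  input: `question ${i}`,
  actualOutput: `answer ${i}`,
  // Only even-numbered items carry a matching reference answer.
  expectedOutput: i % 2 === 0 ? `answer ${i}` : undefined,
}));
```

Writing each result to its own index keeps the output aligned with the input array even when evaluations finish out of order.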
LLM-based judge for semantic evaluation