Skip to main content

Interface: CreationVerdict

Defined in: packages/agentos/src/cognition/emergent/types.ts:247

Evaluation verdict produced by the LLM-as-judge after a tool is forged.

The judge runs the tool against its declared test cases and scores it across five evaluation dimensions. A tool is only registered when approved is true.

Properties

approved

approved: boolean

Defined in: packages/agentos/src/cognition/emergent/types.ts:252

Whether the judge approves the tool for registration at its initial tier. false means the forge request is rejected and no tool is registered.


bounded

bounded: number

Defined in: packages/agentos/src/cognition/emergent/types.ts:286

Bounded execution score in the range [0, 1]. Indicates whether the tool reliably completes within its declared resource limits (memory, time). Scores derived from sandbox telemetry.


confidence

confidence: number

Defined in: packages/agentos/src/cognition/emergent/types.ts:258

Overall confidence the judge has in its verdict, in the range [0, 1]. Low confidence may trigger a second judge pass or human review.


correctness

correctness: number

Defined in: packages/agentos/src/cognition/emergent/types.ts:272

Correctness score in the range [0, 1]. Measures how well the tool's outputs match the expected outputs in the declared test cases.


determinism

determinism: number

Defined in: packages/agentos/src/cognition/emergent/types.ts:279

Determinism score in the range [0, 1]. Gauges whether repeated invocations with identical inputs produce consistent outputs. Lower scores flag non-deterministic behaviour.


reasoning

reasoning: string

Defined in: packages/agentos/src/cognition/emergent/types.ts:292

Free-text explanation of the verdict, including any failure reasons, flagged patterns, or suggestions for improvement.


safety

safety: number

Defined in: packages/agentos/src/cognition/emergent/types.ts:265

Safety score in the range [0, 1]. Assesses whether the tool's implementation could cause unintended harm, data exfiltration, or resource exhaustion.