Custom Assertions
Custom assertions let you add reusable assertion types that go beyond built-in types. Define a TypeScript function, drop it in .agentv/assertions/, and reference it by name in your YAML eval files.
When to Use Each Approach
Section titled “When to Use Each Approach”AgentV provides two SDK functions for custom evaluation logic:
| Function | Best For | Discovery |
|---|---|---|
defineAssertion() | Reusable assertion types with pass/fail plus optional score | Convention-based (.agentv/assertions/) |
defineScriptGrader() | Command-backed scorer with full score and assertion-result control | Referenced via type: script + command: |
Use defineAssertion() when you want a named assertion type that can be referenced across eval files without specifying a command path. It uses a simplified result contract focused on pass and optional score.
Use defineScriptGrader() when the scoring component is a command-backed grader: it needs explicit score calculation, custom assertion-result arrays, workspace commands, or LLM calls through a grader target. See Script Graders for details.
Both functions handle stdin/stdout JSON parsing, snake_case-to-camelCase conversion, Zod validation, and error handling automatically.
Promptfoo Terminology
Section titled “Promptfoo Terminology”Promptfoo calls normal eval checks assertions. Its custom code paths use fixed assertion types such as javascript, python, ruby, and webhook, and its Node API exposes assertion-oriented helpers such as runAssertion() and runAssertions().
AgentV follows that framing for assert: entries and defineAssertion(). The AgentV extension is convention discovery: any file in .agentv/assertions/ becomes an assertion type name such as word-count or has-citation. Reserve custom grader or script grader wording for command-backed or LLM-backed scoring components, especially type: script entries built with defineScriptGrader().
Installation
Section titled “Installation”npm install @agentv/sdkConvention-Based Discovery
Section titled “Convention-Based Discovery”Place assertion files in .agentv/assertions/ anywhere in your project tree. AgentV walks up from the eval file’s directory to find the nearest .agentv/assertions/ folder.
The filename (without extension) becomes the assertion type name:
.agentv/assertions/min-words.ts --> type: min-words.agentv/assertions/sentiment.ts --> type: sentiment.agentv/assertions/has-citation.ts --> type: has-citationSupported file extensions: .ts, .js, .mts, .mjs.
Custom assertion types cannot override built-in types (contains, equals, is-json, etc.). If a filename matches a built-in, it is silently skipped.
Using in YAML
Section titled “Using in YAML”Reference the assertion by type name directly — no command: path needed:
assert: - type: min-words - type: contains value: "Hello"Pass/Fail Pattern
Section titled “Pass/Fail Pattern”The simplest pattern returns pass (boolean) and an optional assertions array:
import { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output }) => { const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length; const pass = wordCount >= 3; return { pass, assertions: [{ text: `Output has ${wordCount} words`, passed: pass }], };});When only pass is provided, the score defaults to 1 (pass) or 0 (fail).
Score Pattern
Section titled “Score Pattern”Return a score (0 to 1) for granular evaluation instead of binary pass/fail:
import { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output, traceSummary }) => { const hasContent = (output ?? '').length > 0 ? 0.5 : 0; const isEfficient = (traceSummary?.eventCount ?? 0) <= 5 ? 0.5 : 0; return { score: hasContent + isEfficient, assertions: [ { text: 'Has content', passed: hasContent > 0 }, { text: 'Efficient', passed: isEfficient > 0 }, ], };});If pass is omitted but score is provided, pass is derived as score >= 0.5. Scores are clamped to the [0, 1] range.
AssertionScore Contract
Section titled “AssertionScore Contract”The handler must return an AssertionScore object:
| Field | Type | Description |
|---|---|---|
pass | boolean | Explicit pass/fail. If omitted, derived from score (>= 0.5 = pass). |
score | number | Numeric score between 0 and 1. Defaults to 1 if pass=true, 0 if pass=false. |
assertions | Array<{ text: string, passed: boolean, evidence?: string }> | Per-aspect results. Each entry describes one check with its verdict and optional evidence. |
details | Record<string, unknown> | Optional structured data for domain-specific metrics. |
Context Available to Assertions
Section titled “Context Available to Assertions”The handler receives an AssertionContext with the same fields as a script grader:
| Field | Type | Description |
|---|---|---|
input | Message[] | Full resolved input messages |
output | string | null | Final answer / scored result only |
messages | Message[] | Transcript messages from the target execution |
expectedOutput | Message[] | Expected output messages |
criteria | string | Evaluation criteria from the test case |
trace | Trace | Full execution trace with messages, events, metrics, and provenance |
traceSummary | TraceSummary | Lightweight execution metrics summary |
The raw stdin payload uses snake_case keys such as expected_output, trace_summary, and workspace_path. defineAssertion() converts them to SDK camelCase fields such as expectedOutput, traceSummary, and workspacePath.
Testing Custom Assertions
Section titled “Testing Custom Assertions”Test assertions locally by piping JSON to stdin:
echo '{"input":[{"role":"user","content":"Say hello"}],"input_files":[],"criteria":"Multi-word greeting","output":"Hello there, nice to meet you!","expected_output":[]}' \ | bun run .agentv/assertions/min-words.tsExpected output:
{ "score": 1, "assertions": [ { "text": "Output has 6 words", "passed": true } ]}For test-driven development, write Vitest tests against your assertion logic directly:
import { expect, test } from 'vitest';
// Extract the core logic into a testable functionfunction checkWordCount(answer: string) { const wordCount = answer.trim().split(/\s+/).length; const minWords = 3; const pass = wordCount >= minWords; return { pass, wordCount };}
test('passes with enough words', () => { const result = checkWordCount('Hello there friend'); expect(result.pass).toBe(true);});
test('fails with too few words', () => { const result = checkWordCount('Hi'); expect(result.pass).toBe(false);});Full Working Example
Section titled “Full Working Example”This example shows the complete flow from assertion definition to YAML eval file.
1. Project Structure
Section titled “1. Project Structure”my-project/ .agentv/ assertions/ min-words.ts evals/ suite.yaml package.json2. Define the Assertion
Section titled “2. Define the Assertion”#!/usr/bin/env bunimport { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output }) => { const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length; const minWords = 3; const pass = wordCount >= minWords;
return { pass, score: pass ? 1.0 : Math.min(wordCount / minWords, 0.9), assertions: [ { text: pass ? `Output has ${wordCount} words (>= ${minWords} required)` : `Output has only ${wordCount} words (need >= ${minWords})`, passed: pass, }, ], };});3. Reference in YAML
Section titled “3. Reference in YAML”name: custom-assertion-demodescription: Demonstrates custom assertions with convention discovery
target: default
tests: - id: greeting-response input: "Say hello and introduce yourself" expected_output: "Hello! I'm an AI assistant here to help you." assert: - Agent gives a multi-word greeting - type: contains value: "Hello" - type: min-words
- id: short-answer input: "What is 2+2?" expected_output: "The answer is 4." assert: - Agent gives a short but valid response - type: contains value: "4" - type: min-words4. Install and Run
Section titled “4. Install and Run”npm install @agentv/sdkagentv eval evals/suite.yamlEach test produces scores from both the built-in contains assertion and your custom min-words assertion. Results appear in the output JSONL with each grader’s score in the scores[] array.