Skip to content

Custom Assertions

Custom assertions let you add reusable assertion types that go beyond built-in types. Define a TypeScript function, drop it in .agentv/assertions/, and reference it by name in your YAML eval files.

AgentV provides two SDK functions for custom evaluation logic:

FunctionBest ForDiscovery
defineAssertion()Reusable assertion types with pass/fail plus optional scoreConvention-based (.agentv/assertions/)
defineScriptGrader()Command-backed scorer with full score and assertion-result controlReferenced via type: script + command:

Use defineAssertion() when you want a named assertion type that can be referenced across eval files without specifying a command path. It uses a simplified result contract focused on pass and optional score.

Use defineScriptGrader() when the scoring component is a command-backed grader: it needs explicit score calculation, custom assertion-result arrays, workspace commands, or LLM calls through a grader target. See Script Graders for details.

Both functions handle stdin/stdout JSON parsing, snake_case-to-camelCase conversion, Zod validation, and error handling automatically.

Promptfoo calls normal eval checks assertions. Its custom code paths use fixed assertion types such as javascript, python, ruby, and webhook, and its Node API exposes assertion-oriented helpers such as runAssertion() and runAssertions().

AgentV follows that framing for assert: entries and defineAssertion(). The AgentV extension is convention discovery: any file in .agentv/assertions/ becomes an assertion type name such as word-count or has-citation. Reserve custom grader or script grader wording for command-backed or LLM-backed scoring components, especially type: script entries built with defineScriptGrader().

Terminal window
npm install @agentv/sdk

Place assertion files in .agentv/assertions/ anywhere in your project tree. AgentV walks up from the eval file’s directory to find the nearest .agentv/assertions/ folder.

The filename (without extension) becomes the assertion type name:

.agentv/assertions/min-words.ts --> type: min-words
.agentv/assertions/sentiment.ts --> type: sentiment
.agentv/assertions/has-citation.ts --> type: has-citation

Supported file extensions: .ts, .js, .mts, .mjs.

Custom assertion types cannot override built-in types (contains, equals, is-json, etc.). If a filename matches a built-in, it is silently skipped.

Reference the assertion by type name directly — no command: path needed:

assert:
- type: min-words
- type: contains
value: "Hello"

The simplest pattern returns pass (boolean) and an optional assertions array:

.agentv/assertions/min-words.ts
import { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output }) => {
const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
const pass = wordCount >= 3;
return {
pass,
assertions: [{ text: `Output has ${wordCount} words`, passed: pass }],
};
});

When only pass is provided, the score defaults to 1 (pass) or 0 (fail).

Return a score (0 to 1) for granular evaluation instead of binary pass/fail:

.agentv/assertions/efficiency.ts
import { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output, traceSummary }) => {
const hasContent = (output ?? '').length > 0 ? 0.5 : 0;
const isEfficient = (traceSummary?.eventCount ?? 0) <= 5 ? 0.5 : 0;
return {
score: hasContent + isEfficient,
assertions: [
{ text: 'Has content', passed: hasContent > 0 },
{ text: 'Efficient', passed: isEfficient > 0 },
],
};
});

If pass is omitted but score is provided, pass is derived as score >= 0.5. Scores are clamped to the [0, 1] range.

The handler must return an AssertionScore object:

FieldTypeDescription
passbooleanExplicit pass/fail. If omitted, derived from score (>= 0.5 = pass).
scorenumberNumeric score between 0 and 1. Defaults to 1 if pass=true, 0 if pass=false.
assertionsArray<{ text: string, passed: boolean, evidence?: string }>Per-aspect results. Each entry describes one check with its verdict and optional evidence.
detailsRecord<string, unknown>Optional structured data for domain-specific metrics.

The handler receives an AssertionContext with the same fields as a script grader:

FieldTypeDescription
inputMessage[]Full resolved input messages
outputstring | nullFinal answer / scored result only
messagesMessage[]Transcript messages from the target execution
expectedOutputMessage[]Expected output messages
criteriastringEvaluation criteria from the test case
traceTraceFull execution trace with messages, events, metrics, and provenance
traceSummaryTraceSummaryLightweight execution metrics summary

The raw stdin payload uses snake_case keys such as expected_output, trace_summary, and workspace_path. defineAssertion() converts them to SDK camelCase fields such as expectedOutput, traceSummary, and workspacePath.

Test assertions locally by piping JSON to stdin:

Terminal window
echo '{"input":[{"role":"user","content":"Say hello"}],"input_files":[],"criteria":"Multi-word greeting","output":"Hello there, nice to meet you!","expected_output":[]}' \
| bun run .agentv/assertions/min-words.ts

Expected output:

{
"score": 1,
"assertions": [
{ "text": "Output has 6 words", "passed": true }
]
}

For test-driven development, write Vitest tests against your assertion logic directly:

.agentv/assertions/__tests__/min-words.test.ts
import { expect, test } from 'vitest';
// Extract the core logic into a testable function
function checkWordCount(answer: string) {
const wordCount = answer.trim().split(/\s+/).length;
const minWords = 3;
const pass = wordCount >= minWords;
return { pass, wordCount };
}
test('passes with enough words', () => {
const result = checkWordCount('Hello there friend');
expect(result.pass).toBe(true);
});
test('fails with too few words', () => {
const result = checkWordCount('Hi');
expect(result.pass).toBe(false);
});

This example shows the complete flow from assertion definition to YAML eval file.

my-project/
.agentv/
assertions/
min-words.ts
evals/
suite.yaml
package.json
.agentv/assertions/min-words.ts
#!/usr/bin/env bun
import { defineAssertion } from '@agentv/sdk';
export default defineAssertion(({ output }) => {
const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
const minWords = 3;
const pass = wordCount >= minWords;
return {
pass,
score: pass ? 1.0 : Math.min(wordCount / minWords, 0.9),
assertions: [
{
text: pass
? `Output has ${wordCount} words (>= ${minWords} required)`
: `Output has only ${wordCount} words (need >= ${minWords})`,
passed: pass,
},
],
};
});
evals/suite.yaml
name: custom-assertion-demo
description: Demonstrates custom assertions with convention discovery
target: default
tests:
- id: greeting-response
input: "Say hello and introduce yourself"
expected_output: "Hello! I'm an AI assistant here to help you."
assert:
- Agent gives a multi-word greeting
- type: contains
value: "Hello"
- type: min-words
- id: short-answer
input: "What is 2+2?"
expected_output: "The answer is 4."
assert:
- Agent gives a short but valid response
- type: contains
value: "4"
- type: min-words
Terminal window
npm install @agentv/sdk
agentv eval evals/suite.yaml

Each test produces scores from both the built-in contains assertion and your custom min-words assertion. Results appear in the output JSONL with each grader’s score in the scores[] array.