Custom Assertions

Custom assertions let you add reusable assertion types that go beyond built-in types. Define a TypeScript function, drop it in .agentv/assertions/, and reference it by name in your YAML eval files.

When to Use Each Approach

AgentV provides two SDK functions for custom evaluation logic:

Function	Best For	Discovery
`defineAssertion()`	Reusable assertion types with pass/fail plus optional score	Convention-based (`.agentv/assertions/`)
`defineScriptGrader()`	Command-backed scorer with full score and assertion-result control	Referenced via `type: script` + `command:`

Use defineAssertion() when you want a named assertion type that can be referenced across eval files without specifying a command path. It uses a simplified result contract focused on pass and optional score.

Use defineScriptGrader() when the scoring component is a command-backed grader: it needs explicit score calculation, custom assertion-result arrays, workspace commands, or LLM calls through a grader target. See Script Graders for details.

Both functions handle stdin/stdout JSON parsing, snake_case-to-camelCase conversion, Zod validation, and error handling automatically.

Promptfoo Terminology

Promptfoo calls normal eval checks assertions. Its custom code paths use fixed assertion types such as javascript, python, ruby, and webhook, and its Node API exposes assertion-oriented helpers such as runAssertion() and runAssertions().

AgentV follows that framing for assert: entries and defineAssertion(). The AgentV extension is convention discovery: any file in .agentv/assertions/ becomes an assertion type name such as word-count or has-citation. Reserve custom grader or script grader wording for command-backed or LLM-backed scoring components, especially type: script entries built with defineScriptGrader().

Installation

npm install @agentv/sdk

Convention-Based Discovery

Place assertion files in .agentv/assertions/ anywhere in your project tree. AgentV walks up from the eval file’s directory to find the nearest .agentv/assertions/ folder.

The filename (without extension) becomes the assertion type name:

.agentv/assertions/min-words.ts   -->  type: min-words
.agentv/assertions/sentiment.ts    -->  type: sentiment
.agentv/assertions/has-citation.ts -->  type: has-citation

Supported file extensions: .ts, .js, .mts, .mjs.

Custom assertion types cannot override built-in types (contains, equals, is-json, etc.). If a filename matches a built-in, it is silently skipped.

Using in YAML

Reference the assertion by type name directly — no command: path needed:

assert:
  - type: min-words
  - type: contains
    value: "Hello"

Pass/Fail Pattern

The simplest pattern returns pass (boolean) and an optional assertions array:

import { defineAssertion } from '@agentv/sdk';

export default defineAssertion(({ output }) => {
  const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
  const pass = wordCount >= 3;
  return {
    pass,
    assertions: [{ text: `Output has ${wordCount} words`, passed: pass }],
  };
});

When only pass is provided, the score defaults to 1 (pass) or 0 (fail).

Score Pattern

Return a score (0 to 1) for granular evaluation instead of binary pass/fail:

import { defineAssertion } from '@agentv/sdk';

export default defineAssertion(({ output, traceSummary }) => {
  const hasContent = (output ?? '').length > 0 ? 0.5 : 0;
  const isEfficient = (traceSummary?.eventCount ?? 0) <= 5 ? 0.5 : 0;
  return {
    score: hasContent + isEfficient,
    assertions: [
      { text: 'Has content', passed: hasContent > 0 },
      { text: 'Efficient', passed: isEfficient > 0 },
    ],
  };
});

If pass is omitted but score is provided, pass is derived as score >= 0.5. Scores are clamped to the [0, 1] range.

AssertionScore Contract

The handler must return an AssertionScore object:

Field	Type	Description
`pass`	`boolean`	Explicit pass/fail. If omitted, derived from `score` (>= 0.5 = pass).
`score`	`number`	Numeric score between 0 and 1. Defaults to 1 if `pass=true`, 0 if `pass=false`.
`assertions`	`Array<{ text: string, passed: boolean, evidence?: string }>`	Per-aspect results. Each entry describes one check with its verdict and optional evidence.
`details`	`Record<string, unknown>`	Optional structured data for domain-specific metrics.

Context Available to Assertions

The handler receives an AssertionContext with the same fields as a script grader:

Field	Type	Description
`input`	`Message[]`	Full resolved input messages
`output`	`string \| null`	Final answer / scored result only
`messages`	`Message[]`	Transcript messages from the target execution
`expectedOutput`	`Message[]`	Expected output messages
`criteria`	`string`	Evaluation criteria from the test case
`trace`	`Trace`	Full execution trace with messages, events, metrics, and provenance
`traceSummary`	`TraceSummary`	Lightweight execution metrics summary

The raw stdin payload uses snake_case keys such as expected_output, trace_summary, and workspace_path. defineAssertion() converts them to SDK camelCase fields such as expectedOutput, traceSummary, and workspacePath.

Testing Custom Assertions

Test assertions locally by piping JSON to stdin:

echo '{"input":[{"role":"user","content":"Say hello"}],"input_files":[],"criteria":"Multi-word greeting","output":"Hello there, nice to meet you!","expected_output":[]}' \
  | bun run .agentv/assertions/min-words.ts

Expected output:

{
  "score": 1,
  "assertions": [
    { "text": "Output has 6 words", "passed": true }
  ]
}

For test-driven development, write Vitest tests against your assertion logic directly:

import { expect, test } from 'vitest';

// Extract the core logic into a testable function
function checkWordCount(answer: string) {
  const wordCount = answer.trim().split(/\s+/).length;
  const minWords = 3;
  const pass = wordCount >= minWords;
  return { pass, wordCount };
}

test('passes with enough words', () => {
  const result = checkWordCount('Hello there friend');
  expect(result.pass).toBe(true);
});

test('fails with too few words', () => {
  const result = checkWordCount('Hi');
  expect(result.pass).toBe(false);
});

Full Working Example

This example shows the complete flow from assertion definition to YAML eval file.

1. Project Structure

my-project/
  .agentv/
    assertions/
      min-words.ts
  evals/
    suite.yaml
  package.json

2. Define the Assertion

#!/usr/bin/env bun
import { defineAssertion } from '@agentv/sdk';

export default defineAssertion(({ output }) => {
  const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
  const minWords = 3;
  const pass = wordCount >= minWords;

  return {
    pass,
    score: pass ? 1.0 : Math.min(wordCount / minWords, 0.9),
    assertions: [
      {
        text: pass
          ? `Output has ${wordCount} words (>= ${minWords} required)`
          : `Output has only ${wordCount} words (need >= ${minWords})`,
        passed: pass,
      },
    ],
  };
});

3. Reference in YAML

name: custom-assertion-demo
description: Demonstrates custom assertions with convention discovery

target: default

tests:
  - id: greeting-response
    input: "Say hello and introduce yourself"
    expected_output: "Hello! I'm an AI assistant here to help you."
    assert:
      - Agent gives a multi-word greeting
      - type: contains
        value: "Hello"
      - type: min-words

  - id: short-answer
    input: "What is 2+2?"
    expected_output: "The answer is 4."
    assert:
      - Agent gives a short but valid response
      - type: contains
        value: "4"
      - type: min-words

4. Install and Run

npm install @agentv/sdk
agentv eval evals/suite.yaml

Each test produces scores from both the built-in contains assertion and your custom min-words assertion. Results appear in the output JSONL with each grader’s score in the scores[] array.