
---
name: adding-benchmarks
description: Add new benchmarks to the CI pipeline. Guides through creating benchmark JSON files, integrating with bootstrap.sh, and ensuring proper CI upload via the ci3.yml workflow.
---
# Adding Benchmarks

## When to Use

Use this skill when:

- Adding new performance benchmarks to a package
- Creating benchmark tests that should be tracked over time
- Integrating existing benchmarks into the CI pipeline
## Benchmark System Overview

Benchmarks flow through the system as follows:

- **Generation**: Each package produces `bench-out/*.bench.json` files
- **Aggregation**: `bench_merge` in the root `bootstrap.sh` combines all files, prefixing names with the package path
- **Upload**: CI caches the merged JSON and a GitHub Action uploads it to the benchmark dashboard
- **Display**: Results appear on the dashboard with historical tracking

Live dashboard: https://aztecprotocol.github.io/aztec-packages/bench/?branch=next
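As a rough illustration of the aggregation step, here is a minimal sketch (hypothetical; the real `bench_merge` lives in the root `bootstrap.sh`), assuming each package's results sit in a `bench-out/` directory directly under the package path:

```python
import json
from pathlib import Path

def merge_bench_files(repo_root: str) -> list[dict]:
    """Combine all bench-out/*.bench.json files, prefixing each metric
    name with the package path (the directory containing bench-out)."""
    root = Path(repo_root)
    merged = []
    for bench_file in sorted(root.rglob("bench-out/*.bench.json")):
        # Package path relative to the repo root, e.g. "yarn-project/stdlib".
        package = bench_file.parent.parent.relative_to(root).as_posix()
        for entry in json.loads(bench_file.read_text()):
            merged.append({**entry, "name": f"{package}/{entry['name']}"})
    return merged
```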
## How Benchmark Names Work

### Name Construction

The final benchmark name combines two parts:

- **Package prefix** (added automatically by `bench_merge`): based on where the file lives
- **Local name** (what you write in JSON): your metric identifier
### Dashboard Grouping

The dashboard splits names by `/` to create a collapsible tree. The last segment becomes the chart name; everything before it becomes the group hierarchy.

| Full Name | Group Path | Chart Name |
|---|---|---|
| `yarn-project/stdlib/Tx/private/getTxHash/avg` | `yarn-project/stdlib/Tx/private/getTxHash` | `avg` |
| `yarn-project/kv-store/Map/Individual insertion` | `yarn-project/kv-store/Map` | `Individual insertion` |
| `barretenberg/sol/Add2HonkVerifier` | `barretenberg/sol` | `Add2HonkVerifier` |
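The split rule above can be expressed in a few lines (a sketch of the behavior, not actual dashboard code):

```python
def split_bench_name(full_name: str) -> tuple[str, str]:
    """Split a full benchmark name into (group path, chart name).
    The chart name is the last /-separated segment."""
    group, _, chart = full_name.rpartition("/")
    return group, chart
```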
### Naming Best Practices

Use `/` to create logical groupings:

```json
[
  {"name": "Tx/private/getTxHash/avg", "value": 1.2, "unit": "ms"},
  {"name": "Tx/private/getTxHash/p50", "value": 1.1, "unit": "ms"},
  {"name": "Tx/public/getTxHash/avg", "value": 2.3, "unit": "ms"}
]
```

Avoid flat names; they create no hierarchy and are hard to navigate:

```json
[
  {"name": "tx_private_gettxhash_avg", "value": 1.2, "unit": "ms"}
]
```
Common suffixes:

- Timing: `avg`, `p50`, `p95`, `p99`, `min`, `max`, `total`
- Size: `_opcodes`, `_gates`, `memory`
- Rate: `gasPerSecond`, `jobs_per_sec`
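If you collect raw samples, the timing suffixes can be derived in one place (a hypothetical helper, not part of the repo; `timing_metrics` and its percentile rule are assumptions):

```python
import statistics

def timing_metrics(prefix: str, samples_ms: list[float]) -> list[dict]:
    """Turn raw timing samples (ms) into benchmark entries using the common suffixes."""
    ordered = sorted(samples_ms)

    def pct(p: float) -> float:
        # Simple nearest-rank percentile over the sorted samples.
        return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

    values = {
        "avg": statistics.fmean(samples_ms),
        "p50": pct(0.50),
        "p95": pct(0.95),
        "p99": pct(0.99),
        "min": ordered[0],
        "max": ordered[-1],
        "total": sum(samples_ms),
    }
    return [{"name": f"{prefix}/{k}", "value": v, "unit": "ms"} for k, v in values.items()]
```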
## Required JSON Format

All benchmark files must be arrays using the `customSmallerIsBetter` format:

```json
[
  {"name": "category/metric_name", "value": 12345, "unit": "gas"},
  {"name": "category/another_metric", "value": 100.5, "unit": "ms"}
]
```

Rules:

- Must be a JSON array `[...]`, not an object
- Each entry needs `name`, `value`, `unit`
- `value` must be numeric (lower is better)
- File name must end with `.bench.json`
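These rules can be checked with a small validator before pushing (a sketch; the helper name `validate_bench_file` is hypothetical):

```python
import json

def validate_bench_file(path: str) -> list[dict]:
    """Check a benchmark file against the required format; return its entries."""
    if not path.endswith(".bench.json"):
        raise ValueError("file name must end with .bench.json")
    with open(path) as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("must be a JSON array, not an object")
    for entry in data:
        missing = {"name", "value", "unit"} - entry.keys()
        if missing:
            raise ValueError(f"entry missing keys: {missing}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(entry["value"], (int, float)) or isinstance(entry["value"], bool):
            raise ValueError("value must be numeric")
    return data
```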
## Adding a New Benchmark

### Step 1: Create the Benchmark

TypeScript (most common):

```typescript
// my_bench.test.ts
import { mkdir, writeFile } from 'fs/promises';
import path from 'path';

import { Timer } from '@aztec/foundation/timer';

describe('MyComponent benchmarks', () => {
  const results: { name: string; value: number; unit: string }[] = [];

  // Write all collected results once the suite finishes.
  afterAll(async () => {
    if (process.env.BENCH_OUTPUT) {
      await mkdir(path.dirname(process.env.BENCH_OUTPUT), { recursive: true });
      await writeFile(process.env.BENCH_OUTPUT, JSON.stringify(results));
    }
  });

  it('benchmark operation', async () => {
    const timer = new Timer();
    // ... operation to benchmark ...
    results.push({ name: 'MyComponent/operation/avg', value: timer.ms(), unit: 'ms' });
  });
});
```
Shell (jq-based):

```bash
mkdir -p bench-out
# Pass values with --argjson so they stay numeric and shell quoting stays safe.
jq -n --argjson v1 "$VALUE1" --argjson v2 "$VALUE2" '[
  {name: "metric1", value: $v1, unit: "ms"},
  {name: "metric2", value: $v2, unit: "gas"}
]' > bench-out/my-component.bench.json
```
Python:

```python
import json

benchmark_list = [{"name": "category/metric", "value": 12345, "unit": "gas"}]
with open("bench-out/my-component.bench.json", "w") as f:
    json.dump(benchmark_list, f)
```
### Step 2: Register in bootstrap.sh

Add to the package's `bench_cmds` function:

```bash
function bench_cmds {
  local hash=$(hash)
  echo "$hash BENCH_OUTPUT=bench-out/my_component.bench.json yarn-project/scripts/run_test.sh <package>/src/my_bench.test.ts"
}
```

Options: `:ISOLATE=1`, `:CPUS=8`, `:MEM=16g`, `:TIMEOUT=7200`
### Step 3: Verify

```bash
# Run locally
BENCH_OUTPUT=bench-out/test.bench.json yarn test src/my_bench.test.ts

# Validate JSON
jq . bench-out/test.bench.json
jq 'all(has("name") and has("value") and has("unit"))' bench-out/test.bench.json
```
## CI Details

Benchmarks upload when:

- The PR has one of the labels `ci-merge-queue`, `ci-full`, or `ci-full-no-test-cache` (publishes to the target branch, i.e. `next` or a merge-train branch)
- The run is on the merge queue (publishes with `next`)

**10-commit visibility window**: The dashboard only shows benchmarks that ran in the last 10 commits. If a benchmark stops running, it disappears after ~10 merges.
## Reference Implementations

- TypeScript: `yarn-project/stdlib/src/tx/tx_bench.test.ts`
- Python: `l1-contracts/scripts/generate_benchmark_json.py`
- Shell: `yarn-project/p2p/testbench/consolidate_benchmarks.sh`
- Circuits: `noir-projects/noir-protocol-circuits/scripts/run_bench.sh`