Automatic failure analysis and fixes for your AI agents

Turn your traces, codebase, and judgment into a system that continuously improves your agent.

One-shot integration using MCP

Add the Simforge MCP and paste our prompt. Your coding agent then instruments everything using the setup_simforge tool call.

Read the docs →
setup_simforge · MCP Tool
Cursor
Claude Code
Codex

Train graders by talking to your traces

Search traces, guide analysis, and train graders through conversation.

Eval Assistant

Analyze traces for the capabilities grader

Filter & Analyze Traces · Done

Issues in 13/30 traces:

Data Access: 5 (17%)
Shallow Analysis: 8 (27%)
Review & Train Graders

Trained graders go live automatically.

Proactive Tool Usage: 0 · 2 fail
User Guidance: 3 · 1 fail
Data Access: NEW
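The issue rates in the example above (for instance, Data Access in 5 of 30 traces, or 17%) amount to a simple tally over tagged traces. The sketch below shows one way to compute them. The trace shape (a dict with an "issues" list) and the function name issue_rates are assumptions made for illustration, not Simforge's data model or API.

```python
from collections import Counter


def issue_rates(traces):
    """Return (flagged_count, {issue: (count, percent)}) for a list of traces.

    Assumed shape: each trace is a dict with an "issues" list of tag strings.
    """
    total = len(traces)
    if total == 0:
        return 0, {}
    # Count each issue at most once per trace.
    counts = Counter(issue for t in traces for issue in set(t["issues"]))
    flagged = sum(1 for t in traces if t["issues"])
    rates = {name: (n, round(100 * n / total)) for name, n in counts.items()}
    return flagged, rates


# Synthetic traces shaped to match the counts in the example above.
traces = (
    [{"issues": ["Data Access"]}] * 5
    + [{"issues": ["Shallow Analysis"]}] * 8
    + [{"issues": []}] * 17
)

flagged, rates = issue_rates(traces)
print(f"Issues in {flagged}/{len(traces)} traces:")
for name, (n, pct) in rates.items():
    print(f"{name}: {n} ({pct}%)")
```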

Get actionable analysis and fixes

Graders run on production traces and efficiently consolidate failures into actionable summaries. Coding agents can access these via MCP for fast iteration loops.

Response Conciseness · 16d ago
71% pass rate · 14 eval · 30d
Root Causes
Repetitive Structure · Excessive Filler · Over-explanation
MCP
get_failure_context() · MCP Tool
{
  "grader": "Response Conciseness",
  "pass_rate": 0.71,
  "failures": 4,
  "root_causes": [
    "Repetitive Structure",
    "Excessive Filler"
  ],
  "suggestions": [
    "Enforce length constraints",
    "Consolidate follow-ups"
  ]
}
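As a rough illustration of how a coding agent or script might use a payload like the one above, the following sketch parses it and turns it into a short plain-text brief. The field names are copied from the example payload. The helper summarize_failure_context is hypothetical, and how the payload is actually fetched over MCP is not shown here.

```python
import json

# Example payload copied from the get_failure_context() output above.
payload = """
{
  "grader": "Response Conciseness",
  "pass_rate": 0.71,
  "failures": 4,
  "root_causes": ["Repetitive Structure", "Excessive Filler"],
  "suggestions": ["Enforce length constraints", "Consolidate follow-ups"]
}
"""


def summarize_failure_context(raw: str) -> str:
    """Turn a failure-context JSON string into a short text brief (hypothetical helper)."""
    ctx = json.loads(raw)
    lines = [
        f"Grader: {ctx['grader']}",
        f"Pass rate: {ctx['pass_rate']:.0%} ({ctx['failures']} failures)",
        "Root causes: " + ", ".join(ctx["root_causes"]),
        "Suggested fixes:",
    ]
    lines += [f"- {s}" for s in ctx["suggestions"]]
    return "\n".join(lines)


summary = summarize_failure_context(payload)
print(summary)
```

A brief like this could be pasted into a coding agent's context so that the suggested fixes sit next to the root causes they address.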

Keep improving, automatically

A/B test agent changes with replay

Replay recorded traces against a new version of your agent and compare outcomes side by side. Ship changes confidently with grader-backed evidence.

v1 · current: 72% pass
v2 · candidate: 91% pass
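Conceptually, a replay comparison runs the same recorded inputs through two agent versions, grades both outputs, and reports pass rates plus the traces that changed outcome. The sketch below is a toy version of that idea. The agents, the grader, and the function compare_replay are all made up for illustration and do not represent Simforge's replay implementation or the numbers shown above.

```python
def compare_replay(inputs, agent_a, agent_b, grader):
    """Replay inputs through two agents and compare graded outcomes."""
    if not inputs:
        raise ValueError("need at least one recorded input")
    a = [grader(x, agent_a(x)) for x in inputs]
    b = [grader(x, agent_b(x)) for x in inputs]
    return {
        "a_pass_rate": sum(a) / len(inputs),
        "b_pass_rate": sum(b) / len(inputs),
        # Inputs that failed under A but pass under B, and the reverse.
        "fixed": [x for x, pa, pb in zip(inputs, a, b) if not pa and pb],
        "regressed": [x for x, pa, pb in zip(inputs, a, b) if pa and not pb],
    }


# Toy stand-ins: v1 pads longer queries with filler, v2 answers directly.
def agent_v1(query):
    return "Great question! " * 3 + query if len(query) > 12 else query


def agent_v2(query):
    return query


def conciseness_grader(query, response):
    """Pass when the response is at most five words (arbitrary threshold)."""
    return len(response.split()) <= 5


recorded = ["reset password", "cancel order", "track package", "update email"]
report = compare_replay(recorded, agent_v1, agent_v2, conciseness_grader)
print(report)
```

Listing fixed and regressed inputs separately matters because two versions can have similar pass rates while failing on different traces.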

Tune graders quickly as your agent evolves

When a grader mislabels a trace, correct it by chatting or with one click. The grader retrains immediately, getting smarter with every correction.

FAIL · trace_8f2a · grader disagrees
PASS · trace_8f2a · retrained & correct · Updated

Codify your judgment. Improve your agents continuously.

Tell Simforge what matters. It turns your standards into graders, fixes, and feedback loops, with minimal effort from you.