Turn your traces, codebase, and judgment into a system that continuously improves your agent.
Add the Simforge MCP, paste our prompt — your coding agent instruments everything using the setup_simforge tool call.
Search traces, guide analysis, and train graders through conversation.
Analyze traces for the capabilities grader
Issues in 13/30 traces:
Graders are trained and go live automatically.
Graders run on production traces and efficiently consolidate failures into actionable summaries. Coding agents can access these via MCP for fast iteration loops.
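As a rough illustration of what consolidating failures can look like, here is a minimal sketch that groups failing graded traces by reason. It is not Simforge's API: `GradedTrace`, its fields, and `consolidate_failures` are hypothetical names chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class GradedTrace:
    # Hypothetical record of one grader verdict; not a Simforge type.
    trace_id: str
    passed: bool
    failure_reason: str = ""


def consolidate_failures(graded):
    """Group failing traces by reason, most frequent reason first."""
    failures = [g for g in graded if not g.passed]
    counts = Counter(g.failure_reason for g in failures)
    return [
        {
            "reason": reason,
            "count": count,
            "trace_ids": [g.trace_id for g in failures if g.failure_reason == reason],
        }
        for reason, count in counts.most_common()
    ]
```

A summary in this shape gives a coding agent one entry per failure mode, with the affected trace IDs attached, instead of a flat list of failing traces.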
Replay recorded traces against a new version of your agent and compare outcomes side by side. Ship changes confidently with grader-backed evidence.
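The idea of replaying traces can be sketched in a few lines. This is an illustration under stated assumptions, not Simforge's implementation: each recorded trace is assumed to be a dict with `input` and `output` keys, and `new_agent` and `grader` are hypothetical callables you would supply.

```python
def replay_and_compare(recorded_traces, new_agent, grader):
    """Re-run each recorded input through new_agent and grade old vs. new output."""
    rows = []
    for trace in recorded_traces:
        new_output = new_agent(trace["input"])
        rows.append(
            {
                "input": trace["input"],
                "old_output": trace["output"],
                "new_output": new_output,
                "old_pass": grader(trace["input"], trace["output"]),
                "new_pass": grader(trace["input"], new_output),
            }
        )
    # Count how verdicts changed between the recorded and the new version.
    summary = {
        "fixed": sum(1 for r in rows if r["new_pass"] and not r["old_pass"]),
        "regressed": sum(1 for r in rows if r["old_pass"] and not r["new_pass"]),
        "unchanged": sum(1 for r in rows if r["old_pass"] == r["new_pass"]),
    }
    return rows, summary
```

The per-trace rows support a side-by-side view, while the summary counts show whether the new version fixes more traces than it regresses.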
When a grader mislabels a trace, correct it by chatting or with one click. The grader retrains immediately, getting smarter with every correction.
Tell Simforge what matters. It turns your standards into graders, fixes, and feedback loops — with minimal effort from you.