Built-in signals.
Zero assertions to write.
Spooled detects behavioral changes automatically from execution structure. No test code to maintain. Content-blind signals work without reading your prompts, responses, or customer data.
new_behavior_pattern [info] Cold start: fingerprint doesn't match any known intent. The system is learning a new execution path.
new_side_effects [medium] Unauthorized tools detected: tools appearing in the current run that weren't in the baseline.
latency_spikes [high] Average or max latency exceeds baseline bounds. Triggers on >50% increase or when max exceeds 2× the p95 envelope.
retry_explosions [high] Excessive consecutive retries of the same tool after errors. Detects genuine retry loops, not pagination or batch patterns.
error_increases [high] Error rate increased significantly versus baseline. Catches agents that silently start failing more often.
tool_usage_changes [medium] Tool call count changed by more than 50%. Detects over-calling or under-calling of expected tools.
token_usage_spike [high] Token consumption increased by more than 50%. Catches prompt bloat, unnecessary context, or model output changes.
component_latency_drift [medium] Latency increased for a specific component, such as llm:gpt-4, tool:search, or http:api.example.com. Pinpoints the source.
tool_overuse [medium] A tool is being called more than necessary. Detects redundant or circular tool invocations.
retrieval_regression [high] RAG retrieval quality degraded. Monitors retrieve, vector_search, search, and query tools for precision changes.
output_schema_drift [high, Pro] Output field schema changed: fields added or removed from tool responses. Catches breaking API contract changes.
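To make the thresholds above concrete, here is a minimal sketch of how a few of these checks could be expressed over execution structure alone. It is not Spooled's implementation or API: every function name and input shape is hypothetical, the reading of latency_spikes as "average up by more than 50%, or max above 2× the baseline p95" is one interpretation of the description, and the retry count is returned without a threshold because none is stated.

```python
# Illustrative only: hypothetical helpers, not Spooled's API.
# They look at tool names, counts, timings, and field names, never at content.

def new_side_effects(baseline_tools, current_tools):
    """Tools seen in the current run that never appeared in the baseline."""
    return sorted(set(current_tools) - set(baseline_tools))


def latency_spike(base_avg, base_p95, cur_avg, cur_max):
    """Assumed reading: average up by more than 50%, or max above 2x baseline p95."""
    return cur_avg > 1.5 * base_avg or cur_max > 2 * base_p95


def changed_by_more_than_half(baseline, current):
    """Relative change above 50% in either direction (tool_usage_changes)."""
    if baseline == 0:
        return current > 0
    return abs(current - baseline) / baseline > 0.5


def increased_by_more_than_half(baseline, current):
    """Increase above 50% (token_usage_spike); decreases never trigger."""
    return current > 1.5 * baseline


def longest_retry_run(calls):
    """Longest streak of repeat calls to one tool where each repeat follows a failure.

    `calls` is a list of (tool_name, failed) pairs in call order. Repeated
    successful calls (pagination, batches) do not count as retries.
    """
    longest = run = 0
    prev_tool, prev_failed = None, False
    for tool, failed in calls:
        if tool == prev_tool and prev_failed:
            run += 1
        else:
            run = 0
        longest = max(longest, run)
        prev_tool, prev_failed = tool, failed
    return longest


def schema_drift(baseline_fields, current_fields):
    """Field names added to or removed from a tool response (output_schema_drift)."""
    base, cur = set(baseline_fields), set(current_fields)
    return {"added": sorted(cur - base), "removed": sorted(base - cur)}


print(new_side_effects(["search", "summarize"],
                       ["search", "summarize", "send_email"]))  # ['send_email']
print(latency_spike(200, 400, 250, 900))                        # True
print(longest_retry_run([("search", True)] * 3 + [("search", False)]))  # 3
```

In this sketch a paginated sequence such as three successful `list` calls yields a retry run of 0, which matches the stated goal of separating genuine retry loops from pagination or batch patterns.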