Privacy Architecture

Spooled is content-blind by design. The SDK captures execution structure — which tools were called, in what order, how long they took — not the content of your calls. Prompt text, LLM response text, tool argument values, and tool output values are stripped at the SDK level before storage or transmission.

What is NOT transmitted (call content)

The following are stripped at the SDK level and never leave your infrastructure:

  • Prompt text and system messages
  • LLM response text and completions
  • Tool call argument values
  • Tool output values
  • Error message text
  • HTTP request/response bodies
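
As a rough illustration of the split between structure and content, the sketch below reduces a raw tool-call record to structural fields only. The record layout and the `to_structural` helper are assumptions made for this example; they are not Spooled's internal format or API.

```python
from typing import Any


def to_structural(record: dict[str, Any]) -> dict[str, Any]:
    """Keep execution structure; drop every content-bearing value."""
    output = record.get("output")
    error = record.get("error")
    return {
        "type": record["type"],              # e.g. "TOOL_CALL"
        "name": record["name"],              # tool function name
        "latency_ms": record["latency_ms"],
        # Key names of a dict output survive; the values do not.
        "output_keys": sorted(output) if isinstance(output, dict) else [],
        # Only the exception class name is kept, never the message.
        "error_type": type(error).__name__ if error else None,
    }


# Hypothetical raw record, as it might exist in memory before stripping.
raw = {
    "type": "TOOL_CALL",
    "name": "route_ticket",
    "latency_ms": 42,
    "arguments": {"ticket_text": "My card was charged twice"},
    "output": {"queue": "billing", "priority": "high"},
    "error": None,
}
structural = to_structural(raw)
```

The argument text and the output values never appear in `structural`; only the tool name, the timing, and the output key names remain.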

What IS transmitted (structural metadata)

When a backend is configured, the following structural metadata is transmitted. Do not encode PII in these fields — they are transmitted as-is:

  • Agent ID — the identifier you pass to spooled.init(agent_id="...")
  • Tool function names — e.g., route_ticket, search_documents
  • Output field key names — the field names (not values) from tool response dicts
  • Tags — strings you pass via tags=[...]
  • Session ID and correlation context — if provided
  • Git hash and branch name — from the current repository
  • Timing and token counts — latency per interaction, prompt/completion tokens
  • Behavioral fingerprint hashes — SHA-256 hashes of execution shape
  • Interaction types and sequence — LLM_CALL, TOOL_CALL, etc.
  • Model names — e.g., gpt-4o, claude-3-sonnet

Warning
Agent IDs, tool names, tags, session IDs, and git branch names are transmitted without content filtering. Use descriptive but non-sensitive identifiers. Do not encode customer names, PII, or confidential data in these fields.
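
One way to act on this warning is to check identifiers before they reach the SDK. The sketch below is a hypothetical pre-flight helper (`check_identifier` and `assert_safe` are not part of Spooled) that flags values resembling email addresses or long digit runs. The patterns are illustrative and far from exhaustive.

```python
import re

# Illustrative patterns only; real PII detection needs a broader rule set.
_SUSPICIOUS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "long_digit_run": re.compile(r"\d{6,}"),
}


def check_identifier(value: str) -> list[str]:
    """Return the names of the suspicious patterns found in value."""
    return [name for name, pattern in _SUSPICIOUS.items() if pattern.search(value)]


def assert_safe(**fields: str) -> None:
    """Raise ValueError if any identifier looks like it contains PII."""
    for field, value in fields.items():
        hits = check_identifier(value)
        if hits:
            # The value itself is left out of the message to keep it out of logs.
            raise ValueError(f"{field} looks sensitive: {hits}")


assert_safe(agent_id="support-router", tag="prod")  # passes silently
```

Running a check like this before calling spooled.init(agent_id="...") catches the most obvious mistakes, such as using a customer email address as a tag.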

Local storage

Local traces in .spooled/traces/ contain the same structural metadata listed above. Content stripping is applied before local storage, not just before transmission.

Note
This is a change from earlier SDK versions. Since v0.3.0, content stripping is unconditional — it applies to local storage as well as backend transmission. There is no configuration that enables content capture.

What's transmitted (structural metadata only)

When a backend is configured by setting SPOOLED_BACKEND_URL, only structural metadata is sent. In more detail, the payload contains:

  • Behavioral fingerprints (hash of interaction type + tool name sequence)
  • Interaction types and sequence (LLM_CALL, TOOL_CALL, etc.)
  • Tool/model/endpoint names (not arguments or responses)
  • Latency measurements (per-interaction ms)
  • Token counts (prompt, completion, total)
  • Error types (not error messages)
  • Hash chain values (SHA-256 hashes)
  • Decision distributions (categorical field value counts — not the values themselves)
  • Output schema snapshots (field names only — not field values)
  • Aggregate score statistics (mean, std, min, max — not individual values)

Note
In local-only mode (no SPOOLED_BACKEND_URL), nothing is transmitted at all. Everything stays in .spooled/traces/.
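
To make the behavioral fingerprint concrete, here is a minimal sketch of a SHA-256 hash computed over interaction types and names only. The serialization (one `type:name` pair per line), the use of the model name for LLM calls, and the `behavioral_fingerprint` function are assumptions for illustration; the SDK's actual canonical form may differ.

```python
import hashlib


def behavioral_fingerprint(interactions: list[tuple[str, str]]) -> str:
    """SHA-256 over the ordered (interaction type, name) sequence."""
    canonical = "\n".join(f"{kind}:{name}" for kind, name in interactions)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same calls, different order.
run_a = [
    ("LLM_CALL", "gpt-4o"),
    ("TOOL_CALL", "search_documents"),
    ("TOOL_CALL", "route_ticket"),
]
run_b = [
    ("LLM_CALL", "gpt-4o"),
    ("TOOL_CALL", "route_ticket"),
    ("TOOL_CALL", "search_documents"),
]

fp_a = behavioral_fingerprint(run_a)
fp_b = behavioral_fingerprint(run_b)
```

Two runs that make the same calls in the same order share a fingerprint regardless of their prompts or arguments, while reordering the calls (as in `run_b`) produces a different hash.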

Content-blind signals

Pro-tier signals detect quality changes without reading content:

Output schema drift

Tracks which field names appear in tool outputs. A field being added or removed is detected from the key names alone; the field values are never read. See the signals reference for the full list.
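
A drift check of this kind can be sketched as a set comparison over key names. The `schema_drift` function below is a hypothetical illustration of the idea, not the signal's actual implementation.

```python
def schema_drift(baseline: dict, current: dict) -> dict[str, list[str]]:
    """Compare the key names of two tool outputs, ignoring their values."""
    before, after = set(baseline), set(current)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


drift = schema_drift(
    {"queue": "billing", "priority": "high"},
    {"queue": "billing", "confidence": 0.91},
)
```

Here `drift` reports `confidence` as added and `priority` as removed. A change in the value of `queue` alone would not register, because values are not part of the comparison.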

Redaction: defense in depth

Even in local traces, Spooled applies a redaction engine as a second layer of protection. The RedactionEngine scrubs PII patterns before storage:

Pattern        Example                Replaced with
Credit cards   4111-1111-1111-1111    [REDACTED:credit_card]
SSN            123-45-6789            [REDACTED:ssn]
API keys       sk-abc123...           [REDACTED:api_key_sk]
Email          user@example.com       [REDACTED:email]
AWS keys       AKIA...                [REDACTED:aws_access_key]

Sensitive dictionary keys are also redacted: password, secret, api_key, token, authorization, credentials, private_key, ssn, credit_card, cvv, pin.
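
The sketch below shows how pattern-based and key-based redaction of this kind could work together. It is not the RedactionEngine itself: the regular expressions are simplified assumptions modeled on the examples above, and the plain `[REDACTED]` marker used for sensitive keys is a placeholder chosen for this example.

```python
import re
from typing import Any

# Simplified patterns; assumptions for illustration, not the SDK's own rules.
PATTERNS = [
    ("credit_card", re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b")),
    ("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("api_key_sk", re.compile(r"\bsk-[A-Za-z0-9]{6,}")),
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
]

SENSITIVE_KEYS = {
    "password", "secret", "api_key", "token", "authorization",
    "credentials", "private_key", "ssn", "credit_card", "cvv", "pin",
}


def redact_text(text: str) -> str:
    """Replace every pattern match with its [REDACTED:<label>] marker."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


def redact(value: Any) -> Any:
    """Recursively redact strings, and blank out values under sensitive keys."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Key-based redaction blanks the value whatever it contains, which covers secrets that match no pattern, such as a short password.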

Verify it yourself

Core content stripping is architectural: prompts, responses, and tool payloads are stripped unconditionally at multiple independent layers, and no setting disables it.

# In the recorder, content fields are replaced with
# structural markers before any payload is built.
# Only metadata (types, names, hashes, metrics) is sent.
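
One practical way to check this is a canary test: send a unique marker string through your agent as prompt or tool-argument content, then search the local trace directory for it. The `find_canary` helper below is a hypothetical sketch, not a Spooled utility. It assumes the trace files are readable as raw bytes, so a plain byte search would miss content in compressed or otherwise encoded files.

```python
from pathlib import Path


def find_canary(trace_dir: str, canary: str) -> list[str]:
    """Return paths of files under trace_dir whose bytes contain canary."""
    needle = canary.encode("utf-8")
    root = Path(trace_dir)
    if not root.is_dir():
        return []
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and needle in path.read_bytes()
    )


# After running an agent whose prompt contained the marker, scan the traces.
leaks = find_canary(".spooled/traces", "CANARY-7f3a9c")
```

An empty `leaks` list means the marker was not found in any trace file. A non-empty list names the files to inspect.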

Additional PII redaction patterns (credit cards, emails, SSNs, etc.) provide a secondary safety layer. These can be individually configured:

SPOOLED_REDACT_CREDIT_CARD=false  # Disable credit card redaction
SPOOLED_REDACT_EMAIL=false        # Disable email redaction

Note
Disabling a PII redaction pattern does not cause content to be transmitted. It only affects local trace files stored on your infrastructure. Core content stripping remains active regardless of these settings.
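
As a sketch of how such per-pattern toggles might be interpreted, the hypothetical helper below treats every pattern as enabled unless its variable is set to `false`. The parsing rule (case-insensitive, whitespace ignored) is an assumption, as are variable names for patterns other than the two shown above.

```python
import os
from typing import Mapping


def redaction_enabled(pattern: str, env: Mapping[str, str] = os.environ) -> bool:
    """A pattern stays on unless SPOOLED_REDACT_<PATTERN> is 'false'."""
    raw = env.get(f"SPOOLED_REDACT_{pattern.upper()}", "true")
    return raw.strip().lower() != "false"


# With only email redaction disabled, credit card redaction stays on.
settings = {"SPOOLED_REDACT_EMAIL": "false"}
email_on = redaction_enabled("email", settings)
card_on = redaction_enabled("credit_card", settings)
```

Defaulting to enabled means a missing or misspelled variable leaves redaction on, which is the safer failure mode.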

OpenTelemetry export

When using the optional OTEL exporter, Spooled exports structural span attributes (tool names, latencies, model names) to your configured OTEL backend. Tool argument values are not exported. If you ingest external OTEL spans via the SpooledSpanProcessor, those spans pass through with whatever attributes your instrumentation captures — Spooled does not strip content from externally-produced spans.
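
Because externally produced spans are not stripped, you may want to filter attributes in your own instrumentation before they are set on a span. The sketch below is a plain dictionary allowlist, independent of both Spooled and the OpenTelemetry SDK. The attribute names are examples and should be adjusted to whatever your instrumentation emits.

```python
from typing import Any, Mapping

# Example allowlist; attribute names are illustrative.
STRUCTURAL_ATTRIBUTES = frozenset({
    "tool.name",
    "gen_ai.request.model",
    "gen_ai.usage.input_tokens",
    "gen_ai.usage.output_tokens",
})


def structural_only(attributes: Mapping[str, Any]) -> dict[str, Any]:
    """Drop every span attribute that is not explicitly allowlisted."""
    return {
        key: value
        for key, value in attributes.items()
        if key in STRUCTURAL_ATTRIBUTES
    }


span_attributes = {
    "tool.name": "search_documents",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.usage.input_tokens": 512,
    "tool.arguments": '{"query": "refund policy for order 1182"}',
}
clean = structural_only(span_attributes)
```

An allowlist is used rather than a blocklist so that any attribute added later is dropped by default until someone decides it is structural.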