Threat Model

This document describes what Spooled defends against, what it does not, and the three architectural layers that enforce the boundary. Every defended claim links to a test fixture committed in the SDK source tree.

For procurement / compliance reviewers: the corpus referenced below lives in tests/privacy/test_corpus/ inside the spooled-ai package. To inspect, download the source distribution:

pip download spooled-ai --no-deps --no-binary=:all: -d ./spooled-src
tar -xzf ./spooled-src/spooled-ai-*.tar.gz -C ./spooled-src
ls ./spooled-src/spooled-ai-*/tests/privacy/test_corpus/

For deeper review under NDA, contact hello@spooled.ai.

What Spooled defends against

1. Customer-content leakage from LLM calls

Prompts, messages, and LLM responses are stripped to hashes by the auto-instrumentation hooks beforethe recorder is invoked. The recorder never receives raw content — only structural metadata (model name, token counts, finish reason, role) and content hashes.

Architecture: Layer 1 (spooled/hooks/) operates at the HTTP/SDK boundary; Layer 2 (spooled/recorder.py:_strip_to_structural) enforces structural-only persistence even for manually-recorded interactions.

Verification: the hook redaction layer is tested by tests/spooled/test_hook_privacy.py and the structural strip by tests/spooled/test_recorder.py. These predate the corpus and live outside tests/privacy/test_corpus/ because they exercise the SDK hooks rather than the schema-enforcement layer.

2. Accidental PII in developer-controlled metadata

When a developer passes metadata={"user_email": "..."} to record_interaction, the privacy enforcement layer drops the undeclared key in lenient mode (with a structured warning) or raises PrivacyViolationError in strict mode.

The set of allowed metadata keys is the union of (a) Spooled's built-in defaults — common structural signals like tokens_total, latency_ms, intent, model — and (b) keys declared in your repo's .spooled/privacy.yml. The config file is committable and reviewable in pull requests; its git history is your audit trail.

Fixture: pii_in_metadata.json

3. Auth tokens and secrets disguised as benign keys

Default-allowed keys are an explicit allowlist. Creative naming does not bypass it: my_special_token, xtoken, tokenABC are all dropped or rejected because none of them appear in DEFAULT_METADATA_KEYS. The allowlist intentionally permits tokens_total, cached_tokens, and other usage-metering keys so observability still works.

Fixture: auth_tokens_creative_naming.json

4. Unicode evasion of redaction

Unicode homoglyphs (Cyrillic е standing in for Latin e), fullwidth characters, RTL marks, and similar encoding tricks cannot smuggle PII past the allowlist. The allowlist matches by exact string equality — еmail (with Cyrillic е, U+0435) does not match the Latin-defined email, so it is dropped as an undeclared key.

Fixture: unicode_bypass.json

5. PII nested arbitrarily deep in undeclared keys

Enforcement is at the metadata-key boundary, not value-content inspection. A nested structure like {"customer": {"contact": {"email": "..."}}} is dropped because the top-level customerkey is undeclared — Spooled does not need to recurse into untrusted subtrees.

Fixture: nested_dicts.json

6. Encoded PII in undeclared keys

Spooled does not decode base64, URL-encoding, hex, or other transformations to scan their decoded value. It does not need to: a base64-encoded SSN inside an undeclared key (e.g., encoded_ssn) is dropped wholesale because the key is undeclared, regardless of the value's content.

Fixture: encoded_pii.json

7. Tool-call arguments containing PII

Per-tool argument allowlists default to deny. If a tool isn't listed in allowed_tool_arg_keys, none of its arguments persist. The send_email tool carrying recipient_email and customer_data.ssn has all three keys stripped in lenient mode and raises in strict mode.

Fixture: tool_args_smuggling.json

What Spooled does NOT defend against

These boundaries are explicit. Compliance teams should read this section as carefully as the previous one.

  • A developer who deliberately renames a sensitive field to a declared-safe key (e.g., calls a customer email intent). Spooled trusts your schema; it cannot read your mind. Code review is your defense here.
  • A malicious third-party Python package that monkey-patches Recorder.record_interaction to bypass the enforcement layer. This is OS-level capability isolation territory, not a library's responsibility.
  • A compromised LLM provider that returns adversarially-crafted content designed to evade structural stripping. The hook layer hashes content rather than parsing it, so there is no parser to confuse — but if the provider colludes with an attacker to ship PII back as a metadata field, the schema layer catches that (defense #2).
  • Network eavesdroppers on traffic between SDK and backend. TLS is your responsibility. The Spooled backend requires HTTPS; the SDK uses httpx defaults (TLS certificate validation enabled).
  • An attacker who compromises your DynamoDB / S3 backend.Spooled minimizes what's transmitted (hashes for content, structural metadata for everything else), but stored hashes and tool graphs can still reveal behavior patterns(e.g., “this agent calls escalate_to_humanon 30% of runs”). If pattern-level inference is a concern, use the SDK in local-only mode by setting SPOOLED_BACKEND_URL="".
  • Third-party LLM provider retention of your prompts.OpenAI, Anthropic, Bedrock, and similar providers retain their own logs of the raw prompts and responses your agent sends and receives. Spooled never sees that data — but the providers do. Negotiate retention terms with them directly; Spooled cannot defend a boundary it does not control.
  • Lenient-mode warning suppression. Lenient mode emits a structured warning via structlog when an undeclared key is dropped. If your application configures structlog below the WARNING level, the notification is silenced and undeclared keys are dropped without surface signal. Auditors should verify their log pipeline captures WARNING-level events from spooled.privacy.enforcement, or use strict mode in CI to raise instead.

The three architectural layers

LayerModuleEnforcesBypass condition
1. Hook redactionspooled/hooks/LLM message content replaced with {role, content_hash} before recorder sees itA non-auto-instrumented HTTP/LLM client. Mitigation: use explicit wrap_* helpers for custom clients.
2. Structural strippingspooled/recorder.py:_strip_to_structuralAllowlist of structural keys; everything else collapses to {redacted: true, payload_hash}A hook that pre-formats untrusted data to look structural. Mitigation: corpus tests verify hook output shape.
3. Schema + regex enforcementspooled/privacy/, spooled/redaction.pyAllowlist for metadata + per-tool arguments; regex for credential-shaped valuesA field renamed to an allowed key. Mitigation: developer responsibility + reviewable .spooled/privacy.yml.

Strict mode opt-in

Setting SPOOLED_STRICT_PRIVACY=1in the environment flips the enforcement layer from “redact + warn” to “raise on first violation.”

  • Lenient mode (default).Undeclared keys are removed from the persisted trace; a structured warning is logged. Existing code that passes a mix of declared and undeclared keys keeps working — the agent doesn't crash; the warning surfaces the issue in logs.
  • Strict mode. The first undeclared key raises PrivacyViolationError at record_interaction time. The trace is marked failed; the offending interaction is not persisted. Earlier interactions in the same trace remain (their hashes still chain).

Recommended adoption: lenient in dev, strict in CI for compliance-sensitive projects.

- name: Run agent CI
  env:
    SPOOLED_STRICT_PRIVACY: "1"
  run: pytest tests/agents/

Reproducibility — the test corpus

Claims #2–#7 each link to a JSON fixture in tests/privacy/test_corpus/. Claim #1 (content stripping by the hooks) is verified by the SDK-level tests under tests/spooled/— it pre-exists the corpus and operates at a different layer.

Two additional fixtures (redacted_already.json and safe_passthrough.json) are positive cases: they prove the engine does NOT over-redact when inputs are already clean or already declared.

The corpus runner at tests/privacy/test_corpus.py iterates every fixture, calls Recorder.record_interaction with the fixture's input block, and asserts the persisted JSONL matches the expected.lenient and expected.strict blocks.

For deeper coverage, the property-based tests at tests/privacy/test_property_based.py fuzz the boundary with Hypothesis-generated inputs and assert two invariants hold for ANY input:

  • Lenient invariant: persisted metadata keys are always a subset of the allowlist.
  • Strict invariant: the call either succeeds with allowlisted keys, or raises PrivacyViolationError. There is no third outcome.

The default profile runs 50 examples per invariant (~2s); the deep profile (HYPOTHESIS_PROFILE=deep or make test-privacy-deep) runs 2000 examples per invariant (~90s) and is the release gate.

Changelog of policy changes

Material changes to the claims above are recorded here with dates. When a claim is added, weakened, or strengthened, this section is updated in the same commit.

  • 2026-05-23trace.environment now also transmits four installation-shape signals: spooled_version, hooks_active, cli_origin, framework_detected. These describe which SDK version is running, which auto-instrumentation hooks fired, and which agent frameworks are importable in the user's environment. cli_origin is constrained to a known allowlist ({"ci_run"}) at the recorder boundary so arbitrary user-controlled env-var content cannot reach the backend. framework_detected uses importlib.util.find_spec (presence-only; no framework module is imported by the probe).
  • 2026-05-22— Initial public threat model. Three-layer architecture documented; eight-fixture corpus committed (six adversarial + two positive); strict mode introduced (off by default).
Note
Spooled is open about what it defends and what it doesn't. If you find a bypass not described above, report it to hello@spooled.ai. The corpus is committed source; we expect users to test the claims rather than trust them.