Threat Model
This document describes what Spooled defends against, what it does not, and the three architectural layers that enforce the boundary. Every defended claim links to a test fixture committed in the SDK source tree.
For procurement / compliance reviewers: the corpus referenced below lives in tests/privacy/test_corpus/ inside the spooled-ai package. To inspect, download the source distribution:
pip download spooled-ai --no-deps --no-binary=:all: -d ./spooled-src tar -xzf ./spooled-src/spooled-ai-*.tar.gz -C ./spooled-src ls ./spooled-src/spooled-ai-*/tests/privacy/test_corpus/
For deeper review under NDA, contact hello@spooled.ai.
What Spooled defends against
1. Customer-content leakage from LLM calls
Prompts, messages, and LLM responses are stripped to hashes by the auto-instrumentation hooks beforethe recorder is invoked. The recorder never receives raw content — only structural metadata (model name, token counts, finish reason, role) and content hashes.
Architecture: Layer 1 (spooled/hooks/) operates at the HTTP/SDK boundary; Layer 2 (spooled/recorder.py:_strip_to_structural) enforces structural-only persistence even for manually-recorded interactions.
Verification: the hook redaction layer is tested by tests/spooled/test_hook_privacy.py and the structural strip by tests/spooled/test_recorder.py. These predate the corpus and live outside tests/privacy/test_corpus/ because they exercise the SDK hooks rather than the schema-enforcement layer.
2. Accidental PII in developer-controlled metadata
When a developer passes metadata={"user_email": "..."} to record_interaction, the privacy enforcement layer drops the undeclared key in lenient mode (with a structured warning) or raises PrivacyViolationError in strict mode.
The set of allowed metadata keys is the union of (a) Spooled's built-in defaults — common structural signals like tokens_total, latency_ms, intent, model — and (b) keys declared in your repo's .spooled/privacy.yml. The config file is committable and reviewable in pull requests; its git history is your audit trail.
Fixture: pii_in_metadata.json
3. Auth tokens and secrets disguised as benign keys
Default-allowed keys are an explicit allowlist. Creative naming does not bypass it: my_special_token, xtoken, tokenABC are all dropped or rejected because none of them appear in DEFAULT_METADATA_KEYS. The allowlist intentionally permits tokens_total, cached_tokens, and other usage-metering keys so observability still works.
Fixture: auth_tokens_creative_naming.json
4. Unicode evasion of redaction
Unicode homoglyphs (Cyrillic е standing in for Latin e), fullwidth characters, RTL marks, and similar encoding tricks cannot smuggle PII past the allowlist. The allowlist matches by exact string equality — еmail (with Cyrillic е, U+0435) does not match the Latin-defined email, so it is dropped as an undeclared key.
Fixture: unicode_bypass.json
5. PII nested arbitrarily deep in undeclared keys
Enforcement is at the metadata-key boundary, not value-content inspection. A nested structure like {"customer": {"contact": {"email": "..."}}} is dropped because the top-level customerkey is undeclared — Spooled does not need to recurse into untrusted subtrees.
Fixture: nested_dicts.json
6. Encoded PII in undeclared keys
Spooled does not decode base64, URL-encoding, hex, or other transformations to scan their decoded value. It does not need to: a base64-encoded SSN inside an undeclared key (e.g., encoded_ssn) is dropped wholesale because the key is undeclared, regardless of the value's content.
Fixture: encoded_pii.json
7. Tool-call arguments containing PII
Per-tool argument allowlists default to deny. If a tool isn't listed in allowed_tool_arg_keys, none of its arguments persist. The send_email tool carrying recipient_email and customer_data.ssn has all three keys stripped in lenient mode and raises in strict mode.
Fixture: tool_args_smuggling.json
What Spooled does NOT defend against
These boundaries are explicit. Compliance teams should read this section as carefully as the previous one.
- A developer who deliberately renames a sensitive field to a declared-safe key (e.g., calls a customer email
intent). Spooled trusts your schema; it cannot read your mind. Code review is your defense here. - A malicious third-party Python package that monkey-patches
Recorder.record_interactionto bypass the enforcement layer. This is OS-level capability isolation territory, not a library's responsibility. - A compromised LLM provider that returns adversarially-crafted content designed to evade structural stripping. The hook layer hashes content rather than parsing it, so there is no parser to confuse — but if the provider colludes with an attacker to ship PII back as a metadata field, the schema layer catches that (defense #2).
- Network eavesdroppers on traffic between SDK and backend. TLS is your responsibility. The Spooled backend requires HTTPS; the SDK uses httpx defaults (TLS certificate validation enabled).
- An attacker who compromises your DynamoDB / S3 backend.Spooled minimizes what's transmitted (hashes for content, structural metadata for everything else), but stored hashes and tool graphs can still reveal behavior patterns(e.g., “this agent calls
escalate_to_humanon 30% of runs”). If pattern-level inference is a concern, use the SDK in local-only mode by settingSPOOLED_BACKEND_URL="". - Third-party LLM provider retention of your prompts.OpenAI, Anthropic, Bedrock, and similar providers retain their own logs of the raw prompts and responses your agent sends and receives. Spooled never sees that data — but the providers do. Negotiate retention terms with them directly; Spooled cannot defend a boundary it does not control.
- Lenient-mode warning suppression. Lenient mode emits a structured warning via
structlogwhen an undeclared key is dropped. If your application configuresstructlogbelow the WARNING level, the notification is silenced and undeclared keys are dropped without surface signal. Auditors should verify their log pipeline captures WARNING-level events fromspooled.privacy.enforcement, or use strict mode in CI to raise instead.
The three architectural layers
| Layer | Module | Enforces | Bypass condition |
|---|---|---|---|
| 1. Hook redaction | spooled/hooks/ | LLM message content replaced with {role, content_hash} before recorder sees it | A non-auto-instrumented HTTP/LLM client. Mitigation: use explicit wrap_* helpers for custom clients. |
| 2. Structural stripping | spooled/recorder.py:_strip_to_structural | Allowlist of structural keys; everything else collapses to {redacted: true, payload_hash} | A hook that pre-formats untrusted data to look structural. Mitigation: corpus tests verify hook output shape. |
| 3. Schema + regex enforcement | spooled/privacy/, spooled/redaction.py | Allowlist for metadata + per-tool arguments; regex for credential-shaped values | A field renamed to an allowed key. Mitigation: developer responsibility + reviewable .spooled/privacy.yml. |
Strict mode opt-in
Setting SPOOLED_STRICT_PRIVACY=1in the environment flips the enforcement layer from “redact + warn” to “raise on first violation.”
- Lenient mode (default).Undeclared keys are removed from the persisted trace; a structured warning is logged. Existing code that passes a mix of declared and undeclared keys keeps working — the agent doesn't crash; the warning surfaces the issue in logs.
- Strict mode. The first undeclared key raises
PrivacyViolationErroratrecord_interactiontime. The trace is marked failed; the offending interaction is not persisted. Earlier interactions in the same trace remain (their hashes still chain).
Recommended adoption: lenient in dev, strict in CI for compliance-sensitive projects.
- name: Run agent CI env: SPOOLED_STRICT_PRIVACY: "1" run: pytest tests/agents/
Reproducibility — the test corpus
Claims #2–#7 each link to a JSON fixture in tests/privacy/test_corpus/. Claim #1 (content stripping by the hooks) is verified by the SDK-level tests under tests/spooled/— it pre-exists the corpus and operates at a different layer.
Two additional fixtures (redacted_already.json and safe_passthrough.json) are positive cases: they prove the engine does NOT over-redact when inputs are already clean or already declared.
The corpus runner at tests/privacy/test_corpus.py iterates every fixture, calls Recorder.record_interaction with the fixture's input block, and asserts the persisted JSONL matches the expected.lenient and expected.strict blocks.
For deeper coverage, the property-based tests at tests/privacy/test_property_based.py fuzz the boundary with Hypothesis-generated inputs and assert two invariants hold for ANY input:
- Lenient invariant: persisted metadata keys are always a subset of the allowlist.
- Strict invariant: the call either succeeds with allowlisted keys, or raises
PrivacyViolationError. There is no third outcome.
The default profile runs 50 examples per invariant (~2s); the deep profile (HYPOTHESIS_PROFILE=deep or make test-privacy-deep) runs 2000 examples per invariant (~90s) and is the release gate.
Changelog of policy changes
Material changes to the claims above are recorded here with dates. When a claim is added, weakened, or strengthened, this section is updated in the same commit.
- 2026-05-23 —
trace.environmentnow also transmits four installation-shape signals:spooled_version,hooks_active,cli_origin,framework_detected. These describe which SDK version is running, which auto-instrumentation hooks fired, and which agent frameworks are importable in the user's environment.cli_originis constrained to a known allowlist ({"ci_run"}) at the recorder boundary so arbitrary user-controlled env-var content cannot reach the backend.framework_detectedusesimportlib.util.find_spec(presence-only; no framework module is imported by the probe). - 2026-05-22— Initial public threat model. Three-layer architecture documented; eight-fixture corpus committed (six adversarial + two positive); strict mode introduced (off by default).
hello@spooled.ai. The corpus is committed source; we expect users to test the claims rather than trust them.