Configure a PII filter

Detect and act on PII in chat input and memory writes, configured per agent.

The goal

An agent that automatically redacts (or blocks) sensitive content before it reaches the model or memory store, with every hit logged for audit.

Open the safety tab.

On the agent, open Safety -> "PII filters".
Pick categories.

Toggle the default categories: email, phone, credit_card, ssn, api_key, iban. For each category, choose a policy:
- flag: pass through, emit an event for review.
- redact: mask the match in the model input or memory write.
- block: refuse the turn or memory write entirely.
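
The three policies can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not the product's implementation: the EMAIL pattern, the apply_policy function, the Result type, and the event fields are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern for illustration only; the built-in email
# detector used by the filter is not documented here.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class BlockedError(Exception):
    """Raised when a 'block' policy refuses the turn or memory write."""


@dataclass
class Result:
    text: str                                   # text passed on to the model or memory store
    events: list = field(default_factory=list)  # one entry per match, for review


def apply_policy(text: str, category: str, pattern: re.Pattern, policy: str) -> Result:
    matches = pattern.findall(text)
    if not matches:
        return Result(text)
    events = [{"category": category, "policy": policy} for _ in matches]
    if policy == "flag":
        # Pass the text through unchanged, but record the hit.
        return Result(text, events)
    if policy == "redact":
        # Mask each match with a category marker.
        return Result(pattern.sub(f"[REDACTED:{category}]", text), events)
    if policy == "block":
        # Refuse entirely; nothing reaches the model or memory store.
        raise BlockedError(f"{category} detected; turn refused")
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    out = apply_policy("Contact test@example.com today", "email", EMAIL, "redact")
    print(out.text)  # Contact [REDACTED:email] today
```

Modeling block as an exception mirrors the fact that a blocked turn produces no usable text at all, whereas flag and redact both return something to pass on.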
Add custom regex (optional).
- Name: internal_id.
- Pattern: \bACME-\d{6}\b.
- Policy: redact.
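
Before saving a custom pattern, it can help to check it locally against strings that should and should not match. The sketch below uses Python's re module, on the assumption that the filter's regex engine treats \b and \d the same way; the sample strings and the matches_internal_id helper are made up for the example.

```python
import re

# The custom pattern from the step above.
INTERNAL_ID = re.compile(r"\bACME-\d{6}\b")


def matches_internal_id(text: str) -> bool:
    """Return True if the text contains an ACME id with exactly six digits."""
    return INTERNAL_ID.search(text) is not None


if __name__ == "__main__":
    samples = [
        "Ticket ACME-123456 is open",   # exactly six digits: match
        "ACME-12345 is too short",      # five digits: no match
        "ACME-1234567 is too long",     # seven digits: no match
        "XACME-123456 has a prefix",    # no word boundary before ACME: no match
    ]
    for s in samples:
        print(matches_internal_id(s), s)
```

The two \b anchors are what reject the longer and prefixed variants; without them the six-digit core would still match inside a longer token.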
Save.
Test.

In the chat panel, send a message containing a fake match, such as the email address test@example.com. With the redact policy, the matched content appears in the response as [REDACTED:email], and a safety event card appears on the governance page.
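
If you want to automate this check, a small helper can pull the redaction markers out of a response. This is a minimal sketch that assumes markers always take the [REDACTED:<category>] form shown above; the redacted_categories function is a hypothetical name, not part of the product.

```python
import re
from typing import List

# Matches markers such as [REDACTED:email] and captures the category name.
MARKER = re.compile(r"\[REDACTED:([A-Za-z0-9_]+)\]")


def redacted_categories(response: str) -> List[str]:
    """Return the categories of all redaction markers, in order of appearance."""
    return MARKER.findall(response)


if __name__ == "__main__":
    reply = "Reach me at [REDACTED:email] or [REDACTED:phone]."
    print(redacted_categories(reply))  # ['email', 'phone']
```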

A turn with a matched pattern triggers a safety.event webhook.
The governance page (/agent-governance) lists the event with category, policy, and decrypted match.
The audit log shows a row with the redacted match.

Subscribe to the safety.event webhook. Filter for events with category credit_card and policy block, and route them to your incident channel.
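
On the receiving side, the filter might look like the following sketch. The payload shape is an assumption: the type, category, and policy field names are inferred from the event attributes mentioned above, not from a documented schema, and notify stands in for whatever posts to your incident channel.

```python
from typing import Callable, Iterable


def should_escalate(event: dict) -> bool:
    """True only for blocked credit card hits (field names are assumed)."""
    return (
        event.get("type") == "safety.event"
        and event.get("category") == "credit_card"
        and event.get("policy") == "block"
    )


def route(events: Iterable[dict], notify: Callable[[dict], None]) -> int:
    """Forward matching events to the incident channel; return how many were sent."""
    sent = 0
    for event in events:
        if should_escalate(event):
            notify(event)
            sent += 1
    return sent


if __name__ == "__main__":
    incoming = [
        {"type": "safety.event", "category": "email", "policy": "redact"},
        {"type": "safety.event", "category": "credit_card", "policy": "block"},
        {"type": "safety.event", "category": "credit_card", "policy": "flag"},
    ]
    print(route(incoming, notify=print))  # prints the blocked event, then 1
```

Using event.get rather than indexing keeps the handler from failing on payloads that lack one of the assumed fields.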

The default phone regex targets US numbers and misses international formats. Add custom patterns for the formats you need.
The block policy raises a turn error that users see; use redact instead for first-party data the agent must acknowledge.
The filter runs on input, not output. Output filtering is configured separately on the agent's outputFilters.
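
For the international phone gap noted above, one possible custom pattern covers compact E.164-style numbers: a plus sign, a non-zero leading digit, and up to 15 digits in total. The sketch below is an assumption-laden starting point, not the product's phone detector: the seven-digit minimum is an arbitrary floor to cut false positives, and numbers written with spaces, dashes, or parentheses would need additional patterns.

```python
import re

# Hypothetical custom pattern: "+" followed by 7 to 15 digits, first digit 1-9.
INTL_PHONE = re.compile(r"\+[1-9]\d{6,14}\b")


def find_intl_phones(text: str) -> list:
    """Return all compact international numbers found in the text."""
    return INTL_PHONE.findall(text)


if __name__ == "__main__":
    print(find_intl_phones("Call +442071838750 or +14155552671."))
    # ['+442071838750', '+14155552671']
    print(find_intl_phones("Local number 555-1234"))
    # []
```

Entered as a custom regex with the redact policy, a pattern like this would sit alongside the default phone category rather than replace it.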