Filter Strategies

A filter strategy controls what happens to a detected PII token. Each identifier type supports a Strategies list; the first strategy whose condition evaluates to true is applied. If the list is empty, the default REDACT strategy is used.

Available Strategies

Strategy	Constant	Description
`REDACT`	`AbstractFilterStrategy.Redact`	Replace the token with a formatted redaction label
`RANDOM_REPLACE`	`AbstractFilterStrategy.RandomReplace`	Replace with a realistic, type-appropriate fake value
`STATIC_REPLACE`	`AbstractFilterStrategy.StaticReplace`	Replace with a fixed string
`CRYPTO_REPLACE`	`AbstractFilterStrategy.CryptoReplace`	Replace with AES-GCM encrypted ciphertext
`FPE_ENCRYPT_REPLACE`	`AbstractFilterStrategy.FpeEncryptReplace`	Format-preserving encryption (FF3-1)
`HASH_SHA256_REPLACE`	`AbstractFilterStrategy.HashSha256Replace`	Replace with the SHA-256 hex digest
`LAST_4`	`AbstractFilterStrategy.Last4`	Keep only the last 4 characters
`MASK`	`AbstractFilterStrategy.Mask`	Overwrite characters with a mask character
`ABBREVIATE`	`AbstractFilterStrategy.Abbreviate`	Reduce the token to the initials of its words
`MAP_REPLACE`	`AbstractFilterStrategy.MapReplace`	Replace from a lookup table, then a generator, then a fallback strategy
`SAME`	`AbstractFilterStrategy.Same`	Leave the token unchanged (mark as detected but not replaced)
`TRUNCATE`	`AbstractFilterStrategy.Truncate`	Keep only the first character
`SHIFT_DATE`	`AbstractFilterStrategy.ShiftDate`	Shift a detected date by a configurable offset (date filters only)

Strategy Details

REDACT

Replaces the token with a formatted label. The redactionFormat string may contain:

%t — replaced with the filter type name (e.g. ssn, email-address)
%l — replaced with the token's classification label (if any)

Default format: {{{REDACTED-%t}}}

new SsnFilterStrategy
{
    Strategy = "REDACT",
    RedactionFormat = "[REMOVED-%t]"
}

{ "strategy": "REDACT", "redactionFormat": "[REMOVED-%t]" }
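The placeholder expansion described above can be sketched as follows. FormatRedaction is a hypothetical helper written for illustration, not part of the library, and treating an empty format as "use the default" is an assumption.

```csharp
using System;

// Hypothetical helper (not a library API): expands %t and %l in a redaction format.
static string FormatRedaction(string format, string filterType, string label)
{
    // Assumption: an empty format means "use the documented default".
    if (string.IsNullOrEmpty(format)) format = "{{{REDACTED-%t}}}";
    return format.Replace("%t", filterType).Replace("%l", label);
}

Console.WriteLine(FormatRedaction("[REMOVED-%t]", "ssn", ""));   // [REMOVED-ssn]
Console.WriteLine(FormatRedaction("", "email-address", ""));     // {{{REDACTED-email-address}}}
```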

RANDOM_REPLACE

Replaces the token with a realistic, type-appropriate fake value generated by the anonymization service for the filter type (for example, a fake SSN-shaped value for an SSN, a fake date for a date). When no anonymization service is wired in (for strategies constructed directly, outside FilterService), it falls back to a random GUID.

new SsnFilterStrategy
{
    Strategy = "RANDOM_REPLACE"
}

Replacement scope — replacementScope controls whether a token's replacement is reused:

`replacementScope`	Behaviour
`"DOCUMENT"` (default)	Each occurrence is anonymized independently.
`"CONTEXT"`	A token's replacement is reused across the context, so the same input value always maps to the same fake value (referential integrity).

new SsnFilterStrategy
{
    Strategy = "RANDOM_REPLACE",
    ReplacementScope = "CONTEXT"   // reuse the same fake value for the same SSN
}
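To illustrate the difference between the two scopes, here is a minimal sketch of a context-scoped replacement cache. The cache, the ReplaceToken function, and the GUID generator are all hypothetical stand-ins; the library's own context service may be structured differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache keyed by (context, token); not the library's actual type.
var cache = new Dictionary<(string Context, string Token), string>();

string ReplaceToken(string context, string token, string scope, Func<string> generate)
{
    // DOCUMENT scope: every occurrence gets a fresh value.
    if (scope != "CONTEXT") return generate();

    // CONTEXT scope: generate once per (context, token), then reuse.
    var key = (context, token);
    if (!cache.TryGetValue(key, out var replacement))
    {
        replacement = generate();
        cache[key] = replacement;
    }
    return replacement;
}

Func<string> fake = () => Guid.NewGuid().ToString();
string first = ReplaceToken("patient-1", "123-45-6789", "CONTEXT", fake);
string second = ReplaceToken("patient-1", "123-45-6789", "CONTEXT", fake);
Console.WriteLine(first == second);   // True
```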

Choosing the fake values — by default RANDOM_REPLACE generates realistic values. You can override this:

Property	JSON key	Description
`AnonymizationCandidates`	`anonymizationCandidates`	When non-empty, replacements are drawn from this explicit list of values instead of being generated.
`AnonymizationMethod`	`anonymizationMethod`	The generation method when no candidates are supplied (defaults to realistic generation).

new FirstNameFilterStrategy
{
    Strategy = "RANDOM_REPLACE",
    AnonymizationCandidates = new List<string> { "Alex", "Jordan", "Riley" }
}

See Context Service for details on referential integrity with CONTEXT scope.

STATIC_REPLACE

Replaces the token with a fixed string supplied in staticReplacement. Falls back to the REDACT format when staticReplacement is empty.

new EmailAddressFilterStrategy
{
    Strategy = "STATIC_REPLACE",
    StaticReplacement = "user@redacted.invalid"
}

{ "strategy": "STATIC_REPLACE", "staticReplacement": "user@redacted.invalid" }

CRYPTO_REPLACE

Encrypts the token with AES-GCM and replaces it with the ciphertext wrapped in double braces, e.g. {{<base64>}}. The Base64 payload is nonce || ciphertext || tag. Requires a Crypto block on the Policy whose key is a hex-encoded 16-, 24-, or 32-byte AES key. Falls back to REDACT if the policy has no Crypto configuration.

var policy = new Policy
{
    Name = "encrypted",
    Crypto = new Crypto
    {
        // Hex-encoded AES key (32 hex chars = 16 bytes, 48 = 24, 64 = 32).
        Key = Convert.ToHexString(aesKey)
    },
    Identifiers = new Identifiers
    {
        Ssn = new Ssn
        {
            Strategies = new List<SsnFilterStrategy>
            {
                new SsnFilterStrategy { Strategy = "CRYPTO_REPLACE" }
            }
        }
    }
};

The key may also be supplied as an env:NAME reference, which is resolved from the environment variable NAME at filter time.
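For orientation, the payload layout described above can be reproduced with the standard .NET AesGcm class. This is a sketch under assumptions: the 12-byte nonce, 16-byte tag, and UTF-8 encoding are not stated in this document, and EncryptToken is a hypothetical helper, not the library's implementation. It also shows one way to produce a hex-encoded key for Crypto.Key.

```csharp
#pragma warning disable SYSLIB0053 // one-argument AesGcm constructor; on .NET 8+ new AesGcm(key, 16) is preferred
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: AES-GCM encrypt a token and wrap Base64(nonce || ciphertext || tag) in {{ }}.
static string EncryptToken(string token, byte[] key)
{
    byte[] nonce = RandomNumberGenerator.GetBytes(12);   // assumed nonce size
    byte[] plaintext = Encoding.UTF8.GetBytes(token);    // assumed encoding
    byte[] ciphertext = new byte[plaintext.Length];
    byte[] tag = new byte[16];                           // assumed tag size

    using var aes = new AesGcm(key);
    aes.Encrypt(nonce, plaintext, ciphertext, tag);

    byte[] payload = new byte[nonce.Length + ciphertext.Length + tag.Length];
    nonce.CopyTo(payload, 0);
    ciphertext.CopyTo(payload, nonce.Length);
    tag.CopyTo(payload, nonce.Length + ciphertext.Length);
    return "{{" + Convert.ToBase64String(payload) + "}}";
}

byte[] aesKey = RandomNumberGenerator.GetBytes(32);      // 32 bytes = AES-256
string hexKey = Convert.ToHexString(aesKey);             // 64 hex characters, suitable for Crypto.Key
Console.WriteLine(hexKey.Length);                        // 64
Console.WriteLine(EncryptToken("123-45-6789", aesKey));
```

Because the nonce is random, encrypting the same token twice yields different ciphertexts.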

FPE_ENCRYPT_REPLACE

Format-preserving encryption using the FF3-1 cipher. Requires an Fpe block on the Policy with a hex-encoded key and a tweak. The tweak is required (FF3-1 needs a 56- or 64-bit tweak); the strategy falls back to REDACT when the Fpe block, key, or tweak is missing.

var policy = new Policy
{
    Name = "fpe-policy",
    Fpe = new Fpe
    {
        Key   = "EF4359D8D580AA4F7F036D6F04FC6A94",  // hex-encoded key
        Tweak = "D8E7920AFA330A73"                    // hex-encoded tweak (required)
    },
    Identifiers = new Identifiers
    {
        CreditCard = new CreditCard
        {
            Strategies = new List<CreditCardFilterStrategy>
            {
                new CreditCardFilterStrategy { Strategy = "FPE_ENCRYPT_REPLACE" }
            }
        }
    }
};

Like the AES key, the FPE key and tweak may be supplied as env:NAME references.

HASH_SHA256_REPLACE

Replaces the token with its lower-case SHA-256 hex digest. When salt: true is set, a random salt is appended to the token before hashing.

new EmailAddressFilterStrategy
{
    Strategy = "HASH_SHA256_REPLACE",
    Salt = true     // append a random salt before hashing
}

{ "strategy": "HASH_SHA256_REPLACE", "salt": true }
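As a point of reference, an unsalted lower-case SHA-256 hex digest can be computed with the standard library as below. HashToken is a hypothetical helper, and UTF-8 encoding of the token is an assumption.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: lower-case SHA-256 hex digest of a token (UTF-8 assumed).
static string HashToken(string token)
{
    byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(token));
    return Convert.ToHexString(digest).ToLowerInvariant();
}

Console.WriteLine(HashToken("abc"));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```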

LAST_4

Keeps the last four characters of the token and discards the rest. If the token is shorter than four characters, the full token is returned.

new CreditCardFilterStrategy { Strategy = "LAST_4" }

Output example: 1234 (from 4111-1111-1111-1234).
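The rule is small enough to sketch directly. Last4 is a hypothetical helper that mirrors the behavior described above, not the library's code.

```csharp
using System;

// Hypothetical helper: keep the last four characters; short tokens are returned whole.
static string Last4(string token) =>
    token.Length <= 4 ? token : token.Substring(token.Length - 4);

Console.WriteLine(Last4("4111-1111-1111-1234"));   // 1234
Console.WriteLine(Last4("abc"));                   // abc
```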

MASK

Replaces characters with a mask character (default *). Use maskLength to control how many characters are written:

`maskLength` value	Behaviour
`"same"` (default)	Mask has the same length as the original token
Integer string, e.g. `"6"`	Mask has exactly that many characters (capped at token length)

new SsnFilterStrategy
{
    Strategy = "MASK",
    MaskCharacter = "#",
    MaskLength = "same"
}

{ "strategy": "MASK", "maskCharacter": "#", "maskLength": "6" }
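The maskLength rules can be sketched as follows. Mask is a hypothetical helper; it assumes every character, including separators such as hyphens, is overwritten, which this document does not spell out.

```csharp
using System;

// Hypothetical helper: build a mask of the requested length, capped at the token length.
static string Mask(string token, char maskCharacter = '*', string maskLength = "same")
{
    int length = token.Length;                                   // "same": match the token
    if (maskLength != "same" && int.TryParse(maskLength, out int requested))
        length = Math.Clamp(requested, 0, token.Length);         // integer: capped at token length
    return new string(maskCharacter, length);
}

Console.WriteLine(Mask("123-45-6789", '#'));        // ###########
Console.WriteLine(Mask("123-45-6789", '#', "6"));   // ######
```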

SAME

Marks the token as detected but leaves the text unchanged. Useful when you want spans and metadata without altering the output.

new PhoneNumberFilterStrategy { Strategy = "SAME" }

ABBREVIATE

Replaces the token with the uppercase initial of each whitespace-separated word.

new FirstNameFilterStrategy { Strategy = "ABBREVIATE" }

Output example: JS (from John Smith).
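A minimal sketch of the abbreviation rule, using a hypothetical Abbreviate helper. The set of whitespace characters it splits on is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical helper: upper-case initial of each whitespace-separated word.
static string Abbreviate(string token)
{
    // Split on common whitespace and drop empty entries produced by repeated spaces.
    string[] words = token.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    return new string(words.Select(word => char.ToUpperInvariant(word[0])).ToArray());
}

Console.WriteLine(Abbreviate("John Smith"));   // JS
```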

MAP_REPLACE

Replaces a detected value using a lookup table, resolving each token in this order:

1. Lookup table. If the token is a key in the table, its mapped value is used.
2. Generator. If the token is not in the table and a generator is configured, the generator produces a replacement. The value is rejected, and the strategy falls through to the fallback, if the generator fails or times out, returns a blank value, returns the original token again (compared case-insensitively), or produces a value that itself contains detectable PII. Each generated value is re-scanned through the filter pipeline to confirm the generator did not reintroduce sensitive data.
3. Fallback strategy. fallbackStrategy (default REDACT) is applied. A detected value is never left in the clear.
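The resolution order above can be sketched as a single function. Everything here is hypothetical scaffolding: Resolve, the containsPii re-scan hook, and the fallback delegate stand in for the library's internals, which are not shown in this document.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch of the lookup -> generator -> fallback order.
static string Resolve(
    string token,
    IReadOnlyDictionary<string, string> table,
    Func<string, string?>? generator,      // null when no generator is configured
    Func<string, bool> containsPii,        // stand-in for re-scanning the generated value
    Func<string, string> fallback)
{
    // 1. Lookup table.
    if (table.TryGetValue(token, out var mapped)) return mapped;

    // 2. Generator, subject to the rejection rules.
    if (generator != null)
    {
        string? candidate;
        try { candidate = generator(token); }
        catch (Exception) { candidate = null; }   // failure or timeout

        if (!string.IsNullOrWhiteSpace(candidate)
            && !string.Equals(candidate, token, StringComparison.OrdinalIgnoreCase)
            && !containsPii(candidate))
        {
            return candidate;
        }
    }

    // 3. Fallback strategy: the value is never left in the clear.
    return fallback(token);
}

var table = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { ["Smith"] = "Jones" };
Console.WriteLine(Resolve("smith", table, null, _ => false, _ => "[REDACTED]"));                      // Jones
Console.WriteLine(Resolve("Doe", table, t => t.ToUpperInvariant(), _ => false, _ => "[REDACTED]"));   // [REDACTED]
```

In the second call the generator returns the original token in a different case, so the value is rejected and the fallback applies.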

The table is built from inline mappings and/or tab-separated mappingFiles (one key<TAB>value pair per row), merged once when the filter is built. Inline mappings override entries loaded from files; among files, a later file overrides an earlier one for a duplicate key. caseSensitive (default false) controls whether keys and tokens are matched case-sensitively; by default, matching ignores case.
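The merge precedence can be illustrated with a short sketch. BuildTable is a hypothetical helper that takes file contents as already-read lines so it stays self-contained; how the library treats malformed rows is not documented here, so skipping them is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: merge tab-separated file rows, then inline mappings, into one table.
static Dictionary<string, string> BuildTable(
    IEnumerable<IEnumerable<string>> files,
    IDictionary<string, string> inline,
    bool caseSensitive = false)
{
    var comparer = caseSensitive ? StringComparer.Ordinal : StringComparer.OrdinalIgnoreCase;
    var table = new Dictionary<string, string>(comparer);

    foreach (var lines in files)                       // files in declaration order
    {
        foreach (var line in lines)
        {
            string[] parts = line.Split('\t', 2);      // key<TAB>value
            if (parts.Length == 2) table[parts[0]] = parts[1];   // a later file overrides an earlier one
        }
    }

    foreach (var pair in inline) table[pair.Key] = pair.Value;   // inline mappings override files
    return table;
}

var files = new[]
{
    new[] { "Smith\tBrown" },
    new[] { "smith\tGreen", "Doe\tRoe" },
};
var inline = new Dictionary<string, string> { ["Smith"] = "Jones" };
var merged = BuildTable(files, inline);
Console.WriteLine(merged["SMITH"]);   // Jones
Console.WriteLine(merged["doe"]);     // Roe
```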

new SurnameFilterStrategy
{
    Strategy = "MAP_REPLACE",
    Mappings = new Dictionary<string, string> { ["Smith"] = "Jones" },
    MappingFiles = new List<string> { "/etc/phileas/surnames.tsv" },
    CaseSensitive = false,
    Generator = "local",          // name of a generator in the policy's generators block
    FallbackStrategy = "REDACT"
}

Generated values are routed through the same context-scoped cache as RANDOM_REPLACE: when replacementScope is CONTEXT, a repeated token in the same context reuses its first replacement and the generator is not called again. With replacementScope DOCUMENT (the default), each occurrence is generated independently.

Generators

A generator is declared once in the policy's top-level generators block and referenced by name from a MAP_REPLACE strategy's generator property. Generators target a local model endpoint inside your deployment boundary so detected values are not sent to a third party. The ollama type calls a local Ollama-compatible /api/generate endpoint.

{
  "generators": {
    "local": {
      "type": "ollama",
      "endpoint": "http://localhost:11434",
      "model": "llama3.1",
      "prompt": "Rewrite {{token}} as a different but structurally similar value. Return only the value.",
      "timeoutMs": 2000
    }
  }
}

The prompt template supports the {{token}} placeholder (the detected value) and {{label}} (its entity label). timeoutMs is required so a generator can never block the pipeline: on timeout the strategy applies its fallbackStrategy. A generator name that does not resolve to a defined generator is ignored, and the strategy uses its fallback.
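Placeholder substitution in the prompt template might look like the sketch below. RenderPrompt is a hypothetical helper; it substitutes in a single pass so that a token containing placeholder-like text is not expanded a second time.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: fill {{token}} and {{label}} in one pass over the template.
static string RenderPrompt(string template, string token, string label) =>
    Regex.Replace(template, @"\{\{(token|label)\}\}",
        match => match.Groups[1].Value == "token" ? token : label);

Console.WriteLine(RenderPrompt("Rewrite {{token}} ({{label}}). Return only the value.", "Smith", "surname"));
// Rewrite Smith (surname). Return only the value.
```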

TRUNCATE

Keeps only the first character of the token.

new EmailAddressFilterStrategy { Strategy = "TRUNCATE" }

SHIFT_DATE

Date filters only. Applies to DateFilterStrategy; ignored by all other filter types.

Shifts a detected date forward or backward by a configurable number of days, months, and/or years while preserving the original date format. All three offsets default to 0 and can be combined freely. Negative values shift the date into the past.

Supported date formats

Example	Format
`1/15/1990`	Numeric M/D/YYYY
`January 15, 1990`	Full month name
`15-Jan-1990`	Day-abbreviated-month-year
`Jan 15, 1990`	Abbreviated month name

If the detected token cannot be parsed as a date, or if the Date filter type is not active, SHIFT_DATE falls back to REDACT.

Properties

Property	JSON key	Type	Default	Description
`Days`	`days`	`int`	`0`	Days to add (negative to subtract)
`Months`	`months`	`int`	`0`	Months to add (negative to subtract)
`Years`	`years`	`int`	`0`	Years to add (negative to subtract)

C# example

using System.Collections.Generic;
using Phileas.Policy.Filters;
using Phileas.Policy.Filters.Strategies;

var policy = new Policy
{
    Name = "date-shift-policy",
    Identifiers = new Identifiers
    {
        Date = new Date
        {
            Strategies = new List<DateFilterStrategy>
            {
                new DateFilterStrategy
                {
                    Strategy = "SHIFT_DATE",
                    Years  = -1,
                    Days   = 14
                }
            }
        }
    }
};

JSON policy example

{
  "name": "date-shift-policy",
  "identifiers": {
    "date": {
      "dateFilterStrategies": [{
        "strategy": "SHIFT_DATE",
        "years": -1,
        "days": 14
      }]
    }
  }
}

Example transformation

Input	Output
`DOB: January 15, 1990`	`DOB: January 29, 1989`
`Admitted: 3/1/2024`	`Admitted: 3/15/2023`
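The transformations above can be reproduced with standard .NET date parsing. This is a sketch, not the library's implementation: ShiftDate is hypothetical, the four format strings are an assumed mapping of the supported formats table, and applying years, then months, then days is an assumed order.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical helper: parse with the first matching format, shift, and re-emit in that format.
static string? ShiftDate(string token, int years = 0, int months = 0, int days = 0)
{
    string[] formats = { "M/d/yyyy", "MMMM d, yyyy", "d-MMM-yyyy", "MMM d, yyyy" };
    foreach (string format in formats)
    {
        if (DateTime.TryParseExact(token, format, CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out DateTime date))
        {
            DateTime shifted = date.AddYears(years).AddMonths(months).AddDays(days);
            return shifted.ToString(format, CultureInfo.InvariantCulture);
        }
    }
    return null;   // unparseable: the caller would fall back to REDACT
}

Console.WriteLine(ShiftDate("January 15, 1990", years: -1, days: 14));   // January 29, 1989
Console.WriteLine(ShiftDate("3/1/2024", years: -1, days: 14));           // 3/15/2023
```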

Salting

Any strategy can optionally append a random 16-byte Base64 salt to the token before processing by setting salt: true. The generated salt is included in the Span.Salt field of the result so it can be recorded for auditing or reproduction.

new SsnFilterStrategy
{
    Strategy = "HASH_SHA256_REPLACE",
    Salt = true
}
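Generating and appending such a salt can be sketched as follows. AppendSalt is a hypothetical helper; it returns the salt alongside the salted token, mirroring how the generated salt is surfaced in Span.Salt.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: append a random 16-byte salt, Base64-encoded, to the token.
static (string Salted, string Salt) AppendSalt(string token)
{
    string salt = Convert.ToBase64String(RandomNumberGenerator.GetBytes(16));
    return (token + salt, salt);
}

var (salted, salt) = AppendSalt("123-45-6789");
Console.WriteLine(salt.Length);                             // 24
Console.WriteLine(salted.StartsWith("123-45-6789"));        // True
```

Sixteen random bytes always encode to 24 Base64 characters, so the original token can be recovered from the salted value by removing the recorded salt.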

Redaction Bar Color

Any strategy can set an optional color that controls the color of the bar drawn over the spans it redacts when the output is a PDF or image. It overrides the policy-wide config.pdf.redactionColor for those spans; when unset, the policy-wide color (default black) applies. color has no effect on text redaction.

new SsnFilterStrategy
{
    Strategy = "REDACT",
    Color = "red"
}

Accepted values are a named color (black, white, red, orange, yellow, green, blue, gray) or a 6-digit hex string matching ^#[0-9A-Fa-f]{6}$ (for example #ff8800). An unrecognized or malformed value renders as black, so a detected span is never left un-redacted. Because color overrides the policy-wide color, a malformed strategy color renders black rather than falling back to config.pdf.redactionColor.
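The validation rule can be sketched as below. ResolveColor is a hypothetical helper, and matching named colors in lower case only is an assumption, since this document does not say whether names are case-sensitive.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: return the color if it is a known name or 6-digit hex value, else black.
static string ResolveColor(string color)
{
    string[] named = { "black", "white", "red", "orange", "yellow", "green", "blue", "gray" };
    bool valid = !string.IsNullOrEmpty(color)
        && (Array.IndexOf(named, color) >= 0 || Regex.IsMatch(color, "^#[0-9A-Fa-f]{6}$"));
    return valid ? color : "black";   // malformed values render black, never un-redacted
}

Console.WriteLine(ResolveColor("#ff8800"));   // #ff8800
Console.WriteLine(ResolveColor("fuchsia"));   // black
```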

Combined with strategy conditions, this colors spans by detection confidence or any other condition field:

"creditCardFilterStrategies": [
  { "strategy": "REDACT", "color": "green",  "condition": "confidence >= 0.9" },
  { "strategy": "REDACT", "color": "orange", "condition": "confidence < 0.9" }
]

Strategy Conditions

Strategies can include a condition property that controls when they are applied. When multiple strategies are defined, phileas-dotnet evaluates their conditions in order and applies the first strategy whose condition evaluates to true.

new EmailAddressFilterStrategy
{
    Strategy = "MASK",
    Condition = "confidence > 0.8 and context == \"internal\""
}

Supported condition fields:

confidence - Detection confidence (0.0 to 1.0)
context - Context name passed to FilterService.Filter()
token - The detected text value
type - Classification type (e.g., "PER", "LOC")
population - Census population of a detected ZIP code

Supported operators:

Comparison: ==, !=, >, <, >=, <=, is, is not
String: startswith
Logical: and

See Filter Conditions for detailed examples and usage patterns.
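The first-match rule can be sketched with conditions modeled as plain delegates over the confidence value. SelectStrategy is a hypothetical helper, not the library's condition parser, and returning REDACT when no condition matches is an assumption extrapolated from the empty-list default.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: return the first strategy whose condition holds.
static string SelectStrategy(
    IEnumerable<(string Strategy, Func<double, bool> Condition)> strategies,
    double confidence)
{
    foreach (var (strategy, condition) in strategies)
    {
        if (condition(confidence)) return strategy;   // order matters: first match wins
    }
    return "REDACT";   // assumed default when nothing matches or the list is empty
}

var strategies = new (string Strategy, Func<double, bool> Condition)[]
{
    ("MASK",   c => c > 0.8),
    ("REDACT", c => true),    // unconditional catch-all
};
Console.WriteLine(SelectStrategy(strategies, 0.95));   // MASK
Console.WriteLine(SelectStrategy(strategies, 0.40));   // REDACT
```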

Configuring Strategies Per Identifier

Each identifier type has a corresponding strategy class (e.g. SsnFilterStrategy, EmailAddressFilterStrategy). Set the Strategies list on the identifier:

var policy = new Policy
{
    Name = "multi-strategy",
    Identifiers = new Identifiers
    {
        Ssn = new Ssn
        {
            Strategies = new List<SsnFilterStrategy>
            {
                new SsnFilterStrategy { Strategy = "MASK" }
            }
        },
        EmailAddress = new EmailAddress
        {
            Strategies = new List<EmailAddressFilterStrategy>
            {
                new EmailAddressFilterStrategy { Strategy = "HASH_SHA256_REPLACE" }
            }
        },
        PhoneNumber = new PhoneNumber
        {
            Strategies = new List<PhoneNumberFilterStrategy>
            {
                new PhoneNumberFilterStrategy { Strategy = "LAST_4" }
            }
        }
    }
};