Policies

A Policy is the primary configuration object in phileas-dotnet. It defines which PII types to detect, how to handle each type, global settings, and values that should never be redacted.

Policy Structure

public class Policy
{
    public string Name { get; set; }                       // in-memory label only (not serialized)
    public Config Config { get; set; }
    public Crypto? Crypto { get; set; }
    public Fpe? Fpe { get; set; }
    public Identifiers Identifiers { get; set; }
    public List<Ignored> Ignored { get; set; }
    public List<IgnoredPattern> IgnoredPatterns { get; set; }
    public Graphical Graphical { get; set; }
}

The canonical policy JSON has no top-level name: Name is an in-memory convenience label and is marked [JsonIgnore]. Use PolicySerializer to load and save policies; it applies the canonical options (null fields omitted) and resolves ${ENV_VAR} / env:NAME placeholders.

Equivalent JSON

{
  "identifiers": {
    "ssn": {},
    "emailAddress": {}
  }
}

Config

Config holds global settings, grouped into sub-objects.

| Property | Type | JSON key | Description |
| --- | --- | --- | --- |
| Splitting | Splitting | splitting | Splits long input into pieces before filtering. |
| Pdf | Pdf | pdf | PDF redaction rendering options. |
| PostFilters | PostFilters | postFilters | Cleanup applied to replacements (see PostFilters). |
| Analysis | Analysis | analysis | Analysis options. |

Splitting

| Property | JSON key | Default | Description |
| --- | --- | --- | --- |
| Enabled | enabled | false | Enable splitting of long inputs. |
| Threshold | threshold | 10000 | Minimum input length (characters) before splitting applies. |
| Method | method | "newline" | Split method (e.g. "newline"). |

var policy = new Policy
{
    Name = "my-policy",
    Config = new Config
    {
        Splitting = new Splitting { Enabled = true, Threshold = 5000, Method = "newline" }
    }
};
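
For comparison, the same splitting settings might look like this in policy JSON. The splitting, enabled, threshold, and method keys come from the tables above; the top-level config key is an assumption based on the camelCase naming used for identifiers, so confirm it against a policy written by PolicySerializer.

```json
{
  "config": {
    "splitting": {
      "enabled": true,
      "threshold": 5000,
      "method": "newline"
    }
  }
}
```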

The per-filter context window size is configured on each identifier via WindowSize (see Common Options), not on Config.


Crypto

Crypto provides the AES key used by the CRYPTO_REPLACE filter strategy, which encrypts with AES-GCM.

| Property | Type | JSON key | Description |
| --- | --- | --- | --- |
| Key | string? | key | Hex-encoded 16, 24, or 32-byte AES key, or an env:NAME reference. |
| Iv | string? | iv | Present in the model but unused by AES-GCM, which generates a fresh random nonce per value. |

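using System;
using System.Security.Cryptography;

// Generate a random 32-byte (256-bit) key. Store it securely:
// the same key is needed for any later decryption.
byte[] aesKey = RandomNumberGenerator.GetBytes(32);

var policy = new Policy
{
    Name = "encrypted-policy",
    Crypto = new Crypto
    {
        Key = Convert.ToHexString(aesKey)   // hex-encoded AES key
    }
};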

See Filter Strategies — CRYPTO_REPLACE for usage.


Fpe

Fpe provides the key and tweak used by the FPE_ENCRYPT_REPLACE (Format Preserving Encryption, FF3-1) strategy.

| Property | Type | JSON key | Description |
| --- | --- | --- | --- |
| Key | string? | key | Hex-encoded FF3-1 key, or an env:NAME reference. |
| Tweak | string? | tweak | Hex-encoded tweak (required by FF3-1; 56- or 64-bit), or an env:NAME reference. |
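
As a minimal sketch, an Fpe block can keep both secrets out of the policy file by using env:NAME references. The key and tweak keys come from the table above; the top-level fpe key and the environment variable names PHILEAS_FPE_KEY and PHILEAS_FPE_TWEAK are assumptions chosen for illustration.

```json
{
  "fpe": {
    "key": "env:PHILEAS_FPE_KEY",
    "tweak": "env:PHILEAS_FPE_TWEAK"
  },
  "identifiers": {
    "ssn": {}
  }
}
```

Loading this through PolicySerializer should substitute the environment values, since placeholder resolution is described as part of deserialization.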

See Filter Strategies — FPE_ENCRYPT_REPLACE.


Identifiers

Identifiers lists which PII types the policy should detect. Set the corresponding property to a non-null value to enable that filter. See Supported Identifiers for the full list.

var identifiers = new Identifiers
{
    Ssn           = new Ssn(),
    EmailAddress  = new EmailAddress(),
    PhoneNumber   = new PhoneNumber()
};

Common Identifier Options

Each identifier class extends AbstractPolicyFilter and supports these common options:

| Property | JSON key | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Enabled | enabled | bool | true | Whether the filter is active. |
| Ignored | ignored | List<string>? | null | Exact values that should not be redacted. |
| IgnoredFiles | ignoredFiles | List<string>? | null | Files whose lines provide additional ignored terms. |
| IgnoredPatterns | ignoredPatterns | List<IgnoredPattern>? | null | Regex patterns whose matches are not redacted. |
| WindowSize | windowSize | int | 0 | Context words on each side of a match; 0 uses the default (5). |
| Priority | priority | int | 0 | Higher-priority filter spans win when spans overlap. |
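
To show how these options combine, here is a sketch of one identifier in policy JSON using the keys from the table above. The values (the ignored address, a window of 3 words, a priority of 10) are illustrative only.

```json
{
  "identifiers": {
    "emailAddress": {
      "enabled": true,
      "ignored": ["no-reply@example.com"],
      "ignoredPatterns": [
        {
          "name": "internal-emails",
          "pattern": "^[\\w.+-]+@internal\\.corp$",
          "caseSensitive": false
        }
      ],
      "windowSize": 3,
      "priority": 10
    }
  }
}
```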

Custom Identifiers

The identifiers.identifiers array holds custom regex-based identifiers. Each detects values with a user-supplied pattern.

| Property | JSON key | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Classification | classification | string | custom-identifier | Label applied to matches (used as the filter type). |
| Pattern | pattern | string | \b[A-Z0-9_-]{6,}\b | The regular expression to match. |
| CaseSensitive | caseSensitive | bool | true | Whether matching is case-sensitive. |
| GroupNumber | groupNumber | int | 0 | The capture group to extract as the matched value (0 is the whole match). |
| Validator | validator | string or object | null | An optional named, post-match validator (see below). |
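
A custom identifier entry might look like the following sketch. The loyalty-card classification and the LC- prefix pattern are hypothetical; groupNumber is set to 1 so that only the eight digits inside the parentheses are taken as the matched value, not the prefix.

```json
{
  "identifiers": {
    "identifiers": [
      {
        "classification": "loyalty-card",
        "pattern": "\\bLC-(\\d{8})\\b",
        "caseSensitive": true,
        "groupNumber": 1
      }
    ]
  }
}
```

Note that backslashes in the regular expression are doubled because of JSON string escaping.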

Validators

A regular expression matches a format, not a valid value. The optional validator runs a named, built-in check on each match and keeps the match only if the check passes, so a generic identifier can reject format-valid but checksum-invalid values without embedding executable code in the policy.

The validator may be written as a string, or as an object when it takes parameters:

"validator": "luhn"
"validator": { "name": "mod11", "params": { "variant": "cpf" } }

An unknown or not-yet-implemented validator name is a policy error: the filter raises an error rather than silently skipping the check. The available validators produce the same results as the Phileas (Java) implementation for the same input.

| Validator | Parameters | Description |
| --- | --- | --- |
| luhn | none | Standard mod-10 Luhn checksum over the digits of the match (separators ignored). |
| mod11 | variant: cpf or cnpj | Weighted-sum mod-11 check digits for the Brazilian CPF and CNPJ. |
| mod97 | variant: nir or iban; substitutions (nir) | Control from a value mod 97: the French INSEE/NIR (with Corsica substitutions) or an IBAN (MOD-97-10). |
| mod23-letter | substitutions | Control letter from a 23-entry table, for the Spanish DNI and NIE (leading X/Y/Z substitution). |
| es-cif | none | Spanish CIF control character (digit or letter). |
| de-steuerid | none | German tax ID (Steuer-ID): digit-repetition rule plus ISO/IEC 7064 MOD 11,10 check digit. |
| de-personalausweis | none | German ID card number: ICAO 9303 7-3-1 check digit. |
| bic-structural | none | SWIFT/BIC structure (ISO 9362) with a valid ISO 3166 country segment. |
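
To make the luhn row concrete, here is a minimal, self-contained sketch of a mod-10 Luhn check that ignores separators. The LuhnSketch class is hypothetical and is not the phileas-dotnet implementation; treating a match with no digits as invalid is an assumption made for this sketch.

```csharp
using System;
using System.Linq;

// Illustrative sketch only; not the library's own validator.
public static class LuhnSketch
{
    // Returns true when the ASCII digits in `value` pass the mod-10 Luhn check.
    // Any other character (spaces, hyphens, letters) is skipped as a separator.
    public static bool IsValid(string value)
    {
        int[] digits = value
            .Where(c => c >= '0' && c <= '9')
            .Select(c => c - '0')
            .ToArray();

        // Assumption: a value with no digits cannot be valid.
        if (digits.Length == 0) return false;

        int sum = 0;
        // Walk right to left, doubling every second digit.
        for (int i = digits.Length - 1, pos = 0; i >= 0; i--, pos++)
        {
            int d = digits[i];
            if (pos % 2 == 1)
            {
                d *= 2;
                if (d > 9) d -= 9; // same as summing the two digits of the product
            }
            sum += d;
        }

        return sum % 10 == 0;
    }
}
```

With this sketch, LuhnSketch.IsValid("4111 1111 1111 1111") returns true, while changing the final digit to 2 makes it return false. This is the kind of format-valid but checksum-invalid value a validator is meant to reject.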

Ignored Values

Use the ignored list on an identifier to whitelist specific values:

var policy = new Policy
{
    Name = "policy",
    Identifiers = new Identifiers
    {
        EmailAddress = new EmailAddress
        {
            Ignored = new List<string> { "no-reply@example.com" }
        }
    }
};

Ignored Patterns

Use ignoredPatterns to whitelist tokens matching a regular expression:

var policy = new Policy
{
    Name = "policy",
    Identifiers = new Identifiers
    {
        EmailAddress = new EmailAddress
        {
            IgnoredPatterns = new List<IgnoredPattern>
            {
                new IgnoredPattern
                {
                    Name = "internal-emails",
                    Pattern = @"^[\w.+-]+@internal\.corp$",
                    CaseSensitive = false
                }
            }
        }
    }
};

| Property | JSON key | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Name | name | string? | null | Human-readable name for the pattern. |
| Pattern | pattern | string? | null | Regular expression to match against the detected token. |
| CaseSensitive | caseSensitive | bool | false | Whether the pattern match is case-sensitive. |

Global Ignored Values

The top-level Ignored and IgnoredPatterns lists apply across all identifier types — any span whose text matches is dropped, no matter which filter produced it.

Each Ignored entry is a named set of terms:

| Property | JSON key | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Name | name | string? | null | Optional name for the set. |
| Terms | terms | List<string> | [] | Exact values to ignore. |
| Files | files | List<string> | [] | Files whose lines provide additional ignored terms. |
| CaseSensitive | caseSensitive | bool | false | Whether term comparison is case-sensitive. |

var policy = new Policy
{
    Name = "policy",
    Ignored = new List<Ignored>
    {
        new Ignored
        {
            Name = "test-values",
            Terms = new List<string> { "000-00-0000", "test@example.com" },
            CaseSensitive = false
        }
    }
};

PostFilters

PostFilters (on Config.PostFilters) controls lightweight cleanup applied to each replaced token after the strategy produces a replacement value.

| Property | JSON key | Type | Default | Description |
| --- | --- | --- | --- | --- |
| RemoveTrailingPeriods | removeTrailingPeriods | bool | true | Strip trailing period characters from the replacement. |
| RemoveTrailingSpaces | removeTrailingSpaces | bool | true | Strip trailing whitespace from the replacement. |
| RemoveTrailingNewLines | removeTrailingNewLines | bool | true | Strip trailing newline characters from the replacement. |

var policy = new Policy
{
    Name = "my-policy",
    Config = new Config
    {
        PostFilters = new PostFilters
        {
            RemoveTrailingNewLines = true,
            RemoveTrailingPeriods  = false,
            RemoveTrailingSpaces   = true
        }
    }
};

Serializing Policies

Use PolicySerializer to convert policies to and from JSON:

using Phileas.Policy;

string json = PolicySerializer.SerializeToJson(policy);
Policy loaded = PolicySerializer.DeserializeFromJson(json);

PolicySerializer omits null fields (matching the canonical schema) and resolves ${ENV_VAR} / env:NAME placeholders from environment variables during deserialization. Policies can also be authored in PhiSQL and compiled with Policy.FromPhiSQL(phisql).


Example: Full Policy

var policy = new Policy
{
    Name = "full-example",
    Config = new Config
    {
        Splitting = new Splitting { Enabled = true, Threshold = 5000 }
    },
    Identifiers = new Identifiers
    {
        Ssn = new Ssn
        {
            Strategies = new List<SsnFilterStrategy>
            {
                new SsnFilterStrategy { Strategy = "MASK" }
            },
            Ignored = new List<string> { "000-00-0000" }
        },
        EmailAddress = new EmailAddress
        {
            Strategies = new List<EmailAddressFilterStrategy>
            {
                new EmailAddressFilterStrategy { Strategy = "HASH_SHA256_REPLACE" }
            }
        }
    }
};