Policies
A Policy is the primary configuration object in phileas-dotnet. It defines which PII types to detect, how to handle each type, global settings, and values that should never be redacted.
Policy Structure
public class Policy
{
    public string Name { get; set; } // in-memory label only (not serialized)
    public Config Config { get; set; }
    public Crypto? Crypto { get; set; }
    public Fpe? Fpe { get; set; }
    public Identifiers Identifiers { get; set; }
    public List<Ignored> Ignored { get; set; }
    public List<IgnoredPattern> IgnoredPatterns { get; set; }
    public Graphical Graphical { get; set; }
}
The canonical policy JSON has no top-level name; Name is an in-memory convenience label and is marked [JsonIgnore]. Use PolicySerializer to load and save policies; it applies the canonical options (null fields omitted) and resolves ${ENV_VAR} / env:NAME placeholders.
Equivalent JSON
{
  "identifiers": {
    "ssn": {},
    "emailAddress": {}
  }
}
Config
Config holds global settings, grouped into sub-objects.
| Property | Type | JSON key | Description |
|---|---|---|---|
| Splitting | Splitting | splitting | Splits long input into pieces before filtering. |
| Pdf | Pdf | pdf | PDF redaction rendering options. |
| PostFilters | PostFilters | postFilters | Cleanup applied to replacements (see PostFilters). |
| Analysis | Analysis | analysis | Analysis options. |
Splitting
| Property | JSON key | Default | Description |
|---|---|---|---|
| Enabled | enabled | false | Enable splitting of long inputs. |
| Threshold | threshold | 10000 | Minimum input length (characters) before splitting applies. |
| Method | method | "newline" | Split method (e.g. "newline"). |
var policy = new Policy
{
    Name = "my-policy",
    Config = new Config
    {
        Splitting = new Splitting { Enabled = true, Threshold = 5000, Method = "newline" }
    }
};
The per-filter context window size is configured on each identifier via WindowSize (see Common Options), not on Config.
Crypto
Crypto provides the AES key used by the CRYPTO_REPLACE filter strategy, which encrypts with AES-GCM.
| Property | Type | JSON key | Description |
|---|---|---|---|
| Key | string? | key | Hex-encoded 16, 24, or 32-byte AES key, or an env:NAME reference. |
| Iv | string? | iv | Present in the model but unused by AES-GCM, which generates a fresh random nonce per value. |
var policy = new Policy
{
    Name = "encrypted-policy",
    Crypto = new Crypto
    {
        // aesKey is a byte[] of 16, 24, or 32 bytes supplied by the caller
        Key = Convert.ToHexString(aesKey) // hex-encoded AES key
    }
};
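The example above assumes an existing aesKey byte array. As a minimal sketch of one way to produce such a key, the helper below draws random bytes from the .NET cryptographic random number generator and hex-encodes them. AesKeySketch and NewHexKey are hypothetical names used for illustration and are not part of phileas-dotnet; RandomNumberGenerator.GetBytes(int) requires .NET 6 or later.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper for illustration; not part of phileas-dotnet.
public static class AesKeySketch
{
    // Returns a hex-encoded random AES key of 16, 24, or 32 bytes.
    public static string NewHexKey(int keySizeBytes = 32)
    {
        if (keySizeBytes != 16 && keySizeBytes != 24 && keySizeBytes != 32)
            throw new ArgumentOutOfRangeException(nameof(keySizeBytes), "AES keys are 16, 24, or 32 bytes.");

        // Cryptographically strong random bytes (.NET 6+).
        byte[] key = RandomNumberGenerator.GetBytes(keySizeBytes);

        // Two hex characters per byte, matching the hex-encoded form of Crypto.Key.
        return Convert.ToHexString(key);
    }
}
```

The returned string can be assigned to Crypto.Key directly, or stored in an environment variable and referenced from the policy as env:NAME so that the key itself stays out of the policy file.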
See Filter Strategies — CRYPTO_REPLACE for usage.
Fpe
Fpe provides the key and tweak used by the FPE_ENCRYPT_REPLACE (Format Preserving Encryption, FF3-1) strategy.
| Property | Type | JSON key | Description |
|---|---|---|---|
| Key | string? | key | Hex-encoded FF3-1 key, or an env:NAME reference. |
| Tweak | string? | tweak | Hex-encoded tweak (required by FF3-1; 56- or 64-bit), or an env:NAME reference. |
See Filter Strategies — FPE_ENCRYPT_REPLACE.
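Because the tweak is hex-encoded, a 56-bit tweak is 14 hex characters and a 64-bit tweak is 16. The sketch below shows one way to sanity-check a literal tweak value before putting it in a policy. FpeTweakSketch and IsValidTweakHex are hypothetical names, not part of phileas-dotnet, and the check does not apply to env:NAME references, which would have to be resolved first.

```csharp
using System;

// Hypothetical helper for illustration; not part of phileas-dotnet.
public static class FpeTweakSketch
{
    // True when the hex string decodes to 7 bytes (56 bits) or 8 bytes (64 bits).
    public static bool IsValidTweakHex(string hex)
    {
        try
        {
            byte[] bytes = Convert.FromHexString(hex);
            return bytes.Length == 7 || bytes.Length == 8;
        }
        catch (FormatException)
        {
            // Odd length or non-hex characters.
            return false;
        }
    }
}
```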
Identifiers
Identifiers lists which PII types the policy should detect. Set the corresponding property to a non-null value to enable that filter. See Supported Identifiers for the full list.
var identifiers = new Identifiers
{
    Ssn = new Ssn(),
    EmailAddress = new EmailAddress(),
    PhoneNumber = new PhoneNumber()
};
Common Identifier Options
Each identifier class extends AbstractPolicyFilter and supports these common options:
| Property | JSON key | Type | Default | Description |
|---|---|---|---|---|
| Enabled | enabled | bool | true | Whether the filter is active. |
| Ignored | ignored | List<string>? | null | Exact values that should not be redacted. |
| IgnoredFiles | ignoredFiles | List<string>? | null | Files whose lines provide additional ignored terms. |
| IgnoredPatterns | ignoredPatterns | List<IgnoredPattern>? | null | Regex patterns whose matches are not redacted. |
| WindowSize | windowSize | int | 0 | Context words on each side of a match; 0 uses the default (5). |
| Priority | priority | int | 0 | Higher-priority filter spans win when spans overlap. |
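The Priority rule can be pictured with a small standalone sketch. This is not the library's implementation: SpanSketch and OverlapSketch are hypothetical types, spans are assumed to be half-open [Start, End) ranges, and ties are assumed to go to whichever span appears first in the input.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical span type for illustration; not the phileas-dotnet span class.
public record SpanSketch(int Start, int End, int Priority, string FilterType);

public static class OverlapSketch
{
    // Keeps the highest-priority span from each group of overlapping spans.
    public static List<SpanSketch> Resolve(IEnumerable<SpanSketch> spans)
    {
        var kept = new List<SpanSketch>();

        // OrderByDescending is stable, so equal priorities keep input order.
        foreach (var span in spans.OrderByDescending(s => s.Priority))
        {
            bool overlapsKept = kept.Any(k => span.Start < k.End && k.Start < span.End);
            if (!overlapsKept)
                kept.Add(span);
        }

        // Return the survivors in document order.
        return kept.OrderBy(s => s.Start).ToList();
    }
}
```

With a phone-number span and an SSN span covering the same characters, giving the SSN filter the higher Priority means only the SSN span survives in this sketch.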
Custom Identifiers
The identifiers.identifiers array holds custom regex-based identifiers. Each detects values with a user-supplied pattern.
| Property | JSON key | Type | Default | Description |
|---|---|---|---|---|
| Classification | classification | string | custom-identifier | Label applied to matches (used as the filter type). |
| Pattern | pattern | string | \b[A-Z0-9_-]{6,}\b | The regular expression to match. |
| CaseSensitive | caseSensitive | bool | true | Whether matching is case-sensitive. |
| GroupNumber | groupNumber | int | 0 | The capture group to extract as the matched value (0 is the whole match). |
| Validator | validator | string or object | null | An optional named, post-match validator (see below). |
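As an illustration of how these keys fit together, the fragment below sketches a policy with one custom identifier. The classification value, the pattern, and the ACCT- prefix are made up for the example; only the key names come from the table above. Backslashes in the pattern are doubled because the value is a JSON string.

```json
{
  "identifiers": {
    "identifiers": [
      {
        "classification": "account-number",
        "pattern": "\\bACCT-(\\d{8})\\b",
        "caseSensitive": true,
        "groupNumber": 1,
        "validator": "luhn"
      }
    ]
  }
}
```

Here groupNumber 1 selects the eight digits inside the parentheses as the matched value rather than the whole ACCT-prefixed token.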
Validators
A regular expression matches a format, not a valid value. The optional validator runs a named, built-in check on each match and keeps the match only if the check passes, so a generic identifier can reject format-valid but checksum-invalid values without embedding executable code in the policy.
The validator may be written as a string, or as an object when it takes parameters:
"validator": "luhn"
"validator": { "name": "mod11", "params": { "variant": "cpf" } }
An unknown or not-yet-implemented validator name is a policy error: the filter raises an error rather than silently skipping the check. For the same input, the available validators give the same result as the Phileas (Java) implementation.
| Validator | Parameters | Description |
|---|---|---|
| luhn | none | Standard mod-10 Luhn checksum over the digits of the match (separators ignored). |
| mod11 | variant: cpf or cnpj | Weighted-sum mod-11 check digits for the Brazilian CPF and CNPJ. |
| mod97 | variant: nir or iban; substitutions (nir) | Control from a value mod 97: the French INSEE/NIR (with Corsica substitutions) or an IBAN (MOD-97-10). |
| mod23-letter | substitutions | Control letter from a 23-entry table, for the Spanish DNI and NIE (leading X/Y/Z substitution). |
| es-cif | none | Spanish CIF control character (digit or letter). |
| de-steuerid | none | German tax ID (Steuer-ID): digit-repetition rule plus ISO/IEC 7064 MOD 11,10 check digit. |
| de-personalausweis | none | German ID card number: ICAO 9303 7-3-1 check digit. |
| bic-structural | none | SWIFT/BIC structure (ISO 9362) with a valid ISO 3166 country segment. |
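To make two of these checks concrete, here is a standalone sketch of the standard Luhn and IBAN MOD-97-10 algorithms. It is not the library's code: ValidatorSketch and its methods are hypothetical names, and details such as which separator characters are skipped are assumptions.

```csharp
using System;

// Hypothetical helpers for illustration; not the phileas-dotnet validators.
public static class ValidatorSketch
{
    // Luhn: from the right, double every second digit, subtract 9 from
    // results above 9, and require the total to be divisible by 10.
    public static bool Luhn(string value)
    {
        int sum = 0, digitCount = 0;
        bool doubleNext = false;
        for (int i = value.Length - 1; i >= 0; i--)
        {
            char c = value[i];
            if (c < '0' || c > '9') continue; // skip separators (assumed: any non-digit)
            int d = c - '0';
            if (doubleNext) { d *= 2; if (d > 9) d -= 9; }
            sum += d;
            doubleNext = !doubleNext;
            digitCount++;
        }
        return digitCount > 0 && sum % 10 == 0;
    }

    // IBAN MOD-97-10: move the first four characters to the end, map
    // A..Z to 10..35, and require the resulting number mod 97 to equal 1.
    public static bool Mod97Iban(string iban)
    {
        string s = iban.Replace(" ", "").ToUpperInvariant();
        if (s.Length < 5) return false;
        string rearranged = s.Substring(4) + s.Substring(0, 4);

        int remainder = 0; // running remainder avoids big-integer arithmetic
        foreach (char c in rearranged)
        {
            if (c >= '0' && c <= '9') remainder = (remainder * 10 + (c - '0')) % 97;
            else if (c >= 'A' && c <= 'Z') remainder = (remainder * 100 + (c - 'A' + 10)) % 97;
            else return false;
        }
        return remainder == 1;
    }
}
```

Both checks reject values that a format-only regular expression would accept, which is the gap the validator option is meant to close.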
Ignored Values
Use the ignored list on an identifier to whitelist specific values:
var policy = new Policy
{
    Name = "policy",
    Identifiers = new Identifiers
    {
        EmailAddress = new EmailAddress
        {
            Ignored = new List<string> { "no-reply@example.com" }
        }
    }
};
Ignored Patterns
Use ignoredPatterns to whitelist tokens matching a regular expression:
var policy = new Policy
{
    Name = "policy",
    Identifiers = new Identifiers
    {
        EmailAddress = new EmailAddress
        {
            IgnoredPatterns = new List<IgnoredPattern>
            {
                new IgnoredPattern
                {
                    Name = "internal-emails",
                    Pattern = @"^[\w.+-]+@internal\.corp$",
                    CaseSensitive = false
                }
            }
        }
    }
};
| Property | JSON key | Type | Default | Description |
|---|---|---|---|---|
| Name | name | string? | null | Human-readable name for the pattern. |
| Pattern | pattern | string? | null | Regular expression to match against the detected token. |
| CaseSensitive | caseSensitive | bool | false | Whether the pattern match is case-sensitive. |
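The effect of CaseSensitive can be sketched with the standard .NET regex API. This is an illustration only: IgnoredPatternSketch is a hypothetical helper, and whether the library requires a full match or accepts a partial one is not stated here, so the sketch relies on the anchors in the pattern itself.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of phileas-dotnet.
public static class IgnoredPatternSketch
{
    // True when the detected token matches the ignore pattern.
    public static bool IsIgnored(string token, string pattern, bool caseSensitive)
    {
        // CaseSensitive = false maps to RegexOptions.IgnoreCase.
        RegexOptions options = caseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;
        return Regex.IsMatch(token, pattern, options);
    }
}
```

With the internal-emails pattern above, a token such as Alice@INTERNAL.corp is ignored when caseSensitive is false and is not ignored when it is true.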
Global Ignored Values
The top-level Ignored and IgnoredPatterns lists apply across all identifier types — any span whose text matches is dropped, no matter which filter produced it.
Each Ignored entry is a named set of terms:
| Property | JSON key | Type | Default | Description |
|---|---|---|---|---|
| Name | name | string? | null | Optional name for the set. |
| Terms | terms | List<string> | [] | Exact values to ignore. |
| Files | files | List<string> | [] | Files whose lines provide additional ignored terms. |
| CaseSensitive | caseSensitive | bool | false | Whether term comparison is case-sensitive. |
var policy = new Policy
{
    Name = "policy",
    Ignored = new List<Ignored>
    {
        new Ignored
        {
            Name = "test-values",
            Terms = new List<string> { "000-00-0000", "test@example.com" },
            CaseSensitive = false
        }
    }
};
PostFilters
PostFilters (on Config.PostFilters) controls lightweight cleanup applied to each replaced token after the strategy produces a replacement value.
| Property | JSON key | Type | Default | Description |
|---|---|---|---|---|
| RemoveTrailingPeriods | removeTrailingPeriods | bool | true | Strip trailing period characters from the replacement. |
| RemoveTrailingSpaces | removeTrailingSpaces | bool | true | Strip trailing whitespace from the replacement. |
| RemoveTrailingNewLines | removeTrailingNewLines | bool | true | Strip trailing newline characters from the replacement. |
var policy = new Policy
{
    Name = "my-policy",
    Config = new Config
    {
        PostFilters = new PostFilters
        {
            RemoveTrailingNewLines = true,
            RemoveTrailingPeriods = false,
            RemoveTrailingSpaces = true
        }
    }
};
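As a rough picture of what the three flags do to a replacement string, consider the standalone sketch below. It is not the library's implementation: PostFilterSketch is a hypothetical helper, the flags are assumed to be applied together in a single pass, and "spaces" is assumed to cover space and tab characters.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of phileas-dotnet.
public static class PostFilterSketch
{
    // Strips the enabled kinds of trailing characters from a replacement value.
    public static string Apply(
        string replacement,
        bool removeTrailingPeriods = true,
        bool removeTrailingSpaces = true,
        bool removeTrailingNewLines = true)
    {
        var trimChars = new List<char>();
        if (removeTrailingPeriods) trimChars.Add('.');
        if (removeTrailingSpaces) { trimChars.Add(' '); trimChars.Add('\t'); }
        if (removeTrailingNewLines) { trimChars.Add('\r'); trimChars.Add('\n'); }

        // TrimEnd with an empty array would trim all whitespace, so return early.
        if (trimChars.Count == 0) return replacement;
        return replacement.TrimEnd(trimChars.ToArray());
    }
}
```

With the settings from the policy above (periods kept, spaces and newlines removed), a value ending in a period followed by a space and a newline would keep its final period in this sketch.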
Serializing Policies
Use PolicySerializer to convert policies to and from JSON:
using Phileas.Policy;
string json = PolicySerializer.SerializeToJson(policy);
Policy loaded = PolicySerializer.DeserializeFromJson(json);
PolicySerializer omits null fields (matching the canonical schema) and resolves ${ENV_VAR} / env:NAME placeholders from environment variables during deserialization. Policies can also be authored in PhiSQL and compiled with Policy.FromPhiSQL(phisql).
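The two placeholder forms can be illustrated with a small standalone resolver. This is a sketch of the idea, not PolicySerializer's code: PlaceholderSketch is a hypothetical helper, and leaving an unset variable's placeholder untouched is an assumption about behavior this page does not specify.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of phileas-dotnet.
public static class PlaceholderSketch
{
    // Matches ${NAME} where NAME is a typical environment variable name.
    private static readonly Regex Braced = new Regex(@"\$\{([A-Za-z_][A-Za-z0-9_]*)\}");

    // Resolves "env:NAME" (whole value) or "${NAME}" (anywhere in the value).
    // Unset variables leave the placeholder as-is (assumed behavior).
    public static string Resolve(string value, Func<string, string?> lookup)
    {
        if (value.StartsWith("env:", StringComparison.Ordinal))
            return lookup(value.Substring(4)) ?? value;

        return Braced.Replace(value, m => lookup(m.Groups[1].Value) ?? m.Value);
    }

    // Convenience overload that reads from the process environment.
    public static string Resolve(string value) =>
        Resolve(value, Environment.GetEnvironmentVariable);
}
```

Passing the lookup as a delegate keeps the sketch testable without touching real environment variables; the one-argument overload shows how it would read from the process environment.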
Example: Full Policy
var policy = new Policy
{
    Name = "full-example",
    Config = new Config
    {
        Splitting = new Splitting { Enabled = true, Threshold = 5000 }
    },
    Identifiers = new Identifiers
    {
        Ssn = new Ssn
        {
            Strategies = new List<SsnFilterStrategy>
            {
                new SsnFilterStrategy { Strategy = "MASK" }
            },
            Ignored = new List<string> { "000-00-0000" }
        },
        EmailAddress = new EmailAddress
        {
            Strategies = new List<EmailAddressFilterStrategy>
            {
                new EmailAddressFilterStrategy { Strategy = "HASH_SHA256_REPLACE" }
            }
        }
    }
};