Skip to content

Zip Codes

Filter

This filter identifies zip codes in text.

Please note that the information used to map a zip code to a population is derived from publicly available USA census data. While population mapping to zip code may be near the actual numbers it will most likely not be exact. Please use the POPULATION conditional with this in mind.

Required Parameters

This filter has no required parameters.

Optional Parameters

Parameter Description Default Value
zipCodeFilterStrategies A list of filter strategies. None
enabled When set to false, the filter will be disabled and not applied true
ignored A list of terms to be ignored by the filter. None
requireDelimiter When set to false, the filter will not require a dash in 9 digit zip codes, e.g. 12345-6789. Setting to false may increase the number of zip code false positives. true
windowSize Sets the size of the window (in terms) surrounding a span to look for contextual terms. If set, this value overrides the value of span.window.size in the configuration. The value of span.window.size which is by default 5.
priority The priority (integer) of this filter. Valid values are any positive integer, where a higher value indicates a higher priority. Priority is used for tie-breaking when two spans may be otherwise identical. 0
validate When set to true, the database of zip codes from the US census will be checked for the first 5 digits of the zip code. If not found, the span will be marked as not applied. Use this with caution as US zip codes can change frequently and the database included may not be comprehensive. false

Filter Strategies

The filter may have zero or more filter strategies. When no filter strategy is given the default strategy of REDACT is used. When multiple filter strategies are given the filter strategies will be applied in order as they are listed. See Filter Strategies for details.

Strategy Description
REDACT Replace the sensitive text with a placeholder.
RANDOM_REPLACE Replace the sensitive text with a similar, random value.
STATIC_REPLACE Replace the sensitive text with a given value.
CRYPTO_REPLACE Replace the sensitive text with its encrypted value.
HASH_SHA256_REPLACE Replace the sensitive text with its SHA256 hash value.
TRUNCATE Replace the sensitive text by removing everything except x characters. (Set the number of characters to leave using the truncateLeaveCharacters parameter of the filter strategy.)
ZERO_LEADING Replace the sensitive text by zeroing the first 3 digits.

Conditions

Each filter strategy may have one condition. See Conditions for details.

Please note that the information used to map a zip code to a population is derived from publicly available USA census data. While population mapping to zip code may be near the actual numbers it will most likely not be exact. Please use the POPULATION conditional with this in mind.

Conditional Description Operators
TOKEN Compares the value of the sensitive text. == , !=
CONTEXT Compares the filtering context. == , !=
CONFIDENCE Compares the confidence in the sensitive text against a threshold value. < , <=, > , >=, ==, !=
POPULATION Compares the population of the zip code against the 2010 census values. < , <=, > , >=, ==, !=

Example Policy

{
   "name": "zip-code-example",
   "identifiers": {
      "zipCode": {
         "zipCodeFilterStrategies": [
            {
               "strategy": "REDACT",
               "redactionFormat": "{{{REDACTED-%t}}}"
            }
         ]
      }
   }
}