Filter Strategies

A filter strategy defines how sensitive information identified by Philter should be handled: whether it is redacted, replaced, encrypted, or manipulated in some other fashion.

In a filter profile you specify the types of sensitive information that should be filtered. How Philter replaces sensitive information is configured separately for each type. For instance, zip codes can be truncated based on the leading digits or zip code population while phone numbers are redacted. These replacements are performed by filter strategies.

A sample filter profile containing a filter strategy is shown below. In this example, email addresses will be redacted.

{
   "name": "email-address",
   "identifiers": {
      "emailAddress": {
         "emailAddressFilterStrategies": [
            {
               "strategy": "REDACT",
               "redactionFormat": "{{{REDACTED-%t}}}"
            }
         ]
      }
   }
}

Filter Strategies

The filter strategies are described below. Each filter type can specify zero or more filter strategies. When no filter strategies are given, Philter will default to REDACT for that filter type. When multiple filter strategies are given for a single filter type, they will be applied in the order in which they are listed in the filter profile.

  • REDACT
  • CRYPTO_REPLACE
  • HASH_SHA256_REPLACE
  • RANDOM_REPLACE
  • STATIC_REPLACE
  • TRUNCATE
  • ZERO_LEADING

The REDACT Filter Strategy

The REDACT filter strategy replaces sensitive information with a given redaction format. You can put variables in the redaction format that Philter will replace when performing the redaction.

The available redaction variables are:

Redaction Variable   Description
%t                   Replaced by the type of sensitive information, so that you can tell what type of sensitive information was identified and redacted.
%l                   Replaced by the given classification for the type of sensitive information.
%v                   Replaced by the original value of the sensitive text. With %v you can annotate sensitive information instead of redacting, masking, or removing it.
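
As an illustration of how these variables behave, here is a minimal Python sketch that expands them in a redaction format. The function name and the sample type and value strings are hypothetical; this is not Philter's implementation.

```python
import re


def apply_redaction_format(redaction_format, filter_type, classification, value):
    """Expand %t, %l, and %v in a redaction format string (illustrative only)."""
    replacements = {"%t": filter_type, "%l": classification, "%v": value}
    # A single pass avoids re-expanding variables that appear inside the value.
    return re.sub(r"%[tlv]", lambda m: replacements[m.group(0)], redaction_format)


# Redact: the original value is dropped and only the type is shown.
print(apply_redaction_format("{{{REDACTED-%t}}}", "email-address", "", "jdoe@example.com"))
# → {{{REDACTED-email-address}}}

# Annotate: %v keeps the original value next to its type.
print(apply_redaction_format("%v [%t]", "email-address", "", "jdoe@example.com"))
# → jdoe@example.com [email-address]
```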

The CRYPTO_REPLACE Filter Strategy

The CRYPTO_REPLACE filter strategy replaces each identified piece of sensitive information by encrypting it using the AES encryption algorithm. To use this filter strategy, the filter profile must include the details of the encryption key as shown below:

{
   "name":"sample-profile",
   "crypto": {
     "key": "....",
     "iv": "...."
   },
   ...

In the snippet of a filter profile shown above, a crypto element is defined with a key and an initialization vector (iv). These two items are required to encrypt the sensitive information. To generate a key and iv, run the following command:

openssl enc -e -aes-256-cbc -a -salt -P

You will be prompted to enter an encryption password. Once entered, the values of the key and iv will be shown. Copy and paste those values into the filter profile as shown above.
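
The openssl command derives the key and iv from a password and prints them as hexadecimal strings. As a rough alternative sketch, values of the same sizes (a 256-bit key and a 128-bit iv) could be generated randomly in Python. The function name is hypothetical, and whether randomly generated values suit your deployment is an assumption you should verify.

```python
import secrets


def generate_key_and_iv():
    """Return a random 256-bit key and 128-bit iv as uppercase hex strings."""
    key = secrets.token_hex(32).upper()  # 32 random bytes -> 64 hex characters
    iv = secrets.token_hex(16).upper()   # 16 random bytes -> 32 hex characters
    return key, iv


key, iv = generate_key_and_iv()
print("key=" + key)
print("iv =" + iv)
```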

The HASH_SHA256_REPLACE Filter Strategy

The HASH_SHA256_REPLACE filter strategy replaces sensitive information with the SHA256 hash value of the sensitive information. To append a random salt value to each value prior to hashing, set the salt property to true. The salt value used will be returned in the explain response from Philter’s API.
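
The following Python sketch shows the general idea of hashing a value with an optional appended salt. The function name, the salt length, and the UTF-8 encoding are assumptions made for illustration; Philter's actual salt format may differ.

```python
import hashlib
import secrets
from typing import Tuple


def hash_replace(value: str, use_salt: bool = False) -> Tuple[str, str]:
    """Return (SHA-256 hex digest, salt used); the salt is empty when unused."""
    # Assumption: a random 16-byte salt, hex encoded, appended to the value.
    salt = secrets.token_hex(16) if use_salt else ""
    digest = hashlib.sha256((value + salt).encode("utf-8")).hexdigest()
    return digest, salt


digest, salt = hash_replace("123-45-6789", use_salt=True)
print(digest)  # 64 hexadecimal characters
print(salt)    # needed later to reproduce the same hash
```

Without a salt, identical values always produce identical hashes, which keeps replacements consistent across documents but makes guessable values such as SSNs easier to recover by brute force.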

The RANDOM_REPLACE Filter Strategy

The RANDOM_REPLACE filter strategy replaces the identified text with a fake value of the same type. For example, an SSN will be replaced by random text having the format ###-##-####, such as 123-45-6789. An email address will be replaced with a randomly generated email address. This strategy is available to all filter types.
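
To make the SSN example concrete, here is a minimal Python sketch that produces random text in the ###-##-#### format. The function name is hypothetical and this is not how Philter generates its replacement values.

```python
import random


def random_ssn(rng: random.Random) -> str:
    """Return nine random digits formatted as ###-##-#### (illustrative only)."""
    digits = "".join(str(rng.randint(0, 9)) for _ in range(9))
    return digits[:3] + "-" + digits[3:5] + "-" + digits[5:]


print(random_ssn(random.Random()))
```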

The STATIC_REPLACE Filter Strategy

The STATIC_REPLACE filter strategy replaces the identified text with a given static value. This strategy is available to all filter types.

The TRUNCATE Filter Strategy

Available only to zip code filters, the TRUNCATE filter strategy truncates a zip code to a chosen number of leading digits. Set truncateDigits to the number of leading digits to keep. For example, if truncateDigits is 2, the zip code 90210 will be truncated to 90***.
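
The truncation described above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes a plain five-digit zip code; it is not Philter's implementation.

```python
def truncate_zip(zip_code: str, truncate_digits: int) -> str:
    """Keep the leading digits and mask the remaining characters with asterisks."""
    if truncate_digits < 0:
        raise ValueError("truncate_digits must not be negative")
    kept = zip_code[:truncate_digits]
    return kept + "*" * (len(zip_code) - len(kept))


print(truncate_zip("90210", 2))
# → 90***
```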

The ZERO_LEADING Filter Strategy

Available only to zip code filters, the ZERO_LEADING filter strategy changes the first 3 digits of a zip code to 0. For example, the zip code 90210 will be changed to 00010.
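
A minimal Python sketch of this transformation, assuming a plain five-digit zip code. The function name and the digits parameter are hypothetical and are not part of Philter.

```python
def zero_leading(zip_code: str, digits: int = 3) -> str:
    """Replace the first `digits` characters of a zip code with zeros."""
    count = min(digits, len(zip_code))
    return "0" * count + zip_code[count:]


print(zero_leading("90210"))
# → 00010
```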