Filter Profiles

Philter identifies and manipulates PHI and PII using filters. Each filter operates on a single type of PHI or PII, such as phone numbers or email addresses. Filters are enabled and configured through filter profiles.

A filter profile is a JSON document that defines which filters are enabled and how each is configured. You can create as many filter profiles as you need and select which one to apply when calling Philter’s REST API. For example, if you process several collections of documents and each collection requires its PHI/PII to be handled differently, you can create one filter profile per collection and select the matching profile when processing documents from that collection.

Filter profiles must be named with a .json extension and placed in the ./profiles/ directory under the installation directory. The location of this directory is configurable through a Philter property.

Example Filter Profile

The following example filter profile enables a credit card filter, an IP address filter, and a zip code filter. It also specifies how each type of PHI/PII is manipulated: either by redaction, using a redaction format, or by a filter-specific method, such as truncating zip code digits.
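
As a rough sketch of such a profile, the Python snippet below builds one as a dictionary and prints it as JSON. Only the "name", "strategy", and "redactionFormat" properties are named in this article. Every other key (identifiers, creditCard, ipAddress, zipCode, the ...FilterStrategies lists, truncateDigits) and the TRUNCATE strategy value are assumptions made for illustration and should be checked against the filter profile JSON schema.

```python
import json

# Sketch of a filter profile. Only "name", "strategy", and "redactionFormat"
# are named in this article. All other keys and the TRUNCATE strategy are
# ASSUMPTIONS and must be checked against the filter profile JSON schema.
example_profile = {
    "name": "default",
    "identifiers": {  # assumed container key for the enabled filters
        "creditCard": {  # assumed key for the credit card filter
            "creditCardFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        },
        "ipAddress": {  # assumed key for the IP address filter
            "ipAddressFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        },
        "zipCode": {  # assumed key for the zip code filter
            "zipCodeFilterStrategies": [
                # assumed strategy and parameter for truncating zip code digits
                {"strategy": "TRUNCATE", "truncateDigits": 2}
            ]
        },
    },
}

# Serialize the dictionary to the JSON text that would be saved as a .json file.
profile_json = json.dumps(example_profile, indent=2)
print(profile_json)
```

Saving the printed JSON to a file with a .json extension in the profiles directory would follow the naming convention described earlier.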


Structure of a Filter Profile

A filter profile JSON schema is available to assist with creating and editing filter profiles.


The first thing to notice is that the filter profile shown above has a property called “name.” Its value uniquely identifies the filter profile, so make sure each filter profile has a unique name. Philter uses the name to determine which filter profile to apply, so a short but meaningful name is recommended.
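
Because every profile needs a unique name, it can be useful to check a profiles directory before deploying it. The following is a minimal sketch of such a check. The load_profiles helper is hypothetical and is not part of Philter; it relies only on the .json extension and the "name" property described in this article.

```python
import json
from pathlib import Path


def load_profiles(profiles_dir):
    """Return {profile name: parsed profile} for every .json file in profiles_dir.

    Hypothetical helper, not part of Philter. Raises ValueError when a profile
    has no "name" property or when two files declare the same name.
    """
    profiles = {}
    for path in sorted(Path(profiles_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            profile = json.load(handle)
        name = profile.get("name")
        if not name:
            raise ValueError(f"{path.name} has no 'name' property")
        if name in profiles:
            raise ValueError(f"duplicate profile name {name!r} in {path.name}")
        profiles[name] = profile
    return profiles
```

Calling load_profiles("./profiles") returns the parsed profiles keyed by name, or raises an error that identifies the offending file.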

Identifier Strategies

Filter configuration varies by filter, but all filters share some common properties:

  • strategy – Valid values are REDACT and REPLACE.
  • redactionFormat – If the strategy is REDACT, the text is redacted according to the given pattern. The placeholder %t is replaced by the identifier type. For instance, a credit card number redacted with the format {{{REDACTED-%t}}} would be replaced by {{{REDACTED-credit-card}}}.
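
The %t substitution can be illustrated with a few lines of Python. This is only a sketch of the placeholder behavior described above, not Philter's implementation. The function name is hypothetical, and the type label "credit-card" is assumed.

```python
def apply_redaction_format(redaction_format, identifier_type):
    """Replace the %t placeholder in a redaction format with the identifier type."""
    return redaction_format.replace("%t", identifier_type)


# "credit-card" is an assumed type label used for illustration.
print(apply_redaction_format("{{{REDACTED-%t}}}", "credit-card"))
# prints {{{REDACTED-credit-card}}}
```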

Non-deterministic filters, such as the NLP-based filters for persons and locations, include a severity property:

  • severity – Valid values are LOW, MEDIUM, and HIGH. A value of LOW makes the filter more stringent in its detection, while a value of HIGH makes it less stringent. For example, the severity can affect the degree to which misspelled words are identified as PHI or PII: the name “Johnson” misspelled as “Johnsn” may be identified at LOW severity, while the misspelling “Jonsn” may be identified at HIGH severity.
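
One way to picture the effect of severity is as a tolerance on how far a word may differ from a known name. The sketch below uses edit distance with an assumed threshold per severity level. This article does not describe how Philter actually matches misspellings, so the thresholds, the matches function, and the use of edit distance are all illustrative assumptions.

```python
# Assumed mapping from severity to the maximum tolerated edit distance.
MAX_DISTANCE = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def matches(candidate, known_name, severity):
    """Return True if candidate is within the tolerance assumed for severity."""
    distance = edit_distance(candidate.lower(), known_name.lower())
    return distance <= MAX_DISTANCE[severity]


print(matches("Johnsn", "Johnson", "LOW"))   # one missing letter
print(matches("Jonsn", "Johnson", "LOW"))    # two missing letters
print(matches("Jonsn", "Johnson", "HIGH"))
```

Under these assumed thresholds, “Johnsn” is within tolerance at LOW, while “Jonsn” is rejected at LOW and accepted at HIGH, which mirrors the example above.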