Philter identifies and removes or anonymizes Personally Identifiable Information (PII) and Protected Health Information (PHI) in natural language text.

Capabilities of Philter

  • Removal of PHI from text. Philter uses natural language processing to identify protected health information in text.
  • Anonymization (de-identification) of PHI in text. Philter can replace PII and PHI with similar but random values. For example, names are replaced with random names and phone numbers with random phone numbers.
  • Detection of PHI in text. (Answers the question, “Is there PHI in my text?”) Philter analyzes your text and assigns it a numeric score indicating the likelihood that it contains protected health information.
  • Configurable sensitivity level to control Philter’s detection and removal.

How Philter Works

Philter uses a multi-phased approach to identify and remove PII and PHI from the text it receives. Some information, such as social security numbers and phone numbers, is recognizable through patterns. Other types of PHI are harder to detect because they do not follow patterns. Information such as patient names is identified through trained models created specifically to detect these types of information.
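
The pattern-based phase can be pictured with a minimal sketch. The regular expressions and the find_spans function below are hypothetical simplifications written for illustration; they are not Philter's own rules or API, and real identifiers come in more formats than these two patterns cover.

```python
import re

# Hypothetical, simplified patterns for illustration only; not Philter's rules.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_spans(text):
    """Return (type, start, end, value) for every pattern match, in text order."""
    spans = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((kind, match.start(), match.end(), match.group()))
    # Sort by start offset so the spans follow their order in the text.
    return sorted(spans, key=lambda span: span[1])


if __name__ == "__main__":
    sample = "Call 304-555-0142 about SSN 123-45-6789."
    for span in find_spans(sample):
        print(span)
```

A patient name has no comparable fixed shape, which is why that kind of information is left to trained models rather than patterns.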

Please note that model-based detection is not an exact science, and Philter’s accuracy and performance should be evaluated against your data prior to a production deployment. Adjusting Philter’s configured sensitivity can have a significant impact on accuracy.

Text identified as PII or PHI, whether by pattern or by model, is replaced with user-configured placeholder text or anonymized random values.
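
The two replacement styles can be sketched as follows, using a social security number as the example. The pattern, the placeholder format, and the redact and anonymize functions are assumptions made for illustration and do not reflect Philter's actual configuration or API.

```python
import random
import re

# Hypothetical pattern and placeholder format, for illustration only.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text, placeholder="[REDACTED-SSN]"):
    """Replace each matched value with fixed placeholder text."""
    return SSN.sub(placeholder, text)


def anonymize(text, rng):
    """Replace each matched value with a random value of the same shape."""

    def fake_ssn(_match):
        return "{:03d}-{:02d}-{:04d}".format(
            rng.randint(100, 899), rng.randint(10, 99), rng.randint(1000, 9999)
        )

    return SSN.sub(fake_ssn, text)


if __name__ == "__main__":
    sample = "SSN 123-45-6789 on file"
    print(redact(sample))
    print(anonymize(sample, random.Random()))
```

Placeholder text makes it obvious that something was removed, while a random value of the same shape keeps the text looking natural to downstream readers and tools.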


Philter supports processing streaming text from Apache Kafka. When deployed on a cluster, Philter can efficiently process large amounts of text. Additionally, Philter can be used in Apache NiFi dataflows that process PII and PHI text.

Consistent Anonymization

Philter’s PHI anonymization can replace PHI with realistic but random values consistently across documents and contexts. With consistent anonymization, your documents keep their meaning and remain useful for other applications.
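
One common way to get consistent replacements is to derive the replacement from a keyed hash of the original value, so the same input always maps to the same output. The sketch below illustrates that idea only; how Philter implements consistent anonymization is not described here, and the consistent_name function, the name list, and the secret key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement names; a real pool would be far larger.
REPLACEMENT_NAMES = ["Alex Morgan", "Jamie Rivera", "Taylor Brooks", "Casey Nguyen"]


def consistent_name(original, secret_key):
    """Map a name to the same replacement every time for a given key."""
    # A keyed hash keeps the mapping stable without storing a lookup table.
    normalized = original.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


if __name__ == "__main__":
    key = b"example-secret-key"
    # The same patient name in two documents gets the same replacement.
    print(consistent_name("John Smith", key))
    print(consistent_name("John Smith", key))
```

With a pool this small, different names will often share a replacement; the sketch is meant to show the consistency property, not to provide a usable anonymizer.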

Deploying Philter

Philter can be launched on Amazon Web Services and Microsoft Azure via their respective marketplaces. For other deployments or managed services, please contact us and we can assist. Please note that additional configuration is required for compliance before using Philter in a HIPAA-controlled environment. Refer to this guide or contact us for more information.

 Launch Philter in your cloud.