Filter sensitive information from text

Philter finds, identifies, and removes sensitive information, such as PHI and PII, from natural language text. Run it in the cloud or in containers.



Philter® FAQ

Frequently asked questions about Philter. For any questions not answered here, please contact us.

What is Philter?

Philter is an application that finds, identifies, and removes sensitive information, such as Personally Identifiable Information (PII) and Protected Health Information (PHI), from natural language text. Philter runs in your private cloud, so your sensitive data never has to traverse the public internet. Use Philter’s API to process text from virtually any system or process. Philter was designed from the ground up to be a key component of an effective data loss prevention strategy.

What types of sensitive information can Philter identify?

Philter can currently identify: Ages, Bitcoin Addresses, Cities, Counties, Credit Cards, Custom Dictionaries, Custom Identifiers (medical record numbers, financial transaction numbers), Dates, Driver's License Numbers, Email Addresses, IBAN Codes, IP Addresses, MAC Addresses, Passport Numbers, Persons' Names, Phone/Fax Numbers, SSNs and TINs, Shipping Tracking Numbers, States, URLs, VINs, and Zip Codes.

How does Philter know what kinds of sensitive information to find?

Philter uses what we call filter profiles. A filter profile is a file that you give to Philter to tell it the types of sensitive information you want to identify. A filter profile lists the types of sensitive information (phone numbers, names, etc.), when to remove them, and how to remove them. Filter profiles are detailed in Philter’s User’s Guide. You can have as many filter profiles as you need, and you can select which one to use for each filter request.
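To make the idea concrete, here is a minimal Python sketch of how several profiles, each listing what to find, when to remove it, and how, could be selected per request. This is an illustration only and does not show Philter's actual filter profile file format, which is documented in the User's Guide. The profile names, rule keys, and helper functions are all hypothetical.

```python
# Hypothetical in-memory model of filter profiles. This is NOT Philter's
# file format; names and keys are invented for illustration.
PROFILES = {
    "clinical-notes": [
        {"type": "person-name", "when": "always", "how": "redact"},
        {"type": "phone-number", "when": "always", "how": "redact"},
    ],
    "support-tickets": [
        {"type": "email-address", "when": "always", "how": "redact"},
    ],
}


def rules_for_request(profile_name):
    """Return the rules of the profile selected for one filter request."""
    try:
        return PROFILES[profile_name]
    except KeyError:
        raise ValueError(f"unknown filter profile: {profile_name!r}") from None


def types_to_find(profile_name):
    """List the sensitive information types the selected profile asks for."""
    return [rule["type"] for rule in rules_for_request(profile_name)]


print(types_to_find("clinical-notes"))
```

Each request names one profile, so the same deployment could apply different rules to different kinds of text.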

Philter sounds great. How do I deploy it?

Philter can be deployed in your cloud with just a few clicks. Click here to get started.

How is Philter licensed?

How Philter is licensed depends on how Philter is deployed:

Option 1) Deployed as Docker Containers or On-Premises VMs

When Philter is deployed as Docker containers or on an on-premises VM, a license key is required. Upon request, we will provide a 30-day license key for evaluation and testing. To continue using Philter after 30 days, a license must be purchased. License pricing is based on the expected average number of concurrently running instances of Philter over a 30-day period.

Click here to request a license quote.

Option 2) Deployed through the AWS, Azure, and Google Cloud Marketplaces

When Philter is deployed through the AWS, Azure, or Google Cloud Platform marketplace, there may be a free trial period during which there is no charge for the Philter software. (There may still be charges by the cloud platform for the underlying cloud platform resources.)

Once the Philter trial period ends, billing is handled by the cloud platform for a seamless transition. Pricing and the length of the free trial period may vary between cloud platforms, so please refer to each cloud marketplace for up-to-date details.

The cloud platform may offer discounted annual subscription plans in addition to the standard monthly billing.

No license key is required, as the cloud provider handles the licensing automatically.

How do I send text to Philter to be filtered?

There are a few ways.

Using the API directly

Philter’s HTTP-based API accepts text to process and returns the processed text, which allows Philter to be integrated into many types of systems and processes. See the API in Philter’s User’s Guide for more information. Here is an example that sends a text file to Philter for processing:

curl -k -X POST "https://localhost:8080/api/filter?c=context" --data-binary @file.txt -H "Content-Type: text/plain"
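The same request can also be sent from a script. The following is a minimal Python sketch using only the standard library. The endpoint path, the c query parameter, and the text/plain content type are taken from the curl example above. The function names are hypothetical, and the sketch assumes the response body is the filtered text encoded as UTF-8.

```python
import ssl
import urllib.request
from urllib.parse import urlencode


def build_filter_request(text, base_url="https://localhost:8080", context="context"):
    """Build a POST request equivalent to the curl example."""
    url = f"{base_url}/api/filter?{urlencode({'c': context})}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def filter_text(text, verify_tls=True, **kwargs):
    """Send text to Philter and return the response body as a string.

    Assumes the response body is UTF-8 text.
    """
    request = build_filter_request(text, **kwargs)
    ssl_context = ssl.create_default_context()
    if not verify_tls:
        # Mirrors curl's -k flag. Use only with self-signed test certificates.
        ssl_context.check_hostname = False
        ssl_context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=ssl_context) as response:
        return response.read().decode("utf-8")
```

With a Philter instance listening on localhost:8080 and a self-signed certificate, a call such as filter_text("His SSN is 123-45-6789.", verify_tls=False) would send the text for filtering. Certificate verification is on by default, so the -k behavior has to be requested explicitly.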

Using the Philter CLI

You can also use the Philter CLI. This small application provides convenient access to Philter’s API.

Using open source SDKs

There are also open source Philter SDKs for Java, .NET, and Go.

Is Philter guaranteed to find 100% of all sensitive information in my text?

No. Philter uses state-of-the-art natural language processing (NLP) technology to identify sensitive information in text. These NLP methods use models trained on a large corpus of text. The process of applying the model to text is non-deterministic, and many factors can affect the identification of sensitive information in your text, such as how similar your text is to the corpus that was used to train the model, how the text is formatted, and the length of the text. For these reasons, it is important that you assess Philter’s performance on your own text before using it in a production system.

The confidence value in the filter strategy condition can be used to tune the NLP engine’s detection. Each identified entity has an associated confidence score between 0 and 100 indicating the model’s estimate that the text is actually an entity, with 0 being the lowest confidence and 100 the highest. The condition lets you filter out entities based on that score. For example, the condition confidence > 75 means that entities with a confidence value of 75 or less will be ignored and entities with a confidence value greater than 75 will be filtered from the text.
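The effect of a confidence condition can be sketched in a few lines of Python. This is an illustration of the threshold logic only, not Philter's implementation. The Entity class, the function name, and the sample entities and scores are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str
    entity_type: str
    confidence: float  # 0 (lowest) to 100 (highest)


def apply_confidence_condition(entities, threshold=75):
    """Split entities according to the condition `confidence > threshold`."""
    filtered = [e for e in entities if e.confidence > threshold]
    ignored = [e for e in entities if not e.confidence > threshold]
    return filtered, ignored


# Made-up detections for illustration.
entities = [
    Entity("George Washington", "person", 92.0),
    Entity("Washington", "person", 75.0),
    Entity("May", "person", 40.5),
]

filtered, ignored = apply_confidence_condition(entities)
print([e.text for e in filtered])  # ['George Washington']
print([e.text for e in ignored])   # ['Washington', 'May']
```

Note that an entity scoring exactly 75 does not satisfy confidence > 75, so it is ignored. Lowering the threshold catches more candidates at the cost of more false positives.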

How does Philter compare with similar applications and services?

Comparing Philter with other applications and services, such as Amazon Comprehend and the Google Data Loss Prevention (DLP) API, is difficult because Philter is designed differently. Philter goes beyond simple identification of values in text. It includes additional features such as support for disambiguation, ignore lists, and value replacement and anonymization. These features may be possible with the other applications and services, but you would have to implement them yourself on top of those products.

Philter is not a software-as-a-service where its API is managed by us and consumed by you. Instead, Philter is an application that you deploy into your environment and interact with its API through connections on your network. With this design your text never has to leave your network. We believe this to be more secure than using a third-party API product where your text may traverse many networks during processing.

Philter also differentiates itself from other services in its flexibility. With Philter you can use your own models if you choose to do so and you have full control over the filtering process to tailor it to your specific needs.

What platforms are supported by Philter?

Philter supports several platforms. Which platform is used may be determined by your choice of cloud provider:

  • AWS Marketplace – Amazon Linux
  • Microsoft Azure Marketplace – CentOS
  • Google Cloud Marketplace – CentOS
  • On-premises – RHEL and CentOS
  • Docker Containers

See the Philter Availability for a full listing.

Do you offer managed Philter services?

We do, but currently only for AWS customers. Please see our Managed Services for more information.