Philter 1.6.0

Philter 1.6.0 will be available soon through the cloud marketplaces and DockerHub. This is probably the most significant release of Philter since the first release, 1.0.0.

Version 1.6.0 has many new features and a few fixes. Instead of covering the entire release in a single blog post, we summarize the new features below and will follow up over the next few days with separate posts that go more in depth on each of the significant ones. Check out Philter’s Release Notes.

Over the next few days we will be making updates to the Philter SDKs to accommodate the new features in Philter 1.6.0.

Deploy Philter

Launch Philter on AWS (Philter version 1.6.0)
Launch Philter on Azure (Philter version 1.5.0)
Launch Philter on Google Cloud (Philter version 1.6.0)

New Features in Philter 1.6.0

The following are summaries of the new features added in Philter 1.6.0.

Alerts

The new alerts feature in Philter 1.6.0 lets you have Philter generate an alert whenever a given filter condition is satisfied. For example, if you have a filter condition that matches only the person’s name “John Smith”, Philter will generate an alert each time that condition is satisfied. The alert is stored in Philter and can be retrieved and deleted using Philter’s new Alerts API. Details of the alerts feature are in Philter’s User’s Guide.

Span Disambiguation

Sometimes a piece of sensitive information could be one of a few filter types, such as an SSN, a phone number, or a driver’s license number. The span disambiguation feature works to determine which of the potential filter types is most appropriate by analyzing the context of the sensitive information. Philter uses various natural language processing (NLP) techniques to determine which filter type the sensitive information most closely resembles. Because of the techniques used, the more text Philter sees the more accurate the span disambiguation will become.

Span disambiguation is documented in Philter’s User’s Guide.
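
To make the idea concrete, here is a toy sketch of context-based disambiguation. It is not Philter's implementation, which this post does not describe. The class name, method names, and the simple word-count scoring are all assumptions made for illustration. The sketch records which words have appeared near each filter type and picks the candidate type whose recorded context best overlaps the words around the new span, so it has more to go on as it observes more text.

```python
from collections import Counter, defaultdict


class ContextDisambiguator:
    """Toy model (hypothetical, not Philter's algorithm)."""

    def __init__(self):
        # Maps a filter type to counts of words seen near spans of that type.
        self.context_counts = defaultdict(Counter)

    def observe(self, filter_type, context_words):
        """Record the words that surrounded a span of a known filter type."""
        self.context_counts[filter_type].update(w.lower() for w in context_words)

    def disambiguate(self, candidate_types, context_words):
        """Pick the candidate type whose recorded context overlaps the most."""
        words = [w.lower() for w in context_words]

        def score(filter_type):
            counts = self.context_counts[filter_type]
            return sum(counts[w] for w in words)

        # On a tie, max() returns the first candidate in the list.
        return max(candidate_types, key=score)


model = ContextDisambiguator()
model.observe("ssn", ["social", "security", "number"])
model.observe("phone-number", ["call", "me", "at"])
print(model.disambiguate(["ssn", "phone-number"], ["please", "call", "at"]))
```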

New Filters: Bitcoin Address, IBAN Codes, US Passport Numbers, US Driver’s License Numbers

Philter 1.6.0 contains several new filter types:

  • Bitcoin Address – Identify bitcoin addresses.
  • IBAN Codes – Identify International Bank Account Numbers.
  • US Passport Numbers – Identify US passport numbers issued since 1981.
  • US Driver’s License Numbers – Identify US driver’s license numbers for all 50 states.

Each of these new filters is available through filter profiles.
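
As background on what makes a string a plausible IBAN, the sketch below applies the standard IBAN check-digit test (mod 97). Whether Philter's IBAN filter performs this check is not stated in this post, so treat this as a general illustration. The function name is hypothetical, and country-specific length rules are not checked.

```python
def is_valid_iban(iban: str) -> bool:
    """Check an IBAN's overall shape and its mod-97 check digits."""
    s = iban.replace(" ", "").upper()
    # IBANs are 15 to 34 ASCII letters and digits.
    if not (15 <= len(s) <= 34) or not (s.isascii() and s.isalnum()):
        return False
    # Move the country code and check digits to the end.
    rearranged = s[4:] + s[:4]
    # Replace each letter with two digits (A=10 ... Z=35); digits stay as-is.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    # A valid IBAN leaves a remainder of 1 when divided by 97.
    return int(digits) % 97 == 1


print(is_valid_iban("GB82 WEST 1234 5698 7654 32"))
```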

New Replacement Strategy: SHA-256 with random salt values

We previously added the ability to encrypt sensitive information in text. In Philter 1.6.0 we have added the ability to hash sensitive information using SHA-256. When the hash replacement strategy is selected, each piece of sensitive text is replaced by its SHA-256 hash. Additionally, the hash replacement strategy has a “salt” property that, when enabled, causes Philter to append a random salt value to each piece of sensitive text prior to hashing. The random salt value will be included in the filter response.
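
The following is a minimal sketch of salted SHA-256 replacement, not Philter's actual code. The function name, the 16-byte hex salt, and the return shape are assumptions. It appends the salt to the sensitive text before hashing, as described above, and returns the salt alongside the hash.

```python
import hashlib
import secrets
from typing import Tuple


def hash_replacement(sensitive_text: str, use_salt: bool = False) -> Tuple[str, str]:
    """Return (replacement, salt); the salt is empty when salting is off."""
    # Assumption: 16 random bytes rendered as hex. Philter's salt format may differ.
    salt = secrets.token_hex(16) if use_salt else ""
    # The salt is appended to the sensitive text before hashing.
    digest = hashlib.sha256((sensitive_text + salt).encode("utf-8")).hexdigest()
    return digest, salt


replacement, salt = hash_replacement("123-45-6789", use_salt=True)
print(replacement, salt)
```

Without a salt, the same input always produces the same hash, so repeated values remain linkable across documents. With a random salt per value, each replacement differs.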

Custom Dictionary Filters Can Now Use an External Dictionary File

Philter’s custom dictionary filter lets you specify a list of terms to identify as being sensitive. Prior to Philter 1.6.0, this list of terms had to be provided in the filter profile. With a long list it did not take long for the filter profile to become hard to read and even harder to manage. Now, instead of providing a list of terms in the filter profile you can simply provide the full path to a file that contains a list of terms. This keeps the filter profile compact and easier to manage. You can specify as many dictionary files as you need to and Philter will combine the terms when the filter profile is loaded.
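
Combining several dictionary files can be pictured with the short sketch below. It assumes one term per line, which this post does not specify, and the function name is hypothetical. It is an illustration of the merge step, not Philter's loader.

```python
from pathlib import Path


def load_terms(paths):
    """Combine the terms from several dictionary files into one set."""
    terms = set()
    for path in paths:
        # Assumption: each file holds one term per line, encoded as UTF-8.
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            term = line.strip()
            if term:  # skip blank lines
                terms.add(term)
    return terms
```

Using a set means a term that appears in more than one file is kept only once.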

Custom Dictionary Filters Now Have a “fuzzy” Property

Philter’s custom dictionary filter previously always used fuzzy detection. (Fuzzy detection works like a spell checker: a misspelled name such as “Davd” can be identified as “David.”) New in Philter 1.6.0 is a property on the custom dictionary filter called “fuzzy.” This property controls whether or not fuzzy detection is enabled. It was added because turning fuzzy detection off gives a significant performance increase: when fuzzy detection is disabled, Philter uses an optimized data structure to identify the terms. If you do not need fuzzy detection, we recommend disabling it to take advantage of the performance gain.
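
The trade-off between exact and fuzzy matching can be sketched as follows. This is an illustration only: the post does not say which data structure or fuzzy algorithm Philter uses, so the hash set and Python's difflib are stand-ins, and the function names and the 0.8 cutoff are assumptions.

```python
import difflib

TERMS = ["David", "Jennifer", "Michael"]
EXACT_TERMS = set(TERMS)  # hash set: one lookup per token


def match_exact(token):
    """Return the token if it is a dictionary term, else None."""
    return token if token in EXACT_TERMS else None


def match_fuzzy(token, cutoff=0.8):
    """Return the closest dictionary term above the cutoff, else None."""
    # difflib compares the token against every term, which costs more
    # than a single set lookup.
    matches = difflib.get_close_matches(token, TERMS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_exact("Davd"))
print(match_fuzzy("Davd"))
```

The exact matcher misses the misspelling, while the fuzzy matcher finds it at the cost of comparing against each term.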

Changed “Type” to “Classification”

A few filter types had additional information that further described the sensitive information. For instance, the entity filter had a “type” property that identified the kind of entity, such as “PER” for person. We have renamed the property “type” to “classification” for clarity and uniformity. If any of your filter conditions use “type”, be sure to update your filter profiles to use “classification” instead. It is a drop-in replacement: simply change “type” to “classification.”

Added Filter Condition for “Classification”

Philter 1.6.0 adds the ability to have a filter condition on “classification.”

Redis Cache Can Now Use a Self-Signed SSL Certificate

Philter 1.6.0 can now connect to a Redis cache that is using a self-signed certificate. New configuration settings for the truststore and keystore allow for trusting the self-signed certificate.

Fixes and Improvements in Philter 1.6.0

The following is a list of fixes and improvements made in Philter 1.6.0.

Fixed Potential MAC Address Issue

We found and fixed a potential issue where a MAC Address might not be identified correctly.

Fixed Potential Ignore Issue with Custom Dictionary Filters

We found and fixed a potential issue where a term in a custom dictionary that is also a term in an ignore list might not be ignored correctly.

Fixed Potential Issue with Credit Card Number Validation

We found and fixed a potential issue where a credit card number might not be validated correctly. This only applies when credit card validation is enabled.
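
For readers unfamiliar with credit card validation, the usual checksum is the Luhn algorithm, sketched below. This post does not say exactly what Philter's validation does, so this is a general illustration with a hypothetical function name, not Philter's code.

```python
def passes_luhn(number: str) -> bool:
    """Return True if the digits pass the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk the digits from right to left, doubling every second one.
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(passes_luhn("4111 1111 1111 1111"))
```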


Jeff Zemerick is the founder of Mountain Fog. He is a 10x certified AWS engineer, current chair of the Apache OpenNLP project, and experienced software engineer.

You can contact Jeff at jeff.zemerick@mtnfog.com or on LinkedIn.