Philter 1.5.0

Happy Friday! We are in the process of publishing Philter 1.5.0. Philter identifies and removes sensitive information in text. Look for Philter 1.5.0 to be available on the cloud marketplaces soon.

This version has a few new features in addition to minor improvements and fixes. The new features are described below.

New “Section” Filter

Philter 1.5.0 includes a new filter type called a “Section.” This filter type lets you specify patterns that indicate the start and end of a section of text. For example, if your text has sentences or even paragraphs denoted with some marker, you can use the Section filter to redact those sentences or paragraphs. You just give the filter the regular expression patterns for the start and end markings. We have added the Section filter to the filter profiles documentation.

Amazon S3 to Store Filter Profiles

We have added the ability to store the filter profiles in an Amazon S3 bucket. The benefits of this is that now filter profiles can be shared across multiple instances of Philter. Previously, if you were running two instances of Philter you would have to update the filter profiles on each instance. By storing the filter profiles in S3 you can just update the filter profiles once via Philter’s API. This does require a cache. The cache stores the filter profiles to lower the latency and reduce the number of calls to S3. (More on the cache below.)

We have published some CloudFormation and Terraform scripts to help with creating this architecture on GitHub.

Consolidated Caches

Philter previously used caches for the random anonymization values. With the introduction of using a cache for storing the profiles in S3 we have consolidated those caches into a single cache. Because of this, the configuration settings have been slightly renamed to reflect this. We have updated Philter’s documentation with the renamed properties. Having a single cache means there is less to configure and fewer required resources.

If you are upgrading from a previous version you will need to change to the new cache property names.

Changeable Model File

The model file used by Philter can now be set in Philter’s Check out Philter’s documentation for the details. By being able to set the model being used you can now select which model is most applicable to your use-case and domain.