Philter® Release Notes

These release notes list what’s new, what’s changed, and any known outstanding issues in each version of Philter. Please contact us for clarification on any of the items listed on this page.

The release notes on this page use the following notation:

  • “New” indicates a feature or capability that has been added to the version.
  • “Tweak” denotes a minor change to a feature or capability.
  • “Fix” describes a change to a feature or capability that corrects a difference between its expected and observed behavior.

Version 1.4.0 – TBD

  • New: Added optional basic authentication.
  • New: Added token condition to NerFilterStrategy. You can now write a condition on the token itself.
  • New: Added confidence condition to each type of filter strategy.
  • Tweak: Ignored spans are now dropped before overlapping spans are dropped.
  • Tweak: Docker container now uses Java 11.
  • Fix: Fixed potential issue with filtering state abbreviations.
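
For the optional basic authentication item above, a client would typically send a standard HTTP Basic Authorization header. The following Python sketch shows how such a header is built and attached to a request. The username, password, and base URL are placeholders, and the plain-text request body is an assumption; these notes do not describe how authentication is enabled or configured in Philter.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header (RFC 7617)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_filter_request(
    base_url: str, text: str, username: str, password: str
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /api/filter with credentials.

    Assumption: the endpoint accepts the text to filter as a plain-text body.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/filter",
        data=text.encode("utf-8"),
        method="POST",
    )
    request.add_header("Content-Type", "text/plain")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

The prepared request can then be sent with urllib.request.urlopen. Basic credentials are only encoded, not encrypted, so they should be sent over an encrypted connection.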

Version 1.3.1 – February 20, 2020

Release Announcement Post

  • New: Added CRYPTO_REPLACE redaction option to encrypt sensitive values.
  • New: Added %v redaction variable to be substituted for the original value of the sensitive text. With %v you can now annotate sensitive information instead of masking or removing it.
  • New: Added filter condition based on the context. A filter condition can now depend on the value of the context.
  • New: Added filter for network MAC addresses.
  • New: Added support for TINs (Tax Identification Numbers) to the SSN filter.
  • New: Now requires Java 11.
  • New: Clients can now set the document ID on each filter request instead of always having it auto-generated. This allows a document to be split across multiple requests to increase throughput.
  • New: Philter Enterprise Edition is now certified for Red Hat Enterprise Linux 8.
  • Tweak: GCP image is now built on CentOS 8.
  • Tweak: Credit card filter now supports credit card numbers containing dashes and spaces.
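
To illustrate the %v redaction variable introduced in this version, here is a minimal Python sketch of the substitution it describes: each %v in a redaction format is replaced with the original sensitive value, so the value is annotated rather than masked. The function name and the example format strings are hypothetical and are not taken from Philter's configuration.

```python
def apply_redaction_format(redaction_format: str, original_value: str) -> str:
    """Replace every %v in the format with the original sensitive value."""
    return redaction_format.replace("%v", original_value)


# Hypothetical format that annotates the value instead of masking it.
annotated = apply_redaction_format("<ssn>%v</ssn>", "123-45-6789")

# A format without %v masks the value entirely.
masked = apply_redaction_format("REDACTED", "123-45-6789")
```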

Version 1.3.0 – January 28, 2020

Release Announcement Post

This release focuses mainly on improving performance and error handling. The only new functionality is in URL filtering.

  • New: Now supports identifying URLs that use an IP address instead of a domain name.
  • New: Added option to URL filtering to require a URL to begin with http, https, or www.
  • Tweak: Removed trailing spaces from filtered values when they exist.
  • Tweak: Improved performance on API requests.
  • Tweak: Improved performance for larger documents.
  • Tweak: Changed format of generated document ID to be more random.
  • Tweak: Improved error handling if an API request to filter is not successful.
  • Tweak: Improved handling of month names that appear on their own.
  • Tweak: When no filter strategies are specified, the default action will be to redact.
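
The URL filtering option in this version requires a URL to begin with http, https, or www. The Python sketch below shows one way such a check could work. It is an illustration only: the function names are hypothetical, and matching on "http://", "https://", and "www." (case-insensitively) is an assumption about how the prefixes are interpreted.

```python
from typing import Iterable, List

# Assumed interpretation of "begins with http, https, or www".
REQUIRED_PREFIXES = ("http://", "https://", "www.")


def has_required_prefix(candidate: str) -> bool:
    """Return True if the candidate starts with one of the required prefixes."""
    return candidate.lower().startswith(REQUIRED_PREFIXES)


def select_urls(candidates: Iterable[str], require_prefix: bool) -> List[str]:
    """Keep all candidates, or only prefixed ones when the option is on."""
    if not require_prefix:
        return list(candidates)
    return [c for c in candidates if has_required_prefix(c)]
```

With the option enabled, a candidate such as example.com/page or an address-only URL such as 192.168.1.1/index.html would be skipped in this sketch, which trades some recall for fewer false positives.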

Version 1.2.0 – January 16, 2020

Release Announcement Post

  • New: Added per-filter ignore lists for items that should never be removed. Each filter can have its own ignore list.
  • New: Added support for encrypted connections to Redis.
  • New: Added enabled property to individual filters in a filter profile. Filters having enabled=false will not be executed.
  • New: Added option to filter profile credit cards to also include invalid credit card numbers. (These are numbers that match the credit card pattern but are not valid per the card’s number algorithm.)
  • New: Added option to filter profile to require that dates be valid. (For example, February 30 is not a valid date and would be excluded when this option is enabled.)
  • New: Added option to filter profile for NER to remove punctuation prior to processing.
  • Fix: Fixed issue where conditionals may not be applied to NER entities.
  • Tweak: Added Philter version to status API response.
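
The valid-date option described above can be illustrated with a short Python sketch that checks whether a year, month, and day form a real calendar date. This shows the idea only and is not Philter's implementation; the function name is hypothetical.

```python
from datetime import date


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Return True if the values form a real calendar date."""
    try:
        date(year, month, day)
    except ValueError:
        # Raised for impossible dates such as February 30.
        return False
    return True
```

With this check, February 30 is rejected in any year, while February 29 is accepted only in leap years such as 2020.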

Version 1.1.0 – December 15, 2019

Release Announcement Post

  • New: Store changed from MongoDB to Elasticsearch for improved querying capabilities.
  • New: Added “auto” setting for distance to automatically calculate appropriate distance (fuzziness) of identified text.
  • New: Added ignore lists to filter profiles to support having a list of terms that are never filtered.
  • New: Added support for using custom dictionaries in filter profiles. (You can now specify your own list of terms to be filtered.)
  • New: Added an explanation endpoint that describes how the identified PII/PHI was detected and filtered.
  • New: Added metrics per individual filter type.
  • New: Added “prefix” property for metrics to allow for improved metric organization.
  • New: Filter sensitivity level is now applied to NER entities.
  • New: Added API for managing filter profiles.
  • Fix: Fixed filter profile issue where appropriate filtering strategy may not be applied.
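
As a rough illustration of the ignore lists added in this version, the Python sketch below drops identified spans whose text appears in an ignore list, so those terms are left in the output unfiltered. The Span type, the function name, and the case-insensitive comparison are all assumptions made for this example and do not reflect Philter's internal data model.

```python
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    """Hypothetical identified span: character offsets plus matched text."""
    start: int
    end: int
    text: str


def drop_ignored(spans: Iterable[Span], ignore_list: Iterable[str]) -> List[Span]:
    """Return only the spans whose text is not on the ignore list.

    Assumption: terms are compared case-insensitively.
    """
    ignored = {term.lower() for term in ignore_list}
    return [span for span in spans if span.text.lower() not in ignored]
```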

Version 1.0.1 – October 19, 2019

  • Tweak: Changed API HTTP response message when Philter is initializing.
  • Tweak: API endpoint /api/replacements returns HTTP 503 Service Unavailable when the replacement store is not enabled.
  • Tweak: Updated how identified spans are located.

Version 1.0.0 – October 7, 2019

  • Initial release.
  • Known issue: Philter’s API /api/filter endpoint will return HTTP 500 if Philter has not finished initializing. This will be made more user-friendly in a later version. As a workaround, use the /api/status endpoint to determine if Philter has finished initializing prior to calling /api/filter.
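
The workaround for the known issue above can be sketched in Python as a polling loop that waits for /api/status before calling /api/filter. This is a minimal sketch under stated assumptions: it treats an HTTP 200 response from /api/status as "finished initializing", which these notes do not confirm, and the function names, attempt count, and delay are illustrative.

```python
import time
import urllib.request
from typing import Callable


def status_is_ready(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET /api/status answers with HTTP 200.

    Assumption: a 200 response means initialization has finished.
    """
    url = base_url.rstrip("/") + "/api/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses.
        return False


def wait_until_ready(
    check: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `attempts` times, sleeping between failed tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A caller would pass something like lambda: status_is_ready(base_url) as the check, with base_url pointing at its own Philter instance, and send requests to /api/filter only once wait_until_ready returns True.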