Filtering API

Philter’s filtering API provides access to Philter’s ability to filter sensitive information from text and to retrieve the health status of Philter.

The curl example commands shown on this page are written assuming Philter has been enabled for SSL. If launched from a cloud marketplace, SSL will be enabled automatically with a self-signed SSL certificate. See the SSL/TLS settings for more information.

Contexts and Document IDs

Each filter request is associated with a context. When the request does not specify one, the context defaults to none. Contexts provide a means of logically grouping your documents during filtering. For example, documents pertaining to one health care provider may be submitted under the context hospital1, and documents pertaining to another health care provider may be submitted under the context hospital2.
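As a rough sketch of how a client might select a context, the helper below builds a filter URL using the c, p, and d query parameters described under Filter Text. The function name filter_url and the base URL are assumptions made for illustration; adjust them to your deployment.

```python
from urllib.parse import urlencode

# Assumed base URL; change the scheme, host, and port to match your deployment.
BASE_URL = "https://localhost:8080"


def filter_url(context=None, profile=None, document_id=None, base_url=BASE_URL):
    """Build an /api/filter URL, omitting any parameter that was not supplied."""
    params = {"c": context, "p": profile, "d": document_id}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url}/api/filter"
    return f"{url}?{query}" if query else url


# One URL per health care provider, following the example contexts above.
print(filter_url(context="hospital1"))
# https://localhost:8080/api/filter?c=hospital1
print(filter_url(context="hospital2", profile="default"))
# https://localhost:8080/api/filter?c=hospital2&p=default
```

Omitting the context entirely leaves the choice to Philter, which then uses the none context.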

Contexts and Consistent Anonymization

The context for each filter request impacts how sensitive information is replaced when found in the text. Consistent anonymization can be enabled at either the context or document level. When enabled at the context level, all instances of a given piece of sensitive information will be replaced consistently by the same value. This allows for maintaining meaning across all documents in the context.
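The idea of context-level consistency can be illustrated with a toy model. This is not Philter's implementation: the class name, the PERSON_n replacement format, and the choice to keep a separate mapping per context are all assumptions made purely to show what "replaced consistently by the same value" means.

```python
import itertools


class ConsistentReplacer:
    """Toy model of context-level consistent replacement (not Philter's code)."""

    def __init__(self):
        self._replacements = {}  # (context, original value) -> replacement
        self._counter = itertools.count(1)

    def replace(self, context, original):
        """Return the same replacement every time for a value within a context."""
        key = (context, original)
        if key not in self._replacements:
            # Hypothetical replacement format, chosen only for this sketch.
            self._replacements[key] = f"PERSON_{next(self._counter)}"
        return self._replacements[key]


replacer = ConsistentReplacer()
first = replacer.replace("hospital1", "John Smith")
second = replacer.replace("hospital1", "John Smith")
print(first == second)  # True: the value is replaced the same way both times
```

Because the same replacement is reused, references to one person still line up across every document submitted under that context.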

Document Identifiers

Each filter request submitted to Philter is associated with a document identifier. If the request does not supply one, Philter assigns one automatically. The document identifier is an alphanumeric value intended to uniquely identify the submitted document, so no two different documents should share the same identifier. The document identifier is returned in the x-document-id header with each filter or explain API response.
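A client will typically want to record the identifier alongside the filtered text. The sketch below reads the x-document-id header from a header object of the kind returned by Python's standard HTTP client; the helper name document_id_from and the sample identifier are hypothetical.

```python
from email.message import Message


def document_id_from(headers):
    """Return the x-document-id header value, or None when it is absent."""
    # Header lookups on Message objects are case-insensitive.
    return headers.get("x-document-id")


# Stand-in for the headers of a real response; the value is made up.
headers = Message()
headers["X-Document-Id"] = "example-document-id"
print(document_id_from(headers))  # example-document-id
```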

Filtering API Methods

Filter Text

POST http://philter:8080/api/filter
This method receives text and removes sensitive information from the text based on the specified filter profile. The text to process must be provided in the body of the request.
Query Parameters

d – string – A document ID that uniquely identifies the text being submitted. Leave empty and Philter will generate a document ID derived from a hash of the submitted text.

p – string – The name of a filter profile to use for filtering. Defaults to default if not provided.

c – string – The filtering context. Defaults to none if not provided.

Headers

Content-Type – string – The value should be set to text/plain or application/pdf.

Responses

200 – The response will consist of the plain-text filtered text. The response will have an x-document-id header containing the document identifier, which is assigned by Philter if none was provided in the request.
Example Requests:

curl -k "https://localhost:8080/api/filter" --data-binary @file.txt -H "Content-Type: text/plain"

curl -k -X POST "https://localhost:8080/api/filter" --data-binary @file.pdf -H "Content-Type: application/pdf" -o redacted.zip
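The plain-text request can also be issued from code. The following is a minimal sketch using only the Python standard library; the function names build_filter_request and send are assumptions, and disabling certificate verification mirrors curl's -k flag, which is only appropriate for a self-signed certificate in a test environment.

```python
import ssl
import urllib.request


def build_filter_request(text, url="https://localhost:8080/api/filter"):
    """Create a POST request carrying plain text for the filter endpoint."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def send(request):
    """Send the request and return (filtered_text, document_id)."""
    # Equivalent of curl -k: skip certificate checks for a self-signed cert.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        body = response.read().decode("utf-8")
        return body, response.headers.get("x-document-id")


# Usage, assuming Philter is reachable at the URL above:
#   filtered, document_id = send(build_filter_request("Some text to filter."))
```

Keeping request construction separate from sending makes it possible to inspect or log the request without contacting the server.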

Filter Text with Explanation

POST http://philter:8080/api/explain
This method receives text and removes sensitive information from the text based on the specified filter profile. The text to be filtered must be provided in the body of the request. This method is identical to /api/filter except the response contains detailed information on how the text was processed. This method can be useful in understanding why (or why not) sensitive information was filtered.
Query Parameters

d – string – A document ID that uniquely identifies the text being submitted. Leave empty and Philter will generate a document ID derived from a hash of the submitted text.

p – string – The name of a filter profile to use for filtering. Defaults to default if not provided.

c – string – The filtering context. Defaults to none if not provided.

Headers

Content-Type – string – The value should be set to text/plain or application/pdf.

Responses

200 – The response contains information about the sensitive information that was found. The appliedSpans section details the sensitive information that was manipulated by Philter. Sensitive information that was identified but not manipulated will be listed under ignoredSpans. A span will be ignored if it fails to meet some condition, is explicitly ignored, or is a smaller part of a larger span.
Example Requests:

curl -k "https://localhost:8080/api/explain" --data-binary @file.txt -H "Content-Type: text/plain"

curl -k -X POST "https://localhost:8080/api/explain" --data-binary @file.pdf -H "Content-Type: application/pdf" -o redacted.zip
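To get a quick sense of what was filtered and what was skipped, a client can count the entries under appliedSpans and ignoredSpans. The sketch below assumes the response body is JSON and searches for those two keys wherever they appear, since the exact nesting is not described here; the helper names and the sample body are hypothetical.

```python
import json


def find_key(node, key):
    """Return the first value stored under `key` anywhere in parsed JSON."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def span_counts(explain_body):
    """Return (applied, ignored) span counts from an explain response body."""
    data = json.loads(explain_body)
    applied = find_key(data, "appliedSpans") or []
    ignored = find_key(data, "ignoredSpans") or []
    return len(applied), len(ignored)


# Hypothetical body; only the two key names come from the description above.
sample = '{"explanation": {"appliedSpans": [{}, {}], "ignoredSpans": [{}]}}'
print(span_counts(sample))  # (2, 1)
```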

Status of Philter

GET http://philter:8080/api/status
This API method returns the status of Philter. This endpoint is well-suited for use by an external application monitoring system to periodically check the status of Philter, such as for health checks when deployed behind an AWS EC2 load balancer. There are no parameters required for this request.
Parameters

None

Responses

200 – Indicates that Philter is healthy and prepared to accept requests.

503 – Indicates that Philter is either currently initializing or has encountered an error. If this response persists, check Philter’s log for more information.

Example Request:

curl -k https://localhost:8080/api/status
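A monitoring script might interpret the two documented status codes as follows. This is a sketch under stated assumptions: the function names and the returned labels are illustrative, and certificate verification is disabled in the same way as curl's -k flag.

```python
import ssl
import urllib.error
import urllib.request


def describe_status(status_code):
    """Map a status endpoint response code to a short description."""
    if status_code == 200:
        return "healthy"
    if status_code == 503:
        return "initializing or error"
    return "unexpected"


def check_status(url="https://localhost:8080/api/status", timeout=5):
    """Query the status endpoint and describe the result."""
    # Equivalent of curl -k: skip certificate checks for a self-signed cert.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
            code = response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for responses such as 503.
        code = error.code
    return describe_status(code)


# Usage, assuming Philter is reachable at the URL above:
#   print(check_status())
```

Connection failures, such as the service not listening at all, surface as urllib.error.URLError and are left for the caller to handle.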