Filtering PHI and PII

The filtering methods available depend on the edition of Philter you are using. Both the Standard and Enterprise editions offer filtering via the REST API, which is useful for batch processing or when Philter needs to be integrated into an existing process or workflow. The Enterprise edition additionally offers filtering of streaming text, which uses Apache Kafka.

Methods of Filtering

Filtering via the REST API

This method of filtering allows you to submit text to Philter and receive the filtered text in the response. It is the most flexible method because it allows Philter to be integrated with virtually any existing system or process. See the details of the API or the Quick Start for an example.
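As a rough illustration of the request/response flow described above, the following Python sketch submits text over HTTP and returns the response body as the filtered text. The endpoint path `/api/filter`, the `c` query parameter for the context, the `text/plain` content type, and the host and port in the usage note are assumptions made for this example; consult the API documentation for the actual names. The function names are hypothetical.

```python
import urllib.parse
import urllib.request


def build_filter_request(base_url: str, text: str, context: str = "default") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to filter."""
    # ASSUMPTION: the "/api/filter" path and the "c" (context) query parameter
    # are placeholders for illustration; check the API documentation.
    query = urllib.parse.urlencode({"c": context})
    url = f"{base_url.rstrip('/')}/api/filter?{query}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def filter_text(base_url: str, text: str, context: str = "default") -> str:
    """Send the text to Philter and return the response body (the filtered text)."""
    request = build_filter_request(base_url, text, context)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

A call such as `filter_text("https://localhost:8080", "Some text to filter.", context="notes")` would then return whatever filtered text the server responds with. Because the context is passed per call, each request can use a different one.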

Filtering via Streaming

This method of filtering is only available in Philter Enterprise Edition.

This method of filtering allows Philter to subscribe to an Apache Kafka topic. Philter consumes text from the topic, filters it, and places the filtered text back onto Apache Kafka in a different topic. Philter’s streaming is designed to run on a YARN cluster.

This method is the most performant, but it is more restrictive than filtering via the REST API. When streaming, each Kafka topic is treated as its own context, whereas with the REST API each filter request can set its own context.

When filtering via streaming, you can choose the format of the messages that Philter publishes to Kafka: either the filtered text alone or JSON. When JSON is chosen, the message has the following structure:

{
  "filteredText": "The filtered text will be here.",
  "context": "The context (incoming Kafka topic name) will be here.",
  "documentId": "The assigned document ID will be here."
}
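A downstream consumer of the JSON format might decode each message along the following lines. This is a minimal sketch, assuming each Kafka message value is a UTF-8 encoded JSON object containing the three fields shown above; the `FilteredMessage` class and `parse_filtered_message` function are hypothetical names introduced for this example, and reading the message from Kafka itself is left to whichever client library you use.

```python
import json
from dataclasses import dataclass


@dataclass
class FilteredMessage:
    """One filtered document as published in the JSON message format."""
    filtered_text: str
    context: str       # name of the incoming Kafka topic
    document_id: str


def parse_filtered_message(raw: bytes) -> FilteredMessage:
    """Decode a raw message value into a FilteredMessage.

    ASSUMPTION: the value is a UTF-8 JSON object with the keys
    "filteredText", "context", and "documentId".
    """
    payload = json.loads(raw.decode("utf-8"))
    return FilteredMessage(
        filtered_text=payload["filteredText"],
        context=payload["context"],
        document_id=payload["documentId"],
    )
```

A missing field raises `KeyError`, which makes malformed messages visible instead of silently passing them along.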