Philter offers two modes of operation for filtering text. The first is Philter’s REST API, which is useful for batch processing or when Philter needs to be integrated into an existing process or workflow. The second is streaming, in which Philter uses Apache Kafka to process streaming text.
Filtering via the REST API
This method lets you submit text to Philter and receive the filtered text in the response. It is the most flexible method because it allows Philter to be integrated with virtually any existing system or process. However, its performance will likely not be as good as that of the streaming method described below.
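The request/response exchange described here can be sketched in Python using only the standard library. This is a minimal illustration, not a reference client: the base URL, the /api/filter path, the "c" context query parameter, and the helper names build_filter_request and filter_text are assumptions made for the example, so check them against your Philter deployment before use.

```python
import urllib.parse
import urllib.request

# Assumed values: the host, port, endpoint path, and the "c" (context)
# query parameter are placeholders for illustration only.
PHILTER_URL = "https://localhost:8080"


def build_filter_request(text, context="default", base_url=PHILTER_URL):
    """Build a POST request that carries plain text to be filtered."""
    query = urllib.parse.urlencode({"c": context})
    return urllib.request.Request(
        url=f"{base_url}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def filter_text(text, context="default", base_url=PHILTER_URL):
    """Send text to Philter and return the response body (the filtered text)."""
    request = build_filter_request(text, context, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it possible to inspect the URL, body, and headers without a running Philter instance. Because each call passes its own context, different requests can be filtered under different contexts.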
Filtering via Streaming
This method of filtering allows Philter to subscribe to an Apache Kafka topic. Philter consumes text from the topic, filters it, and publishes the filtered text to a different Kafka topic. Philter’s streaming is designed to run on a YARN cluster.
This method is the most performant, but it is more restrictive than filtering via the REST API. When streaming, each Kafka topic is treated as its own context, in contrast to the REST API, where the context can be set individually for each filter request.
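The consume, filter, and publish flow, together with the one-context-per-topic rule, can be illustrated with a small Python sketch. It is a conceptual model only: the in-memory broker stands in for Apache Kafka, redact_digits is a hypothetical stand-in for Philter's filtering, the topic names are invented, and using the topic name as the context identifier is an assumption made for the example.

```python
from collections import defaultdict, deque


def redact_digits(text, context):
    """Hypothetical stand-in for Philter's filter: masks every digit."""
    return "".join("*" if ch.isdigit() else ch for ch in text)


def process_topic(broker, source_topic, sink_topic, filter_fn):
    """Drain source_topic, filter each message, and publish to sink_topic.

    One context is chosen per topic (here, simply the topic name), so every
    message read from the same topic is filtered under the same context.
    """
    context = source_topic
    processed = 0
    while broker[source_topic]:
        message = broker[source_topic].popleft()
        broker[sink_topic].append(filter_fn(message, context))
        processed += 1
    return processed


# In-memory stand-in for Kafka topics: topic name -> queue of messages.
broker = defaultdict(deque)
broker["notes-raw"].extend(["Call 555-0100", "Room 12"])
process_topic(broker, "notes-raw", "notes-filtered", redact_digits)
```

In a real deployment the queues would be replaced by a Kafka consumer and producer. The point of the sketch is that the context is fixed once per topic rather than chosen per message, which is the restriction relative to the REST API.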