A Philter Streaming Quick Start

This article shows how to use Philter on streaming text. We will (quickly!) deploy a simple Apache Kafka cluster, populate a topic with sample text, and consume the streaming text using Philter. The guide is not tied to any particular cloud and should work in any deployment environment.

Since this is a “quick start” guide, these instructions are meant to demonstrate Philter’s capabilities in a demo or test environment. They are not intended for deploying Philter in production.

Get and Start Kafka

To begin, we will download Kafka 2.2.0 (the commands below assume the Scala 2.12 build, kafka_2.12-2.2.0.tgz). Once it is downloaded, we will extract it and change into its directory.

tar xzvf kafka_2.12-2.2.0.tgz
cd kafka_2.12-2.2.0
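
The download step itself is not shown above. A minimal sketch, assuming curl is installed and that the 2.2.0 release is still hosted on the Apache archive at the URL built below (the URL layout and the fetch_kafka helper name are assumptions, not part of the original instructions); it would be run before the extract commands:

```shell
#!/usr/bin/env bash
# Assumption: Kafka 2.2.0 (Scala 2.12 build) is available from the Apache archive.
KAFKA_VERSION="2.2.0"
SCALA_VERSION="2.12"
KAFKA_TGZ="kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz"
KAFKA_URL="https://archive.apache.org/dist/kafka/${KAFKA_VERSION}/${KAFKA_TGZ}"

# Hypothetical helper: download the archive only if it is not already on disk.
# -f fails on HTTP errors, -L follows redirects, -O keeps the remote file name.
fetch_kafka() {
  [ -f "$KAFKA_TGZ" ] || curl -fLO "$KAFKA_URL"
}

echo "Kafka archive URL: $KAFKA_URL"
```

Calling fetch_kafka leaves kafka_2.12-2.2.0.tgz in the current directory, which is the file name the extract command expects.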

We will now start ZooKeeper (used for Kafka cluster management). The command runs in the foreground, so leave this terminal open:

bin/zookeeper-server-start.sh config/zookeeper.properties

Now, in a second terminal opened in the same directory, we can start Kafka:

bin/kafka-server-start.sh config/server.properties

Kafka and ZooKeeper should both be running now.
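
One quick way to confirm this is to check that both services accept TCP connections. The sketch below assumes bash (it relies on bash's /dev/tcp redirection) and the default ports: 2181 for ZooKeeper, as set in config/zookeeper.properties, and 9092 for Kafka. The port_open and report helpers are illustrative names, not part of Kafka.

```shell
#!/usr/bin/env bash
# Returns 0 if something accepts TCP connections on host $1, port $2.
# The connection is opened in a subshell so the descriptor closes immediately.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Print a one-line status for a named service on localhost.
report() {
  if port_open localhost "$2"; then
    echo "$1 is listening on port $2"
  else
    echo "$1 is NOT listening on port $2"
  fi
}

report ZooKeeper 2181   # assumed default clientPort
report Kafka 9092       # assumed default broker port
```

An open port only shows that a process is listening, not that the broker has finished starting, so treat it as a rough check.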

Create a Kafka Topic

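With Kafka and ZooKeeper running, we can create a new topic to hold the text that we want to process with Philter. We will name the topic “ingest” and give it a single partition and a replication factor of 1, which is enough for a single-broker test setup.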

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic ingest
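
To confirm that the topic was created, the broker's topic list can be checked. This is a minimal sketch, assuming it is run from the Kafka directory (or with KAFKA_HOME pointing at it) and that the broker is reachable at localhost:9092; the topic_exists helper and the KAFKA_HOME and BOOTSTRAP variables are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
KAFKA_HOME="${KAFKA_HOME:-.}"                 # assumed: the extracted Kafka directory
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"      # assumed: the broker address used above

# Succeeds if topic $1 appears as an exact line in the broker's topic list.
topic_exists() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --list --bootstrap-server "$BOOTSTRAP" \
    | grep -qxF "$1"
}

if topic_exists ingest; then
  echo "Topic ingest exists"
else
  echo "Topic ingest was not found"
fi
```

Running bin/kafka-topics.sh with --describe instead of --list, plus --topic ingest, prints the partition count and replication factor of the topic.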

Push Text to the Kafka Topic

Now we can push text to our new “ingest” Kafka topic.

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic ingest < file.txt

The console producer reads file.txt line by line and publishes each line as a separate message to the “ingest” topic. The text can now be processed by Philter.
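
Before moving on, it can be useful to read a few messages back and confirm that the text actually reached the topic. The following is a sketch using Kafka's console consumer, assuming the same directory and broker address as above; the read_back helper, the message count of 5, and the 10-second timeout are illustrative choices.

```shell
#!/usr/bin/env bash
KAFKA_HOME="${KAFKA_HOME:-.}"                 # assumed: the extracted Kafka directory
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"      # assumed: the broker address used above

# Print up to $2 messages from topic $1, starting at the earliest offset.
# --timeout-ms makes the consumer exit instead of waiting forever on an empty topic.
read_back() {
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --bootstrap-server "$BOOTSTRAP" \
    --topic "$1" \
    --from-beginning \
    --max-messages "$2" \
    --timeout-ms 10000
}

read_back ingest 5 || echo "Could not read from topic ingest" >&2
```

Each line printed corresponds to one line of file.txt. Reading with the console consumer does not remove messages from the topic, so they remain available for Philter.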

Consume the Text using Philter