A Philter Streaming Quick Start
This article describes how to use Philter on streaming text. We will (quickly!) deploy a simple Apache Kafka cluster, populate a topic with sample text, and consume the streaming text using Philter. The guide is not tied to any particular cloud or deployment environment.
Get and Start Kafka
To begin, we will download Kafka 2.2.0 (the kafka_2.12-2.2.0.tgz archive, built for Scala 2.12). Once it is downloaded, we will extract it and change into its directory.
gunzip -c kafka_2.12-2.2.0.tgz | tar xvf -
cd kafka_2.12-2.2.0
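We will now start ZooKeeper, which Kafka uses for cluster management. The script runs in the foreground, so leave it running:
bin/zookeeper-server-start.sh config/zookeeper.properties
Now we can start Kafka from a second terminal in the same directory:
bin/kafka-server-start.sh config/server.properties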
Kafka and ZooKeeper should both be running now.
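One way to confirm that both processes are up is to ask ZooKeeper which brokers have registered with it. The sketch below is a suggestion rather than part of the original walkthrough. It assumes the stock configuration files, in which ZooKeeper listens on port 2181 and the single broker has id 0, and it assumes the command is run from the Kafka directory.

```shell
# Assumption: default ZooKeeper client port from config/zookeeper.properties.
ZK_ADDR="localhost:2181"

if [ -x bin/zookeeper-shell.sh ]; then
  # List the ids of the brokers registered in ZooKeeper.
  # With the default config/server.properties the list should include 0.
  bin/zookeeper-shell.sh "$ZK_ADDR" ls /brokers/ids
else
  echo "bin/zookeeper-shell.sh not found; run this from the Kafka directory" >&2
fi
```

An empty list, or a connection error, suggests that Kafka or ZooKeeper has not finished starting.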
Create a Kafka Topic
With Kafka and ZooKeeper running, we can create a new topic to hold text that we want to process with Philter. We will call our topic “ingest.”
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic ingest
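To check that the topic was created as intended, the same tool can describe it. This is a minimal sketch that assumes the broker address and topic name used above and that it is run from the Kafka directory.

```shell
# Values taken from the create command above.
TOPIC="ingest"
BROKER="localhost:9092"

if [ -x bin/kafka-topics.sh ]; then
  # Print the partition count, replication factor, and partition leader.
  bin/kafka-topics.sh --describe --bootstrap-server "$BROKER" --topic "$TOPIC"
else
  echo "bin/kafka-topics.sh not found; run this from the Kafka directory" >&2
fi
```

The output should report one partition and a replication factor of 1, matching the values passed to --create.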
Push Text to the Kafka Topic
Now we can push text to our new “ingest” Kafka topic.
cat file.txt | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic ingest
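This command sends the contents of file.txt to the “ingest” topic. The console producer reads its input line by line, so each line of the file becomes a separate Kafka message. The text can now be processed by Philter.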
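Before connecting Philter, it can be useful to confirm that the text actually reached the topic. The sketch below reads the topic back with the console consumer that ships with Kafka. It assumes the same broker address and topic name as above, and the 10-second idle timeout is an arbitrary choice.

```shell
# Values taken from the producer command above.
TOPIC="ingest"
BROKER="localhost:9092"

if [ -x bin/kafka-console-consumer.sh ]; then
  # --from-beginning replays messages already stored in the topic.
  # --timeout-ms makes the consumer exit after 10 seconds without a new message.
  bin/kafka-console-consumer.sh --bootstrap-server "$BROKER" --topic "$TOPIC" \
    --from-beginning --timeout-ms 10000
else
  echo "bin/kafka-console-consumer.sh not found; run this from the Kafka directory" >&2
fi
```

Each line of file.txt should be printed as its own message. Reading the topic this way does not remove the messages, so they remain available for Philter to consume.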