Monitoring Apache NiFi with Datadog

One of the most common requirements when using Apache NiFi is a means to adequately monitor the NiFi cluster. Insights into a NiFi cluster’s use of memory, disk space, CPU, and NiFi-level metrics are crucial to operating and optimizing data flows. NiFi’s Reporting Tasks provide the capability to publish metrics to external services.

Datadog is a hosted service for collecting, visualizing, and alerting on metrics. With Apache NiFi’s built-in DataDogReportingTask, we can leverage Datadog to monitor our NiFi instances. In this blog post we are running a three-node NiFi cluster in Amazon EC2, with each node on its own EC2 instance.

Note that the intention of this blog post is not to promote Datadog but instead to demonstrate one potential platform for monitoring Apache NiFi.

If you don’t already have a Datadog account, the first step is to create one. Once that is done, install the Datadog Agent on your NiFi hosts. The command will look similar to the following, except that it will contain your own API key. Here we are installing the Agent on an Ubuntu instance. If you only want to monitor NiFi-level metrics you can skip this step, but we find the host-level metrics to be valuable as well.

DD_API_KEY=xxxxxxxxxxxxxxx bash -c "$(curl -L https://raw.githubusercontent.com/DataDog/datadog-agent/master/cmd/agent/install_script.sh)"

This command downloads and installs the Datadog Agent on the system. The Agent service is started right away and will also start automatically at system boot.
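
To confirm that the Agent actually came up after installation, a quick check along the following lines can help. This is a minimal sketch, assuming a systemd-based Ubuntu host and Agent version 6 or later, where the service is named datadog-agent. The function name agent_is_running is our own and not part of Datadog.

```shell
# Hypothetical helper: succeeds if systemd reports the datadog-agent
# service as active. Assumes a systemd host and the default service name.
agent_is_running() {
  systemctl is-active --quiet datadog-agent
}

# Example use on a NiFi host:
#   agent_is_running && echo "Datadog Agent is running"
#   sudo datadog-agent status   # detailed report from the Agent itself
```

If the check fails, the output of sudo datadog-agent status is usually the best place to start looking.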

Creating a Reporting Task

The next step is to create a DataDogReportingTask in NiFi. In NiFi’s Controller Settings, under Reporting Tasks, click to add a new reporting task and select DataDogReportingTask. In the reporting task’s settings, enter your Datadog API key and change the other values as desired.

By default, NiFi reporting tasks run every 5 minutes. You can change this interval on the Settings tab under “Run Schedule” if needed. Click Apply to save the reporting task.

The reporting task will now be listed. Click the Play icon to start the reporting task. Apache NiFi will now send metrics to Datadog every 5 minutes (unless you changed the Run Schedule value to a different interval).
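
If you prefer the command line to the UI, the state of the reporting task can also be checked through NiFi’s REST API. The sketch below assumes an unsecured NiFi node listening on http://localhost:8080; a secured cluster would additionally need a token or client certificate, which is not shown. The function name list_reporting_tasks is our own.

```shell
# Hypothetical helper: fetch the reporting tasks known to a NiFi node as JSON.
# Assumes an unsecured NiFi at http://localhost:8080 unless a base URL is given.
list_reporting_tasks() {
  curl -fsS "${1:-http://localhost:8080}/nifi-api/flow/reporting-tasks"
}

# Example use:
#   list_reporting_tasks
#   list_reporting_tasks http://your-nifi-host:8080
```

The JSON that comes back should include an entry for the DataDogReportingTask along with its current state, so you can confirm that it is running.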

Exploring NiFi Metrics in Datadog

We can now go to Datadog and explore the metrics from NiFi. Open the Metrics Explorer and enter “nifi” (without the quotes); the available NiFi metrics will be displayed. These metrics can be included in graphs and other visuals in Datadog dashboards. (If you’re interested, the names of the metrics originate in MetricNames.java.)
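
The same check can be scripted against Datadog’s HTTP API, which is handy if you want to verify that metrics are arriving without opening the UI. The sketch below is written under a few assumptions: your account lives on the api.datadoghq.com site, the reporting task uses a metrics prefix of “nifi”, and both an API key and an application key are exported as DD_API_KEY and DD_APP_KEY (the second variable name is our choice). The function names are our own.

```shell
# Hypothetical helper: ask Datadog which metrics reported in the last hour.
# Requires DD_API_KEY and DD_APP_KEY to be set in the environment.
fetch_active_metrics() {
  from=$(( $(date +%s) - 3600 ))
  curl -fsS -G "https://api.datadoghq.com/api/v1/metrics" \
    --data-urlencode "from=${from}" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "DD-APPLICATION-KEY: ${DD_APP_KEY}"
}

# Read the JSON response on stdin and print the metric names that start
# with "nifi", one per line. Plain grep keeps the sketch dependency-free.
filter_nifi_metrics() {
  grep -o '"nifi[^"]*"' | tr -d '"'
}

# Example use:
#   fetch_active_metrics | filter_nifi_metrics
```

An empty result most likely means the reporting task has not run yet, so give it at least one Run Schedule interval before investigating further.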

Creating a Datadog Dashboard for Apache NiFi

To build a dashboard, create a new one in Datadog and add NiFi metrics to it. For example, in our dashboard we added line graphs to show the CPU usage and JVM heap usage for each of the NiFi nodes.

The DataDogReportingTask provides a convenient yet powerful way to publish Apache NiFi metrics. The Datadog dashboards can be configured to provide a comprehensive look into the performance of your Apache NiFi cluster.

What we have shown here is only the tip of the iceberg when it comes to building a comprehensive monitoring dashboard. With NiFi’s metrics and Datadog’s flexibility, how the dashboard is put together is completely up to you and your needs.

Need more help?

We provide consulting services around AWS and big-data tools like Apache NiFi. Get in touch by sending us a message. We look forward to hearing from you!