Some First Steps for a New NiFi Cluster

After installing Apache NiFi, there are a few steps you might want to take before making your cluster available for prime time. None of these steps is required, so make sure each one suits your use case before implementing it.

Lowering NiFi’s Log File Retention Properties

By default, Apache NiFi’s nifi-app.log files are capped at 100 MB per log file and NiFi retains 30 log files. If the maximum is reached, that comes out to 3 GB of disk space used by nifi-app.log files. That’s not a whole lot, but in some cases you may need the extra disk space. Or, if an external service is already managing your NiFi log files, you don’t need them hanging around any longer than necessary. To lower the thresholds, open NiFi’s conf/logback.xml file. Under the appender configuration for nifi-app.log you will see the maxFileSize and maxHistory values. Lower these values, save the file, and restart NiFi to reduce the disk space used by log files. Conversely, if you want to keep more log files, increase those limits.
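As a rough illustration of the edit described above, the following sketch computes the worst-case disk usage and lowers the two values with sed. It runs against a throwaway sample file, not a real NiFi install. The function names, the appender name APP_FILE, and the sample XML are assumptions made for the example, so compare them with your own conf/logback.xml before adapting it.

```shell
#!/usr/bin/env bash
set -eu

# Worst-case disk use in MB for one appender: per-file cap times files retained.
worst_case_mb() {
  echo $(( $1 * $2 ))
}

# Rewrite maxFileSize/maxHistory inside the APP_FILE appender only
# (APP_FILE is an assumed appender name; other appenders are left alone).
# $1 = path to the file, $2 = new size (e.g. 50MB), $3 = new history count
lower_retention() {
  local file="$1" size="$2" history="$3"
  sed -e '/<appender name="APP_FILE"/,/<\/appender>/ {
    s|<maxFileSize>[^<]*</maxFileSize>|<maxFileSize>'"$size"'</maxFileSize>|
    s|<maxHistory>[^<]*</maxHistory>|<maxHistory>'"$history"'</maxHistory>|
  }' "$file" > "$file.new"
  mv "$file.new" "$file"
}

# Demo on a simplified, made-up sample file.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<appender name="APP_FILE">
  <maxFileSize>100MB</maxFileSize>
  <maxHistory>30</maxHistory>
</appender>
<appender name="USER_FILE">
  <maxHistory>30</maxHistory>
</appender>
EOF

worst_case_mb 100 30               # the defaults described above
lower_retention "$sample" 50MB 10  # example values only
worst_case_mb 50 10                # worst case after lowering
cat "$sample"
```

On a real node you would point lower_retention at a backup copy of conf/logback.xml first, review the result, and then restart NiFi.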

You can also change the handling of the nifi-user.log and nifi-bootstrap.log files here. Those files typically don’t grow as fast as nifi-app.log, so they can often be left as-is. Note that in a cluster you will need to make these changes on each node.

Install NiFi as a Service

Running NiFi as a service lets it start automatically when the system starts and makes it easier to start and stop NiFi. Note that in a cluster you will need to do this on each node. To install NiFi as a system service, go to NiFi’s bin/ directory and run the following commands (on Ubuntu):

sudo ./nifi.sh install
sudo update-rc.d nifi defaults

You can now control the NiFi service with the commands:

sudo systemctl status nifi
sudo systemctl start nifi
sudo systemctl stop nifi
sudo systemctl restart nifi
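
Because the install has to be repeated on every node, a small loop can help keep the nodes consistent. The sketch below is a dry run that only prints the ssh command for each node. The node names, the /opt/nifi path, and the install_command helper are hypothetical, and it assumes you have ssh access with sudo rights on each node.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical values: replace with your own node names and install path.
NODES="nifi-node-1 nifi-node-2 nifi-node-3"
NIFI_HOME="/opt/nifi"

# Print the ssh command that would run the install steps on one node.
# -t requests a terminal so sudo can prompt for a password if it needs to.
install_command() {
  local node="$1"
  printf 'ssh -t %s "cd %s/bin && sudo ./nifi.sh install && sudo update-rc.d nifi defaults"\n' \
    "$node" "$NIFI_HOME"
}

# Dry run: show the commands rather than executing them.
for node in $NODES; do
  install_command "$node"
done
```

Once the printed commands look right for your environment, they can be run by hand, one node at a time.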

If you are running NiFi in a container, add the install commands to your Dockerfile.

Install (and use!) the NiFi Registry

The NiFi Registry provides the ability to put your flows under source control. It has quickly become an invaluable tool for NiFi. The NiFi Registry should be installed outside of your cluster but accessible to your cluster. The NiFi Registry Documentation contains instructions on how to install it, create buckets, and connect it to your NiFi cluster.

By default, the NiFi Registry listens on port 18080, so be sure your firewall rules allow your NiFi nodes to reach it. Remember, you only need a single installation of the NiFi Registry per NiFi cluster. If you are using infrastructure-as-code to deploy your NiFi cluster, keep the scripts that deploy the NiFi Registry separate from the cluster scripts. You don’t want the NiFi Registry’s lifecycle to be tied to the NiFi cluster’s lifecycle. Keeping them separate lets you create and tear down NiFi clusters without affecting your NiFi Registry. It also lets you share one NiFi Registry between multiple clusters if you need to.
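
To confirm that the firewall rules mentioned above are in place, you can probe the Registry from one of the NiFi nodes. This is a minimal sketch that assumes an unsecured (plain HTTP) Registry serving its UI under /nifi-registry/. The hostname registry.example.internal and both helper functions are placeholders invented for the example.

```shell
#!/usr/bin/env bash
set -u

# Build the Registry URL; 18080 is the default port mentioned above.
# The /nifi-registry/ path and plain HTTP are assumptions for this sketch.
registry_url() {
  local host="$1" port="${2:-18080}"
  echo "http://${host}:${port}/nifi-registry/"
}

# Return 0 if the Registry answers over HTTP within 5 seconds.
check_registry() {
  curl --silent --fail --max-time 5 --output /dev/null "$(registry_url "$@")"
}

# Show the URL that would be probed for a placeholder hostname.
registry_url registry.example.internal

# From a NiFi node, with your real Registry hostname:
# check_registry registry.example.internal && echo "reachable" || echo "blocked or down"
```

A failure here does not say whether the firewall or the Registry itself is at fault, so check that the Registry service is running before changing any rules.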

Although using the NiFi Registry is not required to build a data flow in NiFi, your life will be much, much easier if you do use it, especially in an environment where multiple users will be manipulating the data flow.
