Monitoring Apache NiFi’s Logs with AWS CloudWatch

It’s inevitable that at some point while running Apache NiFi on a single node or as a cluster you will want to see what’s in NiFi’s log and maybe even be alerted when certain logged events are found. Maybe you are debugging your own processor or just looking for more insight into your data flow. With the AWS CloudWatch Logs agent we can send NiFi’s log files to CloudWatch for aggregation, storage, and alerting.

Creating an IAM Role and Policy

Our goal is to install the CloudWatch Logs agent. (We’ll mostly be following this Quick Start.) Because the agent needs permission to save the logs, we will first create a new IAM role for our NiFi instances in EC2. (If your NiFi instances already have a role attached, you can just edit that role.) After creating a new role, add the following JSON policy to it:

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
            "logs:DescribeLogStreams"
         ],
         "Resource":[
            "arn:aws:logs:*:*:*"
         ]
      }
   ]
}

Click Review Policy, give the policy a name such as cloud-watch-logs, and click Create Policy. This policy can now be attached to an IAM role. Click through, give your role a name such as nifi-instance-role, and click Create Role. Now we can attach this role to our NiFi instances.
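If you would rather script these console steps, the following is a rough sketch using the AWS CLI. It assumes the policy above has been saved as cloud-watch-logs.json in the current directory; that file name, ec2-trust.json, and the function names are my own placeholders, and the instance profile simply reuses the role name. Treat it as a starting point rather than a definitive procedure.

```shell
#!/usr/bin/env bash
# Sketch only: policy and role names come from the walkthrough; file names are assumptions.

# Write the trust policy that lets EC2 instances assume the role.
write_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the policy, role, and instance profile, then attach the profile to one instance.
# The body runs in a subshell so that `set -e` stops at the first failing call.
create_nifi_role() (
  set -e
  account_id="$1"   # 12-digit AWS account ID
  instance_id="$2"  # EC2 instance running NiFi

  write_trust_policy ec2-trust.json

  aws iam create-policy \
    --policy-name cloud-watch-logs \
    --policy-document file://cloud-watch-logs.json
  aws iam create-role \
    --role-name nifi-instance-role \
    --assume-role-policy-document file://ec2-trust.json
  aws iam attach-role-policy \
    --role-name nifi-instance-role \
    --policy-arn "arn:aws:iam::${account_id}:policy/cloud-watch-logs"
  aws iam create-instance-profile \
    --instance-profile-name nifi-instance-role
  aws iam add-role-to-instance-profile \
    --instance-profile-name nifi-instance-role \
    --role-name nifi-instance-role
  aws ec2 associate-iam-instance-profile \
    --instance-id "$instance_id" \
    --iam-instance-profile Name=nifi-instance-role
)
```

Call it as create_nifi_role followed by your account ID and the instance ID, once per NiFi instance. On later runs the IAM calls will fail because the resources already exist, so for additional instances you only need the final associate-iam-instance-profile command.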

Install CloudWatch Logs Agent

Now that our NiFi EC2 instances have access to store the logs in CloudWatch Logs we can install the CloudWatch Logs agent on the instance. Because we are running Ubuntu and not Amazon Linux we’ll install the agent manually.

curl https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py -O

sudo python ./awslogs-agent-setup.py --region us-east-1

If it gives you an error that the command python cannot be found you probably don’t have python (2) installed. You can quickly install it:

sudo apt-get install python

When prompted for an AWS Access Key ID and AWS Secret Access Key, press enter to skip both. If your instances are running in a region other than us-east-1, enter it now. Press enter to skip the default output format. The next prompt asks for the location of the syslog. You can press enter to accept the default of /var/log/syslog for both prompts. For the log stream name, I recommend using the EC2 instance ID, which is the default option. Next, select the log event timestamp format. Again, the first option is recommended, so press enter to accept it or make a different selection. Next, the agent asks where to start uploading. The first option will upload the whole log file, while the second option will start at the end of the file. For completeness, I recommend the first option, so press enter.

When asked if there are more log files to configure, press enter for yes. Now we will specify NiFi’s application log. Our NiFi is installed at /opt/nifi/, so replace /opt/nifi/ with your NiFi directory in the responses below.

Path of log file to upload: /opt/nifi/logs/nifi-app.log
Destination Log Group Name: /opt/nifi/logs/nifi-app.log
Choose Log Stream name: 1. Use EC2 instance id
Choose Log Event timestamp format: 1. %b %d %H:%M:%S (Dec 31 23:59:59)
Choose initial position of upload: 1. From start of file.

Repeat these steps to add any other log files, such as nifi-bootstrap.log and nifi-user.log. For convenience, the relevant contents of my /var/awslogs/etc/awslogs.conf file are below:

[/var/log/syslog]
datetime_format = %b %d %H:%M:%S
file = /var/log/syslog
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /var/log/syslog

[/opt/nifi/logs/nifi-app.log]
datetime_format = %b %d %H:%M:%S
file = /opt/nifi/logs/nifi-app.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /opt/nifi/logs/nifi-app.log

[/opt/nifi/logs/nifi-bootstrap.log]
datetime_format = %b %d %H:%M:%S
file = /opt/nifi/logs/nifi-bootstrap.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /opt/nifi/logs/nifi-bootstrap.log

[/opt/nifi/logs/nifi-user.log]
datetime_format = %b %d %H:%M:%S
file = /opt/nifi/logs/nifi-user.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /opt/nifi/logs/nifi-user.log
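Since the three NiFi stanzas differ only in the file path, they can be generated rather than typed. The sketch below prints one stanza per log file using the same settings as above. The awslogs_stanza function and the NIFI_HOME variable are names I made up for illustration. NIFI_HOME defaults to /opt/nifi.

```shell
#!/usr/bin/env bash
# Print an awslogs.conf stanza for one log file, using the settings from this walkthrough.
# The log group is named after the file path, as in the config above.
awslogs_stanza() {
  local file="$1"
  cat <<EOF
[${file}]
datetime_format = %b %d %H:%M:%S
file = ${file}
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = ${file}

EOF
}

# NIFI_HOME is a hypothetical variable; override it if NiFi lives elsewhere.
NIFI_HOME="${NIFI_HOME:-/opt/nifi}"

for name in nifi-app nifi-bootstrap nifi-user; do
  awslogs_stanza "${NIFI_HOME}/logs/${name}.log"
done
```

Redirect the output to a file, review it, and append it to /var/awslogs/etc/awslogs.conf if it looks right.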

After making manual changes to this file be sure to restart the CloudWatch Logs Agent service.

sudo service awslogs restart

With the service configured and restarted, it will now send logs to CloudWatch Logs.

Check Out the Logs!

Navigating back to the AWS Console and going to CloudWatch we can now see our NiFi logs under the Logs section.

Because we selected the EC2 instance ID as the log_stream_name, the logs will be grouped by instance ID. It may be more convenient for you to use a hostname instead of the instance ID.
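The agent's log_stream_name setting accepts a {hostname} variable in place of {instance_id}. As a small sketch, the function below rewrites every stream name in the config file. The function name is my own, and it assumes GNU sed (as shipped on Ubuntu) for the -i flag.

```shell
#!/usr/bin/env bash
# Switch every log stream name from the instance ID to the hostname.
# The config path defaults to the agent's file from this walkthrough.
# Assumes GNU sed, where -i edits the file in place.
use_hostname_streams() {
  local conf="${1:-/var/awslogs/etc/awslogs.conf}"
  sed -i 's/^log_stream_name = {instance_id}$/log_stream_name = {hostname}/' "$conf"
}
```

The config file is owned by root, so run the function from a root shell, then restart the agent with sudo service awslogs restart. Existing streams named after the instance ID stay in the log group, and new events go to a stream named after the host.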

By having all of our NiFi logs aggregated in a single place we no longer have to SSH into each host to look at the log files!
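The same aggregated view is available from the command line. Here is a minimal sketch that searches one log group across all instances with aws logs filter-log-events. The function name and the 60-minute default window are my own choices.

```shell
#!/usr/bin/env bash
# Search one log group, across every instance's stream, for a filter pattern.
# Usage: search_nifi_logs <log-group> <pattern> [minutes-back]
search_nifi_logs() {
  local group="$1" pattern="$2" minutes="${3:-60}"
  # CloudWatch Logs expects the start time in milliseconds since the epoch.
  local start_ms=$(( ($(date +%s) - minutes * 60) * 1000 ))
  aws logs filter-log-events \
    --log-group-name "$group" \
    --filter-pattern "$pattern" \
    --start-time "$start_ms" \
    --query 'events[].[logStreamName,message]' \
    --output text
}
```

For example, search_nifi_logs /opt/nifi/logs/nifi-app.log ERROR 30 prints the stream name (the instance ID here) next to each matching message from the last half hour.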

Create Custom Log Filter

We can now also create custom filters on the logs. For example, to quickly see any error messages, we can create a new Logs Metric Filter with the Filter Pattern ERROR. This will create a metric that counts the lines matching the pattern. If you want the filter to look for something more specific, you can adjust the Filter Pattern as needed.

Click the Assign Metric button to continue.

Now we can name our filter and assign it a value. Click Create Filter. Now we have our metric filter!
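For reference, the same filter can be created with the AWS CLI. This is a sketch. The filter name nifi-error-filter, the metric name NiFiErrorCount, and the namespace NiFi are placeholders I picked, not names from the console steps above.

```shell
#!/usr/bin/env bash
# Create a metric filter that emits 1 for every log event matching ERROR.
# Filter, metric, and namespace names are hypothetical; change them to suit.
create_error_metric_filter() {
  local group="${1:-/opt/nifi/logs/nifi-app.log}"
  aws logs put-metric-filter \
    --log-group-name "$group" \
    --filter-name nifi-error-filter \
    --filter-pattern ERROR \
    --metric-transformations \
      metricName=NiFiErrorCount,metricNamespace=NiFi,metricValue=1,defaultValue=0
}
```

Filter patterns are case-sensitive, so this counts ERROR but not error. Setting defaultValue=0 makes the filter publish zeros when nothing matches, which gives alarms a continuous series to evaluate.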

With this filter we can create alarms that watch for static thresholds or anomalies. For example, we can generate an alarm if more than two ERROR messages are found in the log within a 5-minute period. Alternatively, we can use CloudWatch’s anomaly detection instead of a static value. In this case, CloudWatch models the metric’s expected range, with the width of the band set as a number of standard deviations, and generates an alarm when the metric falls outside that band.
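The static-threshold case (more than two errors in 5 minutes) might look like this from the AWS CLI. It is only a sketch: the alarm name, the metric name NiFiErrorCount, and the namespace NiFi are placeholders and must match whatever you entered when creating the metric filter. The optional argument is the ARN of an SNS topic to notify.

```shell
#!/usr/bin/env bash
# Alarm when the error metric sums to more than 2 within one 5-minute period.
# Alarm, metric, and namespace names are hypothetical placeholders.
# Usage: create_error_alarm [sns-topic-arn]
create_error_alarm() {
  local sns_topic_arn="${1:-}"
  local args=(
    --alarm-name nifi-error-alarm
    --namespace NiFi
    --metric-name NiFiErrorCount
    --statistic Sum
    --period 300
    --evaluation-periods 1
    --threshold 2
    --comparison-operator GreaterThanThreshold
    --treat-missing-data notBreaching
  )
  # Only add a notification target when one was supplied.
  if [ -n "$sns_topic_arn" ]; then
    args+=(--alarm-actions "$sns_topic_arn")
  fi
  aws cloudwatch put-metric-alarm "${args[@]}"
}
```

Treating missing data as notBreaching keeps the alarm in the OK state during quiet periods when the filter publishes no data points.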

Monitoring for ERROR messages in the log is a useful, if trivial, example, but I think it shows the value of using CloudWatch Logs to capture NiFi’s logs and of building custom metrics and alarms on them.

Need more help?

We provide consulting services around AWS and big-data tools like Apache NiFi. Get in touch by sending us a message. We look forward to hearing from you!

Posted by / July 30, 2019