Using Nagios Core and NRPE to monitor remote Linux hosts
Prerequisites:
This tutorial requires an existing Nagios server that is up and running, and root privileges on both the Nagios server and the remote Linux host. Please follow this blog for its setup.
Most system administrators write custom shell scripts for basic monitoring that send an email when a service crosses a defined threshold, but those scripts can’t provide complete insight into the host being monitored. For that, you need to deploy a robust monitoring system that provides features such as comprehensive monitoring, a notification system, reporting, escalations, and event handlers for automation.
This post picks up from that setup and covers adding remote Linux hosts to the Nagios server for monitoring using the NRPE daemon. It doesn’t cover monitoring Windows hosts or network devices.
What is NRPE?
NRPE (Nagios Remote Plugin Executor) is used to monitor resources on remote Linux hosts, such as CPU load, current users, disk utilization, swap utilization, and memory utilization. To monitor a remote host, we need to install the NRPE agent on it and update the ‘/etc/nagios/nrpe.cfg’ file to allow access from the Nagios server on port 5666. By default, communication between the Nagios server and the NRPE agent is encrypted using SSL.
NRPE Checks
a. Direct checks: The Nagios server directly checks local resources on the remote Linux host, such as load, disk, and swap usage (as shown above).
b. Indirect checks: If the Nagios server cannot access the target remote Linux host, we can install NRPE on another Linux host and use that host to check the remote services (as shown above).
Install NRPE server on Linux Host and NRPE Plugin on Nagios server
1. On Linux Host (Ubuntu)
First, search for the available nagios-nrpe-server packages, then install the nagios-nrpe-server package. To list the files the package installed, use the dpkg -L command, as shown below:
[js]aptitude search nagios-nrpe-server
aptitude install nagios-nrpe-server
dpkg -L nagios-nrpe-server[/js]
2. On Nagios server (Ubuntu)
As mentioned above, install nagios-nrpe-plugin on the Nagios server; it provides the check_nrpe command that the Nagios server uses to query remote hosts.
[js]aptitude search nagios-nrpe-plugin
aptitude install nagios-nrpe-plugin
dpkg -L nagios-nrpe-plugin[/js]
To allow access from the Nagios server, add the Nagios server's IP address to the allowed_hosts directive (line 81) in /etc/nagios/nrpe.cfg on the remote Linux host. After making the change, reload nagios-nrpe-server on the remote Linux host to apply it.
[js]service nagios-nrpe-server reload[/js]
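As a quick sanity check before reloading, the allowed_hosts line can be printed together with its line number. This is only a minimal sketch: the helper name show_allowed_hosts and the address 192.0.2.10 (standing in for the Nagios server IP) are placeholders, and it runs against a scratch file rather than the real /etc/nagios/nrpe.cfg.

```shell
# Print the allowed_hosts directive, prefixed with its line number,
# from the nrpe.cfg file given as the first argument.
show_allowed_hosts() {
    grep -n '^allowed_hosts=' "$1"
}

# Scratch file standing in for /etc/nagios/nrpe.cfg.
# 192.0.2.10 is a placeholder for the Nagios server IP (assumption).
sample_cfg=$(mktemp)
printf '%s\n' '# sample nrpe.cfg' 'allowed_hosts=127.0.0.1,192.0.2.10' > "$sample_cfg"

show_allowed_hosts "$sample_cfg"
```

On the remote host, the same function can be pointed at /etc/nagios/nrpe.cfg instead of the scratch file.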
To validate the change, run the check_nrpe command manually from the Nagios server against the IP address of the remote Linux host. Without a command argument, it reports the version of NRPE installed on the remote host; with one, it runs the named check. Commands to try:
[js]/usr/lib/nagios/plugins/check_nrpe -H 10.1.13.249 [/js]
[js]/usr/lib/nagios/plugins/check_nrpe -H 10.1.13.249 -c check_load [/js]
Note: The -H argument specifies the host, and -c specifies the command to be executed on the remote Linux host.
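Like any Nagios plugin, check_nrpe reports its result through the exit status as well as the printed text: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The helper below, with the hypothetical name nagios_state, is a small sketch for translating that status when running the commands above by hand.

```shell
# Translate a Nagios plugin exit status into its state name.
# Anything other than 0, 1 or 2 is reported as UNKNOWN.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

nagios_state 0
nagios_state 2
```

For example, calling nagios_state with $? immediately after one of the check_nrpe commands prints the state that command returned.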
Example of NRPE Direct check:
By default, nrpe.cfg defines a disk check named check_hda1. In my case the drive is /dev/sda1, so copy the check_hda1 line, update the copy with the correct command name and drive (in my case /dev/sda1), and comment out the old line.
[js] command[check_sda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1 [/js]
Note: In the above command, -w sets the warning threshold, -c sets the critical threshold, and -p selects the partition. We get a warning alert when free space drops below 20% and a critical alert when it drops below 10%.
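The threshold logic can be pictured with a small sketch. This is not check_disk itself, only an illustration of how the 20% and 10% values from the command above would map free space to a state. The function name disk_state is hypothetical, it accepts whole-number percentages only, and it assumes an alert fires once free space falls below the threshold.

```shell
# Classify a free-space percentage using the thresholds from check_sda1.
# Arguments: free percent, warning threshold (default 20),
# critical threshold (default 10). Integers only.
disk_state() {
    free=$1
    warn=${2:-20}
    crit=${3:-10}
    if [ "$free" -lt "$crit" ]; then
        echo "CRITICAL"
    elif [ "$free" -lt "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

disk_state 35   # well above both thresholds
disk_state 15   # below the 20% warning threshold
disk_state 5    # below the 10% critical threshold
```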
On the Nagios server, the service definition runs NRPE commands through check_nrpe_1arg. The “!” separates the command name from its argument; here the argument is check_sda1 (the command defined in nrpe.cfg on the remote Linux host to check disk utilization).
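A service definition along the following lines would match that description. It is a sketch, not necessarily the exact configuration used here: the host name remote-linux-host and the service description are placeholders, generic-service is assumed to be an available service template, and the snippet writes to a local file (DEMO_CFG, defaulting to ./demo.cfg) so the result can be reviewed before it is copied into the Nagios configuration.

```shell
# Append a sample service definition to a local file for review.
# DEMO_CFG, remote-linux-host and the description are assumptions;
# host_name must match a host already defined on the Nagios server.
DEMO_CFG="${DEMO_CFG:-./demo.cfg}"

printf '%s\n' \
    'define service{' \
    '    use                     generic-service' \
    '    host_name               remote-linux-host' \
    '    service_description     Disk Space sda1' \
    '    check_command           check_nrpe_1arg!check_sda1' \
    '}' >> "$DEMO_CFG"

# Show the line that ties the service to the remote NRPE command.
grep 'check_command' "$DEMO_CFG"
```

The single quotes keep the “!” literal, which matters when the lines are pasted into an interactive shell.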
After making all the changes, restart the Nagios service for them to take effect.
Example of NRPE Indirect check:
We could monitor the www.google.com URL directly from the Nagios server using check_http, but to demonstrate an NRPE indirect check I will monitor the URL from another Linux host, as shown below. For this, we update nrpe.cfg on the remote Linux host and add a new command, check_http_indirect.
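The new entry would look something like the line written below. The plugin path follows the check_disk example earlier, and -H www.google.com is the only option assumed here. The snippet appends to a local scratch file (NRPE_SNIPPET, a hypothetical name) rather than editing /etc/nagios/nrpe.cfg directly.

```shell
# Scratch file for the new nrpe.cfg entry (name is an assumption).
NRPE_SNIPPET="${NRPE_SNIPPET:-./nrpe_indirect.cfg}"

# check_http -H sets the host name to request; no thresholds are set here.
printf '%s\n' \
    'command[check_http_indirect]=/usr/lib/nagios/plugins/check_http -H www.google.com' \
    >> "$NRPE_SNIPPET"

cat "$NRPE_SNIPPET"
```

After the line has been added to nrpe.cfg on the remote host, reload nagios-nrpe-server as before so the new command is picked up.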
Once you have updated nrpe.cfg, run the command from a terminal to validate that it returns the desired output:
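Assuming the indirect check runs on the same remote host used earlier (10.1.13.249), the manual test from the Nagios server would be along these lines. The sketch prints the command and only executes it if check_nrpe is present at the path used above.

```shell
CHECK_NRPE="/usr/lib/nagios/plugins/check_nrpe"
NRPE_HOST="10.1.13.249"   # assumption: same remote host as the earlier examples

cmd="$CHECK_NRPE -H $NRPE_HOST -c check_http_indirect"
echo "Command: $cmd"

# Run the check only where the plugin is installed.
if [ -x "$CHECK_NRPE" ]; then
    $cmd
fi
```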
After checking manually, add the indirect check configuration to demo.cfg and restart the Nagios3 daemon to apply the changes.
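Since each NRPE-based service differs only in a few fields, a small generator can print the block to paste into demo.cfg. This is a sketch under assumptions: nrpe_service is a hypothetical helper, generic-service is assumed to be an available template, and remote-linux-host and the description are placeholders for your own values.

```shell
# Print a Nagios service definition that runs an NRPE command.
# Arguments: host name, service description, NRPE command name.
nrpe_service() {
    printf 'define service{\n'
    printf '    use                     generic-service\n'
    printf '    host_name               %s\n' "$1"
    printf '    service_description     %s\n' "$2"
    printf '    check_command           check_nrpe_1arg!%s\n' "$3"
    printf '}\n'
}

# Placeholder host name and description (assumptions).
nrpe_service "remote-linux-host" "HTTP www.google.com (indirect)" "check_http_indirect"
```

Calling it again with a different command name, for example check_load, produces the definition for another check on the same host.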
Once the defined check interval has elapsed (2 minutes in my case), check the Nagios3 GUI for updated statistics:
This is how you can create multiple checks. In upcoming posts I will demonstrate how to enable custom checks using Nagios NRPE.