Getting started with Nagios
Monitoring lies at the very heart of a production application. It is absolutely necessary that every aspect of the application is monitored: the application itself, any helper services that run on the server such as SSH, FTP and NFS, and server resources such as CPU and disk usage. Even the slightest deviation from the regular usage pattern could indicate a potential issue. For those working with tons of servers day in and day out, a good monitoring tool proves to be a lifesaver.
Nagios is one of the most efficient monitoring tools available, and it helps an administrator in many ways. Getting started with Nagios is easy too. After installing Nagios on the monitoring server, we can use the following steps to get started. (In this article I have used Nagios3 on Ubuntu 14.04.)
1. Copy the file /etc/nagios3/apache.conf to Apache’s sites-available directory and enable it to get a working Nagios web interface.
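Step 1 can be sketched as a small shell helper. This is only a sketch under a few assumptions: the function name enable_nagios_site and the target file name nagios3.conf are my own choices, the commands are run as root, and Apache 2.4 (the version on Ubuntu 14.04) is in use, where a2ensite expects site files to end in .conf.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios): copy the Nagios Apache
# config into sites-available, enable it and reload Apache.
# $1 = source config   (default: /etc/nagios3/apache.conf)
# $2 = sites-available (default: /etc/apache2/sites-available)
enable_nagios_site() {
    src="${1:-/etc/nagios3/apache.conf}"
    sites="${2:-/etc/apache2/sites-available}"

    # The name nagios3.conf is an assumption; any name ending in .conf works.
    cp "$src" "$sites/nagios3.conf" &&
        a2ensite nagios3 &&
        service apache2 reload
}

# Usage on the monitoring server (as root):
#   enable_nagios_site
```

The commands are chained with && so that Apache is reloaded only if the copy and the a2ensite call both succeed.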
2. Create a folder named “nagios-conf” to keep all the custom Nagios configuration.
[code]mkdir -p /etc/nagios3/nagios-conf[/code]
3. Edit Nagios’s configuration file (/etc/nagios3/nagios.cfg) and make it read the directory that we just created (/etc/nagios3/nagios-conf) for the configurations, by adding an extra cfg_dir variable.
[code]echo "cfg_dir=/etc/nagios3/nagios-conf" >> /etc/nagios3/nagios.cfg[/code]
4. Create a .cfg file in the nagios-conf directory in which we’ll add all our configuration (let us name it sample.cfg).
[code]touch /etc/nagios3/nagios-conf/sample.cfg[/code]
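One caveat with the echo in step 3: running it a second time appends the cfg_dir line again, which is best avoided. A minimal sketch that folds steps 2 to 4 into one repeatable helper follows. The name register_conf_dir is hypothetical (it is not a Nagios command), and the helper assumes it runs with enough privileges to write under the given directory.

```shell
#!/bin/sh
# Hypothetical helper: create the custom config directory, register it in
# nagios.cfg exactly once, and create an empty sample.cfg.
# $1 = Nagios base directory (the article uses /etc/nagios3)
register_conf_dir() {
    base="$1"
    conf_dir="$base/nagios-conf"
    line="cfg_dir=$conf_dir"

    mkdir -p "$conf_dir"

    # -q: quiet, -x: match the whole line, -F: treat the pattern as plain text.
    # Append the cfg_dir line only if it is not already there.
    if ! grep -qxF "$line" "$base/nagios.cfg" 2>/dev/null; then
        echo "$line" >> "$base/nagios.cfg"
    fi

    touch "$conf_dir/sample.cfg"
}

# Usage on the monitoring server (as root):
#   register_conf_dir /etc/nagios3
```

Calling the helper twice leaves a single cfg_dir line in nagios.cfg.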
5. Edit the sample.cfg file to add a host block. A host block defines the target server that we are going to monitor. A simple host block looks like the following:
define host {
    use         generic-host
    host_name   application-server
    alias       Application Server
    address     54.169.2.68
    contacts    root
}
use generic-host : We are using a generic template which is already defined in Nagios’ config directory (/etc/nagios3/conf.d). This template contains the variables that are required to define a host block.
host_name : It defines a unique name by which this host block is referred to in Nagios’ configuration.
alias : It is used to specify a publicly visible name for the host.
address : IP address of the target system that needs to be monitored. The Nagios server should be able to ping the target system.
contacts : root is a predefined contact; we’ll update this contact as we move further.
6. Check the configuration and restart Nagios if the config test is successful. An item “Application Server” then appears in the hosts section, showing its health and status.
[code]sudo nagios3 -v /etc/nagios3/nagios.cfg
sudo service nagios3 restart[/code]
In the image above we see a host which is up and running, but no services on the host are being monitored as of now.
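Since steps 8 and 10 repeat the check-and-restart pair from step 6, it may help to wrap it so that the restart happens only when the config test passes. The sketch below assumes it is run as root, and relies on nagios3 -v exiting with a non-zero status when verification fails. The function name check_and_restart is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: restart Nagios only if the config verification passes.
# $1 = main config file (default: /etc/nagios3/nagios.cfg)
check_and_restart() {
    cfg="${1:-/etc/nagios3/nagios.cfg}"

    if nagios3 -v "$cfg"; then
        service nagios3 restart
    else
        echo "Config check failed; Nagios was not restarted." >&2
        return 1
    fi
}

# Usage on the monitoring server (as root):
#   check_and_restart
```

This way a typo in sample.cfg leaves the running Nagios instance untouched.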
7. Add a service block to the sample.cfg file, which defines the service we need to monitor on the target server.
define service {
    use                   generic-service
    host_name             application-server
    service_description   Checking the Application
    check_command         check_http
}
use generic-service : Specifies that we are using an existing service template.
host_name application-server : Name of the host on which we want to monitor this service. The “application-server” is the host that we defined in step 5.
service_description Checking the Application : It’s the publicly visible description of the service. It can be any text; ideally it describes the nature of the check.
check_command check_http : Predefined command that we are using. Here check_http will check the HTTP response from the host server.
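Before relying on the service block, the check_http plugin can also be run by hand from the monitoring server. The sketch below assumes the plugins live in /usr/lib/nagios/plugins (the usual location for the Ubuntu nagios-plugins packages; adjust if yours differ) and uses the standard plugin exit codes: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The function names plugin_state and run_check_http are hypothetical.

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical helper: run check_http against a host and print the state.
# $1 = host name or IP address
# $2 = plugin directory (assumed default: /usr/lib/nagios/plugins)
run_check_http() {
    host="$1"
    plugin_dir="${2:-/usr/lib/nagios/plugins}"

    rc=0
    "$plugin_dir/check_http" -H "$host" || rc=$?
    echo "State: $(plugin_state "$rc")"
    return "$rc"
}

# Usage, with the address from the host block in step 5:
#   run_check_http 54.169.2.68
```

If the manual run does not report OK, the service will most likely not show as OK in Nagios either, so this is a quick way to separate plugin problems from configuration problems.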
8. Again check the config and restart Nagios with the commands mentioned in step 6. And you’ll now see a service being monitored on this server.
9. Till now we have defined a host and a service that is being monitored on that host. But we haven’t defined who should be notified in case of a service failure. For this we create a contact block that defines a contact who would be notified in any such case.
define contact {
    contact_name                    username
    alias                           System Admin
    email                           user@domain.com
    host_notification_commands      notify-host-by-email
    host_notification_options       d,u,r
    host_notification_period        24x7
    service_notification_commands   notify-service-by-email
    service_notification_options    w,u,c,r
    service_notification_period     24x7
}
We’ll also need to add “contacts username” to the service block to specify the contact person who would be notified on service disruption.
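host_notification_commands notify-host-by-email and service_notification_commands notify-service-by-email : Define the medium used to notify the contact about host and service problems, here email.
host_notification_options d,u,r : Defines the host states for which the contact would be notified. d means down, u is for unreachable and r is for recovery.
service_notification_options w,u,c,r : Defines the service states for which the contact would be notified. w means warning, u is for unknown, c is for critical and r is for recovery.
host_notification_period 24x7 and service_notification_period 24x7 : Define the time period during which the contact would be notified.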
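Putting steps 5 to 9 together, the complete sample.cfg would look roughly like the file written by the sketch below. The helper name write_sample_cfg is hypothetical; the values are the ones used in this article, so the address, contact name and email need to be replaced with your own. Note that the helper overwrites the target file.

```shell
#!/bin/sh
# Hypothetical helper: write the complete sample.cfg from steps 5 to 9.
# $1 = target file (the article uses /etc/nagios3/nagios-conf/sample.cfg)
write_sample_cfg() {
    cat > "$1" <<'EOF'
define host {
    use                             generic-host
    host_name                       application-server
    alias                           Application Server
    address                         54.169.2.68
    contacts                        root
}

define service {
    use                             generic-service
    host_name                       application-server
    service_description             Checking the Application
    check_command                   check_http
    contacts                        username
}

define contact {
    contact_name                    username
    alias                           System Admin
    email                           user@domain.com
    host_notification_commands      notify-host-by-email
    host_notification_options       d,u,r
    host_notification_period        24x7
    service_notification_commands   notify-service-by-email
    service_notification_options    w,u,c,r
    service_notification_period     24x7
}
EOF
}

# Usage on the monitoring server (as root):
#   write_sample_cfg /etc/nagios3/nagios-conf/sample.cfg
```

The only line not shown in the earlier blocks is “contacts username” inside the service block, which ties the service to the new contact.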
10. Check the config and restart Nagios again, and we are done with the basic monitoring configuration. Now if the service or host fails, the contact will be notified via email. Try stopping the HTTP server on the target system and you should be notified.