Overview
Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external “plugins” which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.
Architecture
Nagios is built on a server/agent architecture. Typically, a Nagios server runs on one host of the network, and plugins run on each remote host that needs to be monitored. These plugins send information to the server, which displays it in a GUI.
Nagios is composed of three parts:
1) A scheduler: this is the server part of Nagios. At regular intervals, the scheduler checks the plugins and, according to their results, takes actions.
2) A GUI: the interface of Nagios (with the configuration, the alerts, …). It is displayed in web pages generated by CGI. It can include state buttons (green for OK, red for error), sounds, MRTG graphs, …
3) The plugins. They are configurable by the user. They check a service and return a result to the Nagios server.
A soft alert is raised when a plugin returns a warning or an error. On the GUI, a green button then turns red and a sound is emitted. When this soft alert has been raised a configurable number of times, a hard alert is raised and the Nagios server sends notifications (email, SMS, …).
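The soft-to-hard escalation described above can be sketched as a small counter. This is only an illustration of the idea, not Nagios's actual implementation: the class name `ServiceState` and the default of 3 attempts are assumptions made for the example.

```python
from dataclasses import dataclass

# Plugin return values as listed in this article.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass
class ServiceState:
    """Hypothetical tracker for one monitored service."""
    max_attempts: int = 3  # assumed value; the real number is configurable
    attempts: int = 0      # consecutive non-OK results seen so far

    def record(self, status: int) -> str:
        """Feed one plugin result; return 'ok', 'soft' or 'hard'."""
        if status == OK:
            self.attempts = 0
            return "ok"
        self.attempts += 1
        # The alert stays soft until the problem has repeated often enough.
        return "hard" if self.attempts >= self.max_attempts else "soft"


state = ServiceState(max_attempts=3)
results = [state.record(s) for s in (OK, CRITICAL, CRITICAL, CRITICAL, OK)]
print(results)
```

In this sketch a caller would send notifications the first time `record` returns `'hard'`, and any OK result resets the counter.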
Nagios functionalities
Nagios® is an open source tool developed specifically to monitor hosts and services, and designed to inform you of network incidents before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. Initially developed for server and application monitoring, it is now widely used to monitor network availability. This is made possible by the development of specific plugins around the Nagios process. Nagios works with a set of “plugins” to provide local and remote service status. The monitoring daemon runs intermittent checks on the hosts and services you specify, using external “plugins” which return status information to Nagios. When incidents are detected, the daemon sends notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser. Custom “plugins” are relatively easy to develop. Different methods are provided for remote resource discovery. Nagios is freely available from http://www.nagios.org
Requirements
Other things you will need to get Nagios working are:
1) Nagios Plugins (from Nagios download URL)
2) GD – Graphics Libraries
3) JPEG Lib Sources
4) PNG Lib Sources
5) FPing (Fast Ping), this is optional but useful.
6) For SNMP monitoring you will need net-snmp-tools and net-snmp-utils
7) MySQL database for storing element status logs
Plugins and Extensions
Developments on Nagios can be found at http://www.nagiosexchange.org/
Add-On projects are freely available. They cover subjects on:
1) Charts,
2) Communications,
3) Configuration,
4) Development,
5) Downtimes,
6) FrontEnds,
7) Notifications,
8) Misc
Plugins have been developed on:
1) Networking,
2) SNMP,
3) Hardware,
4) Linux,
5) Solaris,
6) Windows, …
1) A plugin is a small program (in Perl, C, Java, Python, …) that checks a service (a daemon, some free space on a disk, …). It must return a value and a short line of text (Nagios will only grab the first line of text). Output should be in the format: METRIC STATUS: information text | performance data. The allowed return values are 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).
2) The warning and critical thresholds are parameters, set by the user, passed as arguments to the plugin.
3) A plugin can also return performance data in the format: “label1=value1 label2=value2 …”
These data are stored by Nagios and may be later displayed with MRTG (http://people.ee.ethz.ch/~oetiker/webtools/mrtg/)
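As an illustration of the three points above, here is a minimal sketch of a plugin that checks free disk space. The command-line shape (path, warning percentage, critical percentage), the `DISK` prefix and the `free_pct` label are assumptions chosen for the example. The `|` between the text and the performance data follows the usual Nagios plugin convention.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(free_pct: float, warn: float, crit: float) -> tuple[int, str]:
    """Map a free-space percentage to (return value, single output line)."""
    if free_pct < crit:
        code = CRITICAL
    elif free_pct < warn:
        code = WARNING
    else:
        code = OK
    # Text before "|" is for humans; text after it is performance data.
    line = f"DISK {LABELS[code]}: {free_pct:.1f}% free | free_pct={free_pct:.1f}"
    return code, line


def main(argv: list[str]) -> int:
    # Hypothetical usage: check_disk_sketch.py PATH WARN_PCT CRIT_PCT
    try:
        path, warn, crit = argv[1], float(argv[2]), float(argv[3])
        usage = shutil.disk_usage(path)
    except (IndexError, ValueError, OSError) as exc:
        # Bad arguments or an unreadable path: the state cannot be determined.
        print(f"DISK UNKNOWN: {exc}")
        return UNKNOWN
    code, line = evaluate(100.0 * usage.free / usage.total, warn, crit)
    print(line)
    return code
```

To use it as a real plugin, the script would end with `sys.exit(main(sys.argv))` so that the return value becomes the process exit status that the Nagios server reads.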
A plugin can also be run remotely: through a remote Nagios server, with ssh, with SNMP, with NRPE (Nagios Remote Plugin Executor), or with NSCA (Nagios Service Check Acceptor). The plugin then either waits for a verification request from the Nagios server before sending its result, or runs on its own and sends the result to the Nagios server.
Other useful developments
Alarm resiliency
1) Nagios gives an immediate status of the monitored elements; it has no memory (except in logs). It is useful to keep a trace of an incident until it has been checked and acknowledged by an operator.
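One way to picture this kind of resiliency is a latch that remembers any element that went non-OK until an operator clears it. The sketch below is an assumption about how such an add-on could work, not an existing Nagios feature. `AlarmLatch`, `observe` and `acknowledge` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AlarmLatch:
    """Hypothetical store of incidents awaiting operator acknowledgement."""
    pending: dict[str, int] = field(default_factory=dict)

    def observe(self, element: str, status: int) -> None:
        # Record the latest non-OK status. A later OK result does not clear
        # the entry, so a short-lived incident stays visible.
        if status != 0:
            self.pending[element] = status

    def acknowledge(self, element: str) -> None:
        # Only an explicit acknowledgement removes the incident.
        self.pending.pop(element, None)


latch = AlarmLatch()
latch.observe("router1", 2)   # incident detected
latch.observe("router1", 0)   # element recovers on its own
still_pending = "router1" in latch.pending
latch.acknowledge("router1")  # operator has checked the incident
```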
Network Interfaces discovery
1) Within big networks, it is useful to compare the real configuration with the configuration held in the database. An external program can check every day (auto-discovery) the real network configuration against the Nagios database. If differences appear, it notifies the network administrator of the change.
2) A semi-automatic configuration tool will write Nagios configuration files based on higher-level network description files.
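Both ideas can be sketched in a few lines. The example below assumes the discovered network and the Nagios configuration have each already been reduced to a `{host_name: address}` mapping; how that data is collected is outside the sketch. The function names and the `generic-host` template name are assumptions made for illustration.

```python
def diff_hosts(discovered: dict[str, str], configured: dict[str, str]):
    """Compare {host_name: address} maps; return (missing, stale, changed)."""
    missing = sorted(set(discovered) - set(configured))  # on the network only
    stale = sorted(set(configured) - set(discovered))    # in the config only
    changed = sorted(
        name
        for name in set(discovered) & set(configured)
        if discovered[name] != configured[name]
    )
    return missing, stale, changed


def render_host(name: str, address: str, template: str = "generic-host") -> str:
    """Render one Nagios host object definition as configuration text."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


discovered = {"sw1": "10.0.0.1", "sw2": "10.0.0.2"}
configured = {"sw2": "10.0.0.9", "sw3": "10.0.0.3"}
missing, stale, changed = diff_hosts(discovered, configured)
for name in missing:
    print(render_host(name, discovered[name]))
```

A daily job could mail the three lists to the network administrator and offer the rendered definitions as a starting point, leaving the final edit of the configuration files to a person.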
References
1) Nagios source program
http://www.nagios.org/download/
2) Nagios Extra developments
http://www.nagiosexchange.org/
3) Official plugins
http://nagiosplug.sourceforge.net/
4) Conferences
http://www.nagios.org/propaganda/conferences/
