Nagios Architecture

January 24th, 2010 by Manoj Chauhan

Overview 

Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external “plugins” which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.

Architecture

Nagios is built on a server/agent architecture. Typically, a Nagios server runs on one host of the network, and plugins run on all the remote hosts that need to be monitored. These plugins send information to the server, which displays it in a GUI.


Nagios is composed of three parts: 

1) A scheduler: this is the server part of Nagios. At regular intervals, the scheduler checks the plugins and, depending on their results, takes action.

2) A GUI: the interface of Nagios (with the configuration, the alerts, …). It is displayed in web pages generated by CGI. It can include status buttons (green for OK, red for Error), sounds, MRTG graphs, …

3) The plugins. They are configurable by the user. They check a service and return a result to the Nagios server.

A soft alert is raised when a plugin returns a warning or an error. On the GUI, a green button then turns red and a sound is emitted. When this soft alert has been raised a configurable number of times, a hard alert is raised and the Nagios server sends notifications (email, SMS, …).
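
The soft/hard escalation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Nagios' actual implementation: the class name AlertTracker, the helper notify_needed and the default of 3 attempts are assumptions made for the example.

```python
from dataclasses import dataclass

# Plugin return values as listed in the Plugins section of this article.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass
class AlertTracker:
    """Counts consecutive non-OK results for one service (illustrative only)."""
    max_attempts: int = 3   # hypothetical default; the real number is configurable
    failures: int = 0       # consecutive non-OK results seen so far
    hard: bool = False      # True once a hard alert has been raised

    def record(self, status: int) -> str:
        """Feed one plugin result and return 'ok', 'soft' or 'hard'."""
        if status == OK:
            self.failures = 0
            self.hard = False
            return "ok"
        self.failures += 1
        if self.failures >= self.max_attempts:
            self.hard = True
            return "hard"
        return "soft"


def notify_needed(tracker: AlertTracker, status: int) -> bool:
    """True only when this result first pushes the service into a hard alert."""
    was_hard = tracker.hard
    return tracker.record(status) == "hard" and not was_hard
```

With max_attempts set to 3, two failing results in a row only produce soft alerts; the third one produces the hard alert, which is the point where notifications would be sent.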

Nagios Architecture (internal)

Nagios functionalities

Nagios® is an open source tool developed to monitor hosts and services and designed to inform you of network incidents before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. Initially developed for server and application monitoring, it is now widely used to monitor network availability, which is made possible by plugins developed specifically around the Nagios process. Nagios works with a set of “plugins” to provide local and remote service status: the monitoring daemon runs intermittent checks on the hosts and services you specify, using external “plugins” which return status information to Nagios. When incidents are detected, the daemon sends notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a Web browser. Custom “plugins” are relatively easy to develop, and different methods are provided for remote resource discovery. Nagios is freely available from http://www.nagios.org

Requirements

Other things you will need to get Nagios working are:

1) Nagios Plugins (from Nagios download URL)

2) GD – Graphics Libraries

3) JPEG Lib Sources

4) PNG Lib Sources

5) FPing (Fast Ping), this is optional but useful.

6) For SNMP monitoring you will need net-snmp-tools and net-snmp-utils

7) MySQL database for storing element status logs

Plugins and Extensions

Developments on Nagios can be found at http://www.nagiosexchange.org/

Add-On projects are freely available. They cover subjects on:

1) Charts,

2) Communications,

3) Configuration,

4) Development,

5) Downtimes,

6) FrontEnds,

7) Notifications,

8) Misc

Plugins have been developed on:

1) Networking,

2) SNMP,

3) Hardware,

4) Linux,

5) Solaris,

6) Windows, … 

Plugins

1) A plugin is a small program (in Perl, C, Java, Python, …) that checks a service (a daemon, some free space on a disk, …). It must return a value and a short line of text (Nagios will only grab the first line of text). Output should be in the format: METRIC STATUS: information text | performance data. The allowed return values are 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

2) The warning and critical thresholds are parameters, set by the user, passed as arguments to the plugin.

3) A plugin can also return performance data in the format: “label1=value1 label2=value2 …”
These data are stored by Nagios and may later be displayed with MRTG (http://people.ee.ethz.ch/~oetiker/webtools/mrtg/)
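
To make the three points above concrete, here is a minimal sketch of a disk-space plugin in Python. It is a hypothetical example, not one of the official plugins: the name check_disk_sketch.py, the functions evaluate, format_output and main, and the perfdata labels are all invented for illustration. The "|" separator between the text and the performance data follows the usual Nagios plugin convention.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn, crit):
    """Map a disk usage percentage to a plugin return value."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def format_output(status, used_pct, warn, crit):
    """Build the single output line: status text, then '|', then performance data."""
    text = f"DISK {LABELS[status]}: {used_pct:.1f}% used"
    perfdata = f"used_pct={used_pct:.1f} warn={warn:.1f} crit={crit:.1f}"
    return f"{text} | {perfdata}"


def main(argv):
    """Hypothetical usage: check_disk_sketch.py PATH WARN_PCT CRIT_PCT"""
    try:
        # Thresholds are passed as arguments, as described in point 2.
        path, warn, crit = argv[1], float(argv[2]), float(argv[3])
        usage = shutil.disk_usage(path)
        used_pct = 100.0 * usage.used / usage.total
    except (IndexError, ValueError, OSError, ZeroDivisionError) as exc:
        print(f"DISK UNKNOWN: {exc}")
        return UNKNOWN
    status = evaluate(used_pct, warn, crit)
    print(format_output(status, used_pct, warn, crit))
    return status
```

To use the sketch as a real plugin, the script would end with sys.exit(main(sys.argv)) so that the return value becomes the process exit code that Nagios reads.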

The plugins can be run:
1) Locally, on the Nagios server. Such a plugin can still check remote hosts: for example, check_ping pings remote hosts to check whether they are running.

2) Remotely, through a remote Nagios server, with SSH, with SNMP, with NRPE (Nagios Remote Plugin Executor), or with NSCA (Nagios Service Check Acceptor). This means that the plugin either waits for a check request from the Nagios server before sending its result, or runs on its own and sends the result to the Nagios server.
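
The two modes can be illustrated with a short Python sketch. In the first function the server side runs a plugin and reads its exit code and first output line; in the second, a result produced elsewhere is formatted as a PROCESS_SERVICE_CHECK_RESULT line, which is the Nagios external command used to submit such results (it is not mentioned in the text above and is named here as background). The function names, the stand-in plugin DEMO_COMMAND and the host and service names are assumptions made for the example.

```python
import subprocess
import sys
import time

# Stand-in for a real plugin so the sketch runs anywhere: prints one line, exits with 1.
DEMO_COMMAND = [
    sys.executable, "-c",
    "import sys; print('DEMO WARNING: stand-in plugin'); sys.exit(1)",
]


def run_active_check(command, timeout=30):
    """Server-initiated check: run the plugin, keep its exit code and first output line."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return proc.returncode, first_line


def passive_result_line(host, service, return_code, output, now=None):
    """Plugin-initiated result: format it as an external command line for Nagios."""
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )
```

For instance, run_active_check(DEMO_COMMAND) returns the exit code and text of the stand-in plugin, and passive_result_line("web01", "HTTP", 2, "HTTP CRITICAL: timeout") builds the line a remote sender would hand over to the Nagios server.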

Other useful developments

Alarm resiliency

1) Nagios gives an immediate status of the monitored elements but has no memory (except in the logs). It is useful to keep track of an incident until it has been checked and acknowledged by an operator.
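
One possible way to picture such a development is a small "latch" that remembers every incident until somebody acknowledges it, even if the element has recovered in the meantime. The class below is a hypothetical sketch of that idea, not an existing Nagios add-on; the name AlarmLatch and its methods are invented for the example.

```python
class AlarmLatch:
    """Remembers each element that went non-OK until an operator acknowledges it."""

    def __init__(self):
        # element name -> most recent non-OK status seen since the last acknowledgement
        self._open = {}

    def update(self, element, status):
        """Record a new status; an OK result (0) does not clear an open incident."""
        if status != 0:
            self._open[element] = status

    def acknowledge(self, element):
        """Operator acknowledgement; returns True if there was an open incident."""
        return self._open.pop(element, None) is not None

    def pending(self):
        """Incidents still waiting for an operator, as a copy."""
        return dict(self._open)
```

If a router goes CRITICAL and then recovers on its own, pending() still lists it until acknowledge("router") is called.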

Network Interfaces discovery

1) Within big networks, it is useful to compare the real configuration with the configuration held in the database. An external program can check every day (auto-discovery) the real network configuration against the Nagios database.
2) If differences appear, the network administrator is notified of the change.
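
The comparison step can be sketched as a simple set difference. How the interfaces are actually discovered (SNMP walk, inventory export, …) is left out; the function names and the "host:interface" naming used in the example are assumptions.

```python
def diff_interfaces(discovered, configured):
    """Compare discovered interfaces with those known to the monitoring configuration."""
    discovered, configured = set(discovered), set(configured)
    return {
        "missing_from_config": sorted(discovered - configured),
        "no_longer_present": sorted(configured - discovered),
    }


def change_report(diff):
    """Build the text of a notification; an empty string means nothing changed."""
    lines = [f"NEW (not monitored yet): {name}" for name in diff["missing_from_config"]]
    lines += [f"GONE (still configured): {name}" for name in diff["no_longer_present"]]
    return "\n".join(lines)
```

A daily job could call diff_interfaces() with the freshly discovered list and send change_report() to the network administrator whenever it is not empty.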

 

Semi-automatic configuration
1) For each new element, several specific checks have to be configured and started.
2) A semi-automatic configuration tool will write Nagios configuration files based on higher level network description files.
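
As a rough sketch of such a tool, the Python below turns a small, higher level description of a host into Nagios object definitions ("define host" and "define service" blocks). The description format (a dict with name, address and checks) is invented for this example, and the template names generic-host and generic-service are assumed to exist in the target configuration.

```python
def render_object(kind, directives):
    """Render one 'define <kind> { ... }' block from a dict of directives."""
    width = max(len(key) for key in directives)
    lines = [f"    {key.ljust(width)}  {value}" for key, value in directives.items()]
    return "define " + kind + " {\n" + "\n".join(lines) + "\n}\n"


def render_host(host):
    """Render a host plus one service per entry in host['checks']."""
    blocks = [render_object("host", {
        "use": "generic-host",          # assumed template name
        "host_name": host["name"],
        "address": host["address"],
    })]
    for description, command in host["checks"].items():
        blocks.append(render_object("service", {
            "use": "generic-service",   # assumed template name
            "host_name": host["name"],
            "service_description": description,
            "check_command": command,
        }))
    return "\n".join(blocks)
```

Calling render_host({"name": "web01", "address": "192.0.2.10", "checks": {"PING": "check_ping!100.0,20%!500.0,60%"}}) yields one host block and one service block that could be written to a configuration file.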
 
 

 

References

1) Nagios source program
 http://www.nagios.org/download/

2) Nagios Extra developments
 http://www.nagiosexchange.org/

3) Official plugins
 http://nagiosplug.sourceforge.net/

4) Conferences
 http://www.nagios.org/propaganda/conferences/
