Tuesday 4 November 2008

Hello, Nagios\n

Nagios is a generic monitoring tool, and I want to use it to monitor my own application. It comes with a lot of built-in and official plugins, but it is always good to start with a Hello, World-type example.

First, my very interesting application. It follows the Nagios plugin guidelines: exit with a status code in {0, 1, 2, 3} (OK, WARNING, CRITICAL, UNKNOWN) and print a status message of less than 4k to standard out. To give the sysadmin some action, it picks its status at random:

(np.c, compiles into an executable np)

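#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    /* Seed the generator so each run can report a different status. */
    srand(time(NULL));

    int r = rand() % 10;

    int status = -1;
    const char* message = NULL;

    /* Roughly: 40% OK, 30% WARNING, 20% CRITICAL, 10% UNKNOWN. */
    switch (r) {
        case 0:
        case 1:
        case 2:
        case 3:
            status = 0; /* OK */
            message = "Everything is alright :-)";
            break;
        case 4:
        case 5:
        case 6:
            status = 1; /* WARNING */
            message = "You have been warned.";
            break;
        case 7:
        case 8:
            status = 2; /* CRITICAL */
            message = "She is gonna blow!";
            break;
        case 9:
            status = 3; /* UNKNOWN */
            message = "Oh shit...";
            break;
    }

    /* Nagios shows this line as the status message. */
    printf("%s\n", message);

    /* The exit code is what Nagios uses as the service state. */
    return status;
}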
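
Before wiring np into Nagios, it can help to see the plugin the way Nagios sees it: one line of output plus an exit code. The following is only a rough sketch of that idea, not how Nagios itself runs checks. The helper names state_name and run_plugin are made up for this example, and it assumes a POSIX system with popen and pclose available.

```c
#define _POSIX_C_SOURCE 200809L /* for popen/pclose */
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Map a plugin exit code to the state name from the plugin guidelines. */
const char *state_name(int code) {
    switch (code) {
        case 0: return "OK";
        case 1: return "WARNING";
        case 2: return "CRITICAL";
        case 3: return "UNKNOWN";
        default: return "OUT OF RANGE"; /* label invented for this sketch */
    }
}

/* Hypothetical helper: run cmd through the shell, copy its first line of
 * output (without the trailing newline) into out, and return its exit
 * code, or -1 if it could not be run or did not exit normally. */
int run_plugin(const char *cmd, char *out, size_t outlen) {
    if (outlen == 0) return -1;
    out[0] = '\0';

    FILE *p = popen(cmd, "r");
    if (p == NULL) return -1;

    if (fgets(out, (int)outlen, p) != NULL) {
        out[strcspn(out, "\n")] = '\0';
    }

    /* Read and discard any further output before closing the pipe. */
    char discard[256];
    while (fgets(discard, sizeof discard, p) != NULL) { }

    int ws = pclose(p);
    if (ws == -1 || !WIFEXITED(ws)) return -1;
    return WEXITSTATUS(ws);
}
```

Calling run_plugin("./np", buf, sizeof buf) from a small main and printing state_name of the result next to buf should give a different state every few runs, which is what the Nagios web interface will show later.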


Nagios is easy to install with this quickstart guide. You just need to configure it:

1) In $NAGIOS_HOME/etc/nagios.cfg, add a line cfg_file=$NAGIOS_HOME/etc/objects/np.cfg

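2) In $NAGIOS_HOME/etc/nagios.cfg, set interval_length to a suitably low number, e.g. 5, so you can see changes quickly. interval_length is the number of seconds in one check-interval unit (60 by default), so with normal_check_interval 1 the service is checked every 1 x 5 = 5 seconds instead of once a minute.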

3) Create $NAGIOS_HOME/etc/objects/np.cfg with something like this:

define hostgroup {
    hostgroup_name          app_hosts
    alias                   Application Hosts
}

define host {
    use                     generic-host
    host_name               app_host
    alias                   Application Host
    address                 localhost
    hostgroups              app_hosts
    max_check_attempts      10
}

define service {
    use                     local-service
    host_name               app_host
    service_description     My Application
    check_command           monitor_np
    normal_check_interval   1
    retry_check_interval    1
}

define command {
    command_name            monitor_np
    command_line            $APPLICATION_HOME/bin/np
}

The configuration is quite fiddly: lots of required attributes and no sensible defaults. Luckily, you can reuse the example configuration that comes with Nagios. There is also a handy pre-flight check: run /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg and it reports errors and warnings in your configuration files, with line numbers.
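
If you want to script the pre-flight check, for example to refuse a restart when the configuration is broken, something along these lines might do. It is a sketch under one assumption I have not verified against every Nagios version: that nagios -v exits with status 0 only when the check passes. The function name config_is_valid is made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run "<nagios_bin> -v <main_cfg>" through the shell
 * and return 1 if it exits with status 0, otherwise 0.
 * Assumption: a zero exit status means the configuration check passed.
 * Paths containing spaces or shell metacharacters are not handled. */
int config_is_valid(const char *nagios_bin, const char *main_cfg) {
    char cmd[1024];
    int n = snprintf(cmd, sizeof cmd, "%s -v %s", nagios_bin, main_cfg);
    if (n < 0 || (size_t)n >= sizeof cmd) return 0; /* command too long */

    /* The check's own errors and warnings still go to the terminal. */
    int ws = system(cmd);
    return ws != -1 && WIFEXITED(ws) && WEXITSTATUS(ws) == 0;
}
```

With the paths from above, the call would be config_is_valid("/usr/local/nagios/bin/nagios", "/usr/local/nagios/etc/nagios.cfg"), and you would only go on to restart Nagios when it returns 1.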

Anyway, by now Bob is your uncle and you can restart Nagios to pick up the changes. Then point your browser to http://localhost/nagios and watch the action!
