On May 29, 2007, at 4:24 PM, Tony Mountifield wrote:
I have a small number of boxes in different locations, and currently have
a fairly crude cron job running on each, which does a ping of one or more
of the other boxes, and if the ping fails, it emails me to say the other
box might be down. It then emails me again the next time the other box
appears to be up.
Of course, this can't distinguish between the remote box really being down
and there being a network problem somewhere between the local and remote
boxes.
I've been mulling over the idea of a more sophisticated scheme, where
a number of boxes send each other messages, indicating not only their
presence, but which other boxes they believe to be up. Then if a box
goes down, the other boxes all see it has gone and agree that it really
is down. However, if there is instead a network outage or routing flap
so that a box is reachable from some places but not all, it might be
possible to distinguish this case.
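
A rough sketch of how such an agreement step might look, in Python. The
report format (each box sending the set of peers it believes are up), the
function name classify, and the box names are all assumptions made for
illustration; this is not taken from any existing tool.

```python
from typing import Dict, Set


def classify(target: str, views: Dict[str, Set[str]]) -> str:
    """Classify `target` from the reports of the other boxes.

    `views` maps each reporting box to the set of boxes it currently
    believes are up (a hypothetical report format).
    """
    # Only reports from boxes other than the target itself count.
    reporters = [box for box in views if box != target]
    if not reporters:
        return "unknown"  # nobody else has reported, so no verdict

    seen_by = [box for box in reporters if target in views[box]]
    if len(seen_by) == len(reporters):
        return "up"       # every reporter can reach it
    if not seen_by:
        return "down"     # all reporters agree it is gone
    return "partial"      # reachable from some places but not all


# Hypothetical reports collected from three boxes.
views = {
    "london": {"paris", "tokyo"},
    "paris": {"london"},
    "tokyo": {"london", "paris"},
}
for name in ("paris", "tokyo", "berlin"):
    print(name, classify(name, views))
```

Here "partial" is the interesting case: it suggests a network problem
between some pairs of boxes rather than a dead machine. A real scheme would
also need to handle stale or missing reports, which this sketch ignores.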
So my question is: does anyone know of an existing tool that does this
sort of thing?
Cheers
Tony
Nagios does this... although it can be a bit much to configure. And
what you're particularly looking for seems to be "dependency"
support, i.e., if your gateway is down, you don't want to be notified
that every server you reach through that gateway is also down.
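
To make the dependency idea concrete, here is a small Python sketch. It is
not Nagios configuration and not how Nagios implements it; the function
name hosts_to_alert, the parents mapping, and the host names are
hypothetical, chosen only to illustrate suppressing alerts for hosts that
sit behind a failed gateway.

```python
from typing import Dict, List, Optional, Set


def hosts_to_alert(parents: Dict[str, Optional[str]],
                   failed: Set[str]) -> List[str]:
    """Return the failed hosts that are worth an alert.

    `parents` maps each host to the host it is reached through (None if
    it is reached directly). A failed host is suppressed when any host
    on its path has also failed.
    """
    alerts = []
    for host in sorted(failed):
        ancestor = parents.get(host)
        visited = set()  # guards against accidental loops in `parents`
        blocked = False
        while ancestor is not None and ancestor not in visited:
            if ancestor in failed:
                blocked = True  # the real problem is further upstream
                break
            visited.add(ancestor)
            ancestor = parents.get(ancestor)
        if not blocked:
            alerts.append(host)
    return alerts


# Hypothetical layout: web1 and db1 sit behind gateway; mail does not.
parents = {"gateway": None, "web1": "gateway", "db1": "gateway", "mail": None}
print(hosts_to_alert(parents, {"gateway", "web1", "db1"}))
print(hosts_to_alert(parents, {"web1"}))
```

With the gateway and both servers behind it failing, only the gateway is
reported; if web1 fails on its own, web1 is reported.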
A nice basic tutorial for Nagios I found is at:
http://www2.maxsworld.org/howtos/nagios.html
It doesn't delve into dependencies much, but setting them up shouldn't
be that difficult.
dex
----------
Mobile: +63 (917) 5357191, Office: +63 (2) 6312718
i4 Asia Incorporated - http://www.i4asiacorp.com/