Hi,

I am interested in monitoring a cluster running Linux for various failures (hardware and software). I basically want to quantify the failures the cluster experiences over a period of a month or so. For this purpose I need to periodically scan the syslogd and klogd messages to identify the failures. The problem is that the volume of messages is quite large, and I am not sure exactly what I should be looking for.

If people on the list could post some of the major error/panic/warning messages that I should parse for (to achieve the objective described above), I would be very glad.

Thanks in advance,
Pirabhu

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/