I wrote a script to do something similar. Here's a modified version that
will verify working GlusterFS mounts in general... All you need is a path
that is the same on all nodes being checked and on the node performing the
check, and passwordless ssh into the gluster client nodes. For testing I
just made a tmp directory right inside the GlusterFS mount and used that:

#!/bin/bash

check_node() {
    # ssh into the node and have it write its hostname into a temp file
    # in a gluster-mounted directory. If we can read it from here and it's
    # correct, the node is online with 100% certainty.
    SSH="ssh -q -l root -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ConnectTimeout=5"

    # All nodes must have the same path to this directory
    TEMP_DIR=/cluster/tmp
    FILE=`mktemp -p $TEMP_DIR`

    $SSH $ip "hostname > $FILE"

    # For each IP address listed in /a_node_ip_list (one per line; see the
    # loop at the bottom of this script), we look up the hostname in
    # /etc/hosts. Make sure it's in there.
    if test "`grep $ip /etc/hosts | awk '{print $2}'`" == "`cat $FILE`"
    then
        echo "confirmed online"
    else
        echo "not online. Call someone!"
    fi
}

echo
echo "GlusterFS status:"
echo

for ip in `cat /a_node_ip_list`
do
    echo -n "checking $ip...  "
    check_node
done

# Clean up
rm -rf $TEMP_DIR/tmp.*

exit 0


> This is an interesting topic indeed.
>
> I'm planning to have each server ping its AFR pair, and if one of them
> goes down, the moment it comes back up, to run ls -lR on the mount.
>
> Perhaps others can share additional ideas?
>
> Regards.
>
> 2009/4/2 Cory Meyer < cory.meyer at gmail.com >
>
> > Has anyone found a decent way out there to monitor GlusterFS volumes?
> > I'm currently using Nagios and Cacti to take care of basic CPU, load,
> > memory, and raw disk I/O. I need to monitor GlusterFS status and make
> > sure all volumes are available.
> >
> > My test environment is 6 servers with 6 AFR volumes, each shared
> > between 2 of those servers. All volumes are mounted on each server.
> >
> > The checks I'm testing out so far include a simple Bash script that
> > writes the current Unix timestamp and hostname to a file once a minute.
> > This is done by each server on only the volumes that they store.
> >
> >     echo "$(uname -n):$(date +%s)" > /mnt/gluster01/CHECK_FILE
> >
> > The Nagios NRPE daemon then executes a Perl script on each of the
> > clients. This script goes through each of the Gluster mount points,
> > comparing the timestamps in the CHECK_FILE to the current system time
> > and alarming if the timestamp is off by more than a minute. Another test
> > which hasn't been implemented yet is checking the contents of the
> > CHECK_FILE against the data that is on the raw disk.
> >
> > Bash code to write the timestamps, executed via cron once a minute
> > (write_timestamps.sh):
> > http://glusterfs.pastebin.com/m5a220a6
> >
> > Perl code to compare the timestamps, executed on the client
> > (check_glusterfs_mounts.pl):
> > http://glusterfs.pastebin.com/m2f057a77
> >
> > Any ideas/questions/comments?
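
Since the pastebin links in Cory's quoted message may go stale, here is a
rough sketch of what that kind of freshness check could look like as a
Nagios-style plugin. This is only an illustration, not the actual
check_glusterfs_mounts.pl: it's bash rather than Perl, the mount list, the
CHECK_FILE name and the 60-second threshold are assumptions, and the exit
codes just follow the usual Nagios OK/CRITICAL convention.

#!/bin/bash
# Hypothetical check: verify each GlusterFS mount has a fresh CHECK_FILE.
# Mount points and threshold are assumptions -- adjust for your setup.
MOUNTS="/mnt/gluster01 /mnt/gluster02"
MAX_AGE=60   # seconds; the writers run from cron once a minute

status=0
for mnt in $MOUNTS
do
    file="$mnt/CHECK_FILE"
    if [ ! -r "$file" ]
    then
        echo "CRITICAL: $file is missing or unreadable"
        status=2
        continue
    fi

    # CHECK_FILE contains "hostname:unix_timestamp"
    ts=`cut -d: -f2 "$file"`
    now=`date +%s`
    age=$((now - ts))

    if [ "$age" -gt "$MAX_AGE" ]
    then
        echo "CRITICAL: $file is $age seconds old (writer or mount is stuck)"
        status=2
    else
        echo "OK: $file updated $age seconds ago"
    fi
done

exit $status

One caveat: a read on a hung FUSE mount can block indefinitely, so in
practice you would want to wrap the reads in something like timeout(1), or
at least rely on NRPE's own command timeout to catch that case.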