Hi everyone,

let me share a few thoughts on something I discussed with some of you on IRC (johnmark and jdarcy, iirc) a while back, and only now got around to putting into submittable shape. review.gluster.com/3043 has all the gory details.

What we talked about several weeks ago was that there was really no way to automatically recover glusterfsd daemons when they fail, and that upstart and systemd integration hadn't been tackled yet. As it turns out, we already have a distributed monitoring and auto-recovery facility in the Pacemaker cluster resource manager (ubiquitous on any platform except RHEL, and even that is set to change for RHEL 7).

So I decided to whip up a couple of resource agents (RAs) to plug into Pacemaker, which I'm humbly submitting for upstream inclusion.

This would allow people to run GlusterFS services in a highly available fashion, and also make use of Pacemaker's dependency enforcement and monitoring facilities. It would make it easy, for example, to mount and monitor the availability of a filesystem that's being exported as a GlusterFS brick, and to gracefully remove or recover the node if its local I/O stack is acting up, so that clients can continue to talk to a different replica that is still doing fine.

It would also allow people to build inter-dependencies in Pacemaker clusters, integrate load balancing with ldirectord, manage IP addresses, and lots of other things.

All feedback is much appreciated.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now