Hi Geoff,

Actually, I was thinking of something a bit simpler:

/etc/ha.d/haresources:

somehost \
    IPaddr2::10.24.0.254/24/eth1/10.24.0.255

You could run the daemon on both machines at the same time. The only thing
that would designate a host as "master" would be the failover IP, which in
this case would be "10.24.0.254".
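For what it's worth, the Heartbeat side is just the stock two-node setup on
both machines. A rough, untested sketch (the node names, interface, timings,
and shared secret below are only placeholders; adjust them for your
environment):

/etc/ha.d/ha.cf (identical on both nodes):

# logging and failure timing
logfacility local0
keepalive 2
deadtime 10
warntime 5
initdead 60
# heartbeat link between the two nodes
udpport 694
bcast eth1
auto_failback on
# cluster members, names as reported by uname -n
node node1
node node2

/etc/ha.d/authkeys (chmod 600, identical on both nodes):

auth 1
1 sha1 some-shared-secret

With that plus the haresources line above, whichever node currently holds
10.24.0.254 is the "master" as far as the glusterfs clients are concerned;
the other node keeps running glusterfsd and simply takes the address over
when its peer goes down.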
Majied


Geoff Kassel wrote:
> Hi Majied,
>
>> With Heartbeat, the only thing required would be a "failover IP" handled
>> between the two Heartbeat servers running the glusterfsd process using
>> AFR. When a glusterfsd server running Heartbeat goes down, the other
>> Heartbeat server would take over the failover IP and continue service to
>> the glusterfs clients.
>
> I think I see what you're getting at here - haresources on both machines
> something like:
>
> node1 192.168.0.1 glusterfsd
> node2 192.168.0.2 glusterfsd
>
> Am I correct?
>
> I haven't had much luck with getting services running through Heartbeat in
> the past (Gentoo's initscripts not being directly compatible with
> Heartbeat's status exit level requirements), but looking at the glusterfsd
> initscript it looks like it might work with Heartbeat.
>
> I'll give this a try, and see how I go. Thanks for the idea!
>
> I know this may be a bit off topic for the list (if there was a
> gluster-users list I'd move this there), but for anyone who was curious
> (I'm still curious about a solution to this myself), I was trying to
> configure Heartbeat + LDirectorD the following way:
>
> /etc/ha.d/haresources:
>
> node1 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master \
>     IPaddr2::192.168.0.3/24/eth0/192.168.0.255
>
> # node2 is a live backup for failover on the Linux Virtual Server daemon
> # if node1 goes down
>
> /etc/ha.d/ldirectord.cf:
>
> checktimeout=1
> checkinterval=1
> autoreload=yes
> logfile="/var/log/ldirectord.log"
> quiescent=yes
>
> virtual=192.168.0.3:6996
>     real=192.168.0.1:6996 gate 1
>     real=192.168.0.2:6996 gate 1
>     checktype=connect
>     scheduler=rr
>     protocol=tcp
>
> The real glusterfsd servers are running on 192.168.0.1 (node1) and
> 192.168.0.2 (node2), and clients connect to the virtual IP address,
> 192.168.0.3.
>
> However, after Heartbeat starts, ipvsadm does not show either real server
> as up, even though telnet to both real servers on port 6996 connects. If I
> configure a fallback (say, to 127.0.0.1:6996), I only ever get the
> fallback through 192.168.0.3, and if I stop that machine, any connections
> through 192.168.0.3 stop too.
>
> Heartbeat doesn't see that as a failure condition - with or without the
> fallback node - so the IP address and LVS won't fail over to the other
> node. I can't see a way to configure Heartbeat to do so either. Hence my
> question about finding a way to get LDirectorD to do the detection in a
> more robust request-response manner.
>
> Majied, do you or anyone else on the list have a suggestion for what I may
> have missed?
>
> Thank you all in advance for any and all suggestions (including RTFM
> again :)
>
> Kind regards,
>
> Geoff Kassel.
>
> On Wed, 12 Sep 2007, Majied Najjar wrote:
>> Hi,
>>
>> This is just my two cents. :-)
>>
>> Instead of LDirectorD, I would recommend just using Heartbeat.
>>
>> With Heartbeat, the only thing required would be a "failover IP" handled
>> between the two Heartbeat servers running the glusterfsd process using
>> AFR. When a glusterfsd server running Heartbeat goes down, the other
>> Heartbeat server would take over the failover IP and continue service to
>> the glusterfs clients.
>>
>> Granted, this isn't loadbalancing between glusterfsd servers and only
>> handles failover....
>>
>> Majied Najjar
>>
>> Geoff Kassel wrote:
>>> Hi all,
>>>    I'm trying to set up LDirectorD (through Heartbeat) to load-balance
>>> and failover client connections to GlusterFS server instances over TCP.
>>>
>>>    First of all, I'm curious to find out if anyone else has attempted
>>> this, as I've had no luck with maintaining client continuity with
>>> round-robin DNS in /etc/hosts and client timeouts, as advised in
>>> previous posts and tutorials. The clients just go dead with 'Transport
>>> endpoint is not connected' messages.
>>>
>>>    My main problem is that LDirectorD doesn't seem to recognize that a
>>> GlusterFS server is functional through the connection test method, so I
>>> can't detect if a server goes down. While LDirectorD does a
>>> request-response method of liveness detection, the GlusterFS protocol is
>>> unfortunately too lengthy to use in the configuration files. (It needs
>>> to be a request that can fit on a single line, it seems.)
>>>
>>>    I'm wondering if there's a simple request-response connection test I
>>> haven't found yet that I can use to check for liveness of a server over
>>> TCP. If there isn't... could I make a feature request for such? Anything
>>> that can be done manually over a telnet connection to the port would be
>>> perfect.
>>>
>>>    Thank you for GlusterFS, and thanks in advance for your time and
>>> effort in answering my question.
>>>
>>> Kind regards,
>>>
>>> Geoff Kassel.
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel@xxxxxxxxxx
>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel