Re: New wiki page for 2 server afr, client side afr

"Brandon Lamb" <brandonlamb@xxxxxxxxx> · Fri, 2 May 2008 08:51:22 -0700

On Fri, May 2, 2008 at 8:40 AM, Brandon Lamb <brandonlamb@xxxxxxxxx> wrote:
>
> On Fri, May 2, 2008 at 3:04 AM, Daniel Maher <dma+gluster@xxxxxxxxx> wrote:
> >
> > On Thu, 1 May 2008 14:47:39 -0700 "Brandon Lamb"
> > <brandonlamb@xxxxxxxxx> wrote:
> >
> > > http://www.gluster.org/docs/index.php/Setting_up_AFR_on_two_servers_with_client_side_replication
> > >
> > > Look over and make sure it is kosher?
> > >
> > > I added a section at the bottom for "gotchas", can you take a quick
> > > look to make sure they are accurate statements.
> >
> > From the wiki page :
> >
> > "As you can see the cluster came back. During the time that server2 is
> > down, the file system does not seem to be available. This does not bode
> > well when we need or want to be able to down one of the data servers
> > for whatever reason. Hence client side AFR is recommended over server
> > side."
> >
> > IMHO, the conclusion that you've drawn here is somewhat misleading.  By
> > using RRDNS to allocate a single hostname to all (both) nodes in the
> > AFR server cluster, the problem you're describing can be avoided
> > entirely.
> >
> > While i realise that your wiki articles are meant to be as simple as
> > possible (and, therefore, a discussion of RRDNS is out of scope), it
> > would be remiss to not (at least) link to further information on the
> > subject.
> >
> >
> > --
> > Daniel Maher <dma AT witbe.net>
>
> Actually, from what I am seeing in my testing, that would not be true.
> Also, with RRDNS wouldn't you have DNS TTL issues, where you could be
> directed to a server that was down? I use LVS for all my major
> services, so I don't run into that problem.
>
> However, that aside, RRDNS would not solve this problem. client1 was
> still connected to server1, which was up and running, but it could not
> read or write to existing files, although I was able to create a NEW
> file for some reason while server2 was down. So using RRDNS to make
> the clients connect to a working server *so far from what I am seeing*
> would not solve anything. Yes, it would get a client to connect to
> server1, which was up, but the cluster still doesn't work.
>
> I will be doing more extensive testing today, at Krishna's request,
> with debugging on, and updating the wiki as needed. More to come!
> Hopefully I just have something goofy going on.
>
> NOTE TO KRISHNA:
> Could this have ANYTHING to do with the possibility that I am using two
> network interfaces? Could it for *some* reason be getting confused
> because server1 and server2 are talking on 192.168.0 and my clients
> are talking to the servers on 208.200.248?
>
> I will test this out as well

A note on RRDNS: maybe my understanding is incorrect. Can anyone comment?

My understanding is that if you have two A records, say 192.168.0.1 and
192.168.0.2, for a name "servers.mycluster.tld", then client1 will do a
DNS lookup for servers.mycluster.tld and get an answer of

;; ANSWER SECTION:
servers.mycluster.tld.    3600    IN      A       192.168.0.1
servers.mycluster.tld.    3600    IN      A       192.168.0.2

and it may get those in any/either order. Will it actually TRY both or
will it just use the first of the two answers it gets back?
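
To make the question concrete: as far as I know the resolver hands back
every A record, and whether the second address is ever tried is up to the
client code. Here is a minimal Python sketch of a client that does try
each address in turn. The function name, the timeout, and the injectable
resolve/connect parameters are my own for illustration; this is not how
glusterfs is known to behave, which is exactly what I am asking.

```python
import socket


def connect_first_available(host, port, timeout=3.0,
                            resolve=socket.getaddrinfo,
                            connect=socket.create_connection):
    """Try every IPv4 address returned for host, in resolver order.

    Hypothetical helper: resolve/connect are parameters only so the
    logic can be exercised without a real network.
    """
    errors = []
    for _family, _type, _proto, _canon, sockaddr in resolve(
            host, port, socket.AF_INET, socket.SOCK_STREAM):
        ip = sockaddr[0]
        try:
            # Return the first connection that succeeds.
            return connect((ip, port), timeout)
        except OSError as exc:
            # This address is down or unreachable; remember why, move on.
            errors.append((ip, str(exc)))
    raise ConnectionError(
        "no address for %s accepted a connection: %r" % (host, errors))
```

A client that only ever connects to the first address in the answer would
fail whenever that server is down, no matter how many A records exist.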

I realize that mail server software is supposed to try the MX entry
with the lowest priority value for a domain first, and then the next
highest if that server is unavailable, but isn't this a function of the
server software? Does glusterfs do this as well?

From what I knew, RRDNS could be a "poor man's" substitute for LVS as a
way to try to distribute load, but if a server was down, clients would
still be directed to it (and fail).

Corrections?