On Thu, Jul 9, 2009 at 8:47 AM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> On Thu, 9 Jul 2009 08:32:12 -0700 (PDT), Martin Fick <mogulguy@xxxxxxxxx> wrote:
>> --- On Thu, 7/9/09, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
>>> What is the expected behaviour of the gluster client when servers
>>> disappear (due to a short network outage) and re-appear? For example,
>>> say there are servers A and B, and client C. A goes away (e.g. pull the
>>> network cable). A timeout occurs and the client continues using B. A
>>> returns without glusterfsd being restarted or the client glusterfs
>>> being restarted.
>>>
>>> Does the client periodically re-scan for the presence of servers that
>>> dropped out? Or does the client have to be restarted to notice that a
>>> server has returned?
>>
>> I would like to add the question: what happens to locks when the server
>> goes down?
>
> If I remember correctly from a previous conversation:
> The primary server (first one listed) is always the lock "master". Locks
> get replicated to the slaves. If one of the secondaries goes down, it
> doesn't affect anything. If the primary goes down, lock mastering moves
> to the next server listed (ordering matters, and all clients must list
> servers in the same order!). Locks don't get migrated back, and if we run
> out of servers (go full circle), I think all lock state is lost. Whether
> this has changed recently (or will change soon to deal with that edge
> case), I don't know.
>
>> Are they dropped when it returns? What if a client goes down while
>> holding locks, do they time out?
>
> That is a good question; I don't believe I have heard an explicit answer
> to this.
>
> Note: Forwarded to the list since Martin's reply didn't go to the list.
> I hope that's OK, since it seemed like a list question.

Locks held by replicate are short-lived per-transaction (per-syscall)
locks. The number of lock servers is configurable, and locks are held on
the first N servers for each transaction (on the record being modified).
So if the first server goes down, subsequent locks are held on servers
2..N+1. When the first server comes back up, further locks are held again
on servers 1..N. As long as at least one of the first N servers is always
up, you are safe. By default glusterfs uses 1 lock server.

This means that, by default, the vulnerable race condition is: two clients
are about to modify the same record; the first is granted a lock and
proceeds with its transaction; the lock server goes down before the second
client begins its transaction, so the second client's lock request gets
granted by the second lock server.

We have chosen a default of 1 since it covers the majority of IO patterns
without an unreasonable performance hit. If your workload is such that a
race condition like this is very likely, then you would want to increase
the lock server count in your replicate volume configuration.

Avati
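
For illustration, a minimal client-side volfile sketch of what raising the
lock server count might look like on a two-way replicate volume. This is
only a sketch: the option names (data-lock-server-count,
metadata-lock-server-count, entry-lock-server-count) are assumed from the
2.x cluster/replicate translator and the subvolume names are placeholders,
so verify both against the documentation for the glusterfs version in use:

    # placeholder subvolume names; protocol/client volumes for the two
    # servers would be defined above this block
    volume rep
      type cluster/replicate
      subvolumes server-a server-b
      # assumed option names: hold each transaction's locks on the first
      # two subvolumes instead of only the first one
      option data-lock-server-count 2
      option metadata-lock-server-count 2
      option entry-lock-server-count 2
    end-volume

With a count of 2, a transaction's locks are held on both subvolumes, so
the loss of either server still leaves one lock server holding the lock,
at the cost of an extra lock round trip per transaction.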