2.0.6

skraw at ithnet.com (Stephan von Krawczynski) · Sun, 23 Aug 2009 12:17:09 +0200

On Sat, 22 Aug 2009 10:24:48 -0700
Anand Avati <avati at gluster.com> wrote:

> [... long technical explanation ...]
> As you rightly summarized,
> Your theory: glusterfs is buggy (cause) and results in all fuse
> mountpoints hanging, and also results in server2's backend fs hanging
> (effect)
> 
> My theory: your backend fs is buggy (cause) and hangs and results in
> all fuse mountpoints to hang (effect) which happens because of reasons
> explained above
> 
> I maintain that my theory is right because glusterfsd just cannot
> cause a backend filesystem to hang, and if it indeed did, the bug is
> in the backend fs because glusterfsd only performs system calls to
> access it.

Let's assume your theory is right. Then I have obviously managed to create a
scenario where the bail-out decisions for servers are clearly bad. In fact
they are so bad that the whole service breaks down. This is of course a no-go
for an application whose sole (or primary) purpose is to keep your fileservice
up, no matter which servers in the backend crash or vanish. As long as there is
a theoretical way of performing the needed fileservice, it should be up and
running. Even if your theory is right, glusterfs still does not handle
the situation as well as it could (read: as a user would expect).
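
To make that expectation concrete, here is a minimal sketch in Python. It is
not glusterfs code and says nothing about how glusterfs actually decides to
bail out; the names read_with_failover, hung_backend and healthy_backend are
made up for illustration. It only shows the behaviour a user would expect:
a backend that hangs is given up on after a timeout, and the request is
served from another backend that still answers.

```python
import concurrent.futures
import time


def read_with_failover(replicas, timeout_s):
    """Try each (name, fetch) pair in order; return the first answer
    that arrives within timeout_s seconds."""
    errors = []
    for name, fetch in replicas:
        # One worker per attempt, so a hung call cannot block the next one.
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fetch)
        try:
            return name, future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # Bail out of this backend and move on to the next one.
            errors.append((name, "timed out"))
        except Exception as exc:
            errors.append((name, repr(exc)))
        finally:
            # Do not wait for a hung worker; it finishes (or not) on its own.
            pool.shutdown(wait=False)
    raise RuntimeError(f"all replicas failed: {errors}")


def hung_backend():
    # Hypothetical stand-in for a backend fs that blocks in a system call.
    time.sleep(0.5)
    return "data from server2"


def healthy_backend():
    # Hypothetical stand-in for a backend that answers immediately.
    return "data from server1"
```

Calling read_with_failover([("server2", hung_backend), ("server1",
healthy_backend)], timeout_s=0.1) gives up on server2 and answers from
server1. The service only fails when every backend fails, which is the point
I am making above.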

Can you back that analysis?

> Avati

-- 
Regards,
Stephan