I'm wondering if there's some way for glusterfs to detect flaws in the underlying
operating system. I believe there are no bug-free file systems in the universe, so
I believe it is the job of the glusterfs developers to specify which underlying
filesystems are tested and supported. It's not good to simply say that glusterfs
works on all real-world approximations to an imaginary bug-free POSIX filesystem.

- Wei

Stephan von Krawczynski wrote:
> On Fri, 28 Aug 2009 14:28:51 +0200
> David Saez Padros <david at ols.es> wrote:
>
>> Hi
>>
>> well, that never happened before when using nfs with the same
>> computers, same disks, etc ... for almost 2 years, so it's quite
>> possible that glusterfs is what is triggering this supposed ext3
>> bug, but apart from this:
>>
>
> I can assure you that you will never get agreement on this point on this
> list; this happens to be the only bug-free software in the universe according
> to its authors ;-)
>
>
>> a) the documentation says "All operations that do not modify the file
>> or directory are sent to all the subvolumes and the first successful
>> reply is returned to the application", so why is it blocking then?
>> The reply from the non-blocked server is supposed to come first and
>> nothing should block, but clients are blocking on a simple ls operation
>>
>
> My impression is that you have to imagine the setup as a serialized queue on
> the server. If one operation hangs, all subsequent ones will hang, too.
>
>
>> b) server1 (the non-blocked one) also has the volumes mounted like
>> any other client, but with option read-subvolume set to the local
>> volume, yet it also hangs when it is supposed to read from the local
>> volume, not from the hung one
>>
>
> This is exactly my experience. You cannot make it work either way. There seems
> to be some locking across all the servers in use.
>
>
>> c) doesn't glusterfs ping the servers periodically to see whether they
>> are available or not? If so, why doesn't it detect that situation?
>>
>
> Well, this ping-pong procedure seems to detect only offline servers
> (i.e. network down), but it is obviously not able to give hints about whether
> a server is operational or not.
>
> My idea of a solution would be to implement something like a bail-out timeout
> configurable in the client vol file for every brick. This would allow
> intermixing slow and fast servers, and it would cope with a situation where
> some clients are far away with slow connections and others are nearby with
> very fast connections to the same servers.
> The biggest problem is probably not bailing out servers, but re-integrating
> them. Currently there seems to be no userspace tool to tell a client to
> re-integrate a formerly dead server. Obviously this should not happen
> auto-magically, to prevent flapping.
>
>
>>>> [...]
>>>> The glusterfs log only shows lines like these:
>>>>
>>>> [2009-08-28 09:19:28] E [client-protocol.c:292:call_bail] data2: bailing
>>>> out frame LOOKUP(32) frame sent = 2009-08-28 08:49:18. frame-timeout = 1800
>>>> [2009-08-28 09:23:38] E [client-protocol.c:292:call_bail] data2: bailing
>>>> out frame LOOKUP(32) frame sent = 2009-08-28 08:53:28. frame-timeout = 1800
>>>>
>>>> Once server2 has been rebooted, all gluster filesystems become available
>>>> again on all clients and the hung df and ls processes terminate, but it is
>>>> difficult to understand why a replicated share that is supposed to survive
>>>> the failure of one server does not.
>>>>
>>> You are suffering from the problem we talked about a few days ago on the
>>> list. If your local fs somehow produces a deadlock on one server, glusterfs
>>> is currently unable to cope with the situation and just _waits_ for things
>>> to come. This deadlocks your clients, too, without any need.
>>> Your experience backs my criticism of the handling of these situations.
>>>
>> --
>> Best regards ...
>>
>> ----------------------------------------------------------------
>>    David Saez Padros                http://www.ols.es
>>    On-Line Services 2000 S.L.       telf    +34 902 50 29 75
>> ----------------------------------------------------------------
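For what it's worth, the call_bail messages quoted above come from the
frame-timeout option of the protocol/client translator (the 1800 in the log is
its default, in seconds), and the periodic ping David asks about is governed by
ping-timeout on the same translator, where the release supports it. Below is a
rough sketch of where those knobs and the read-subvolume option sit in a
legacy-style client vol file; the volume, host and brick names are made up to
match this thread, and the values are examples only, not recommendations:

    volume data1
      type protocol/client
      option transport-type tcp
      option remote-host server1        # first replica
      option remote-subvolume brick     # exported subvolume name on server1 (placeholder)
      option frame-timeout 1800         # seconds before a pending call is bailed out
      option ping-timeout 10            # example value; seconds before a silent server is marked down
    end-volume

    volume data2
      type protocol/client
      option transport-type tcp
      option remote-host server2        # second replica (the one that hung here)
      option remote-subvolume brick
      option frame-timeout 1800
      option ping-timeout 10
    end-volume

    volume replicate
      type cluster/replicate
      subvolumes data1 data2
      option read-subvolume data1       # prefer reads from the local/nearby brick
    end-volume

Lowering frame-timeout would make the bail-outs in the log happen sooner, but
as discussed above it does not by itself re-integrate a brick or stop new calls
from queueing behind a wedged backend filesystem.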