On Sun, 30 Aug 2009 01:00:13 -0700
Anand Avati <avati at gluster.com> wrote:

> > I'm wondering if there's some way for glusterfs to detect the flaws of the
> > underlying operating system. I believe there's no bug-free file systems in
> > the universe, so I believe it is the job of the glusterfs developer to
> > specify which underlying filesystem is tested and supported. It's not good
> > to simply say that glusterfs works on all real-world approximations to an
> > imaginary bug-free posix filesystem.
>
> I would be genuinely interested to know about another project which is
> geared up to be resilient against kernel hangs so that we can borrow
> some ideas on how to reliably detect kernel soft lockups or syscall
> hangs. As far as I know, even mature projects like Apache have not
> bothered fixing such hangs (or even detecting this kind of underlying
> OS flaw).

Apache is not software whose primary purpose is to overcome hardware (and
software) issues that lead to offline filesystems. You cannot compare two
applications with totally different usage patterns.

And, just to say it clearly: nobody expects you to _solve_ or fix a hang.
Users only expect you to _recognise_ a problem and shut down. It is far
better to shut down without a real problem than to continue while having
one and hang. The first leads to more work at most, but the second leads to
an offline service. And that's exactly why we are all here: to prevent an
offline file service.

> Avati

--
Regards,
Stephan