On Wed, 09 Sep 2009 19:43:15 -0400
Mark Mielke <mark at mark.mielke.cc> wrote:

> > On Wed, 9 Sep 2009 23:17:07 +0530
> > Anand Avati<avati at gluster.com> wrote:
> >
> >> Please reply back to this thread only after you have a response from
> >> the appropriate kernel developer indicating that the cause of this
> >> lockup is because of a misbehaving userspace application. After that,
> >> let us give you the benefit of doubt that the misbehaving userspace
> >> process is glusterfsd and then continue any further debugging. It is
> >> not that we do not want to help you, but we really are pointing you to
> >> the right place where your problem can actually get fixed. You have
> >> all the necessary input they need.
> >>
> > This is the kind of statement that often drives listeners to think about a
> > project fork...
> >
>
> Only if backed up. Has the trace been shown to the linux developers?
> What do they think?
>
> If the linux developers come back with "this is totally a userspace
> program - go away", then yes, it can lead to people thinking about a
> project fork. But, if the linux developers come back with "crap - yes,
> this is a kernel program", then I think you might owe Anand an apology
> for pushing him... :-)
>
> In this case, there is too many unknowns - but I agree with Anand's
> logic 100%. Gluster should not be able to cause a CPU lock up. It should
> be impossible. If it is not impossible - it means a kernel bug, and the
> best place to have this addressed is the kernel devel list, or, if you
> have purchased a subscription from a company such as RedHat, than this
> belongs as a ticket open with RedHat.

You know, I am really bothered by the way the maintainers have been
acting ever since I started reading this list. There is really a lot of
ideology going on ("can't be", "is impossible for userspace", etc.) and
very little real debugging. This application is not the only one in the
world.
People heavily use file- and network-intensive applications like
firefox, apache, shell scripts, you name it, on their boxes. None of
them leads to the effects seen when you play with glusterfs. If you
really think it is a logical way of debugging to simply declare
"userspace can't do that" while the rest of the application world does
not run into dead ends like those seen on this list, how can I change
your mind? I hardly believe I can.

I can only tell you what I would do: I would try to document _first_
that my piece of code really does behave well. But as you may have
noticed, there is no real way to provide this information. And that is
indeed part of the problem. Wouldn't it be a nice step if you could
debug the goings-on of a glusterfs server from the client by simply
reading an exported file (something like a server-dependent meta-debug
file) that outputs something like strace does? Something that enables
you to say: "Ok, here you can see what the application did, and there
you can see what the kernel made of it". As we noticed, a server logfile
is not sufficient.

Is ideology really proof of anything in today's world? Do you really
think it is possible to understand the complete world by seeing half of
it and having the other half painted by ideology? What is wrong with
_proving_ that you are not guilty? With acting defensively?

It is important to understand that this application is a kind of core
technology for data storage. This means people want to be sure that
their setup does not explode just because they made a kernel update or
some other change that, in their experience, should have no influence on
the glusterfs service. You want to be sure, just like you are when using
nfs. It just works (even though it lives in kernel space!). Now, answer
for yourself whether you think glusterfs is as stable as nfs on the same
box.

> Cheers,
> mark

--
Regards,
Stephan