On Tue, 2008-04-22 at 18:45 +0300, Harri.Paivaniemi@xxxxxxxxxxxxxxx wrote:
> I agree also,
>
> but my problem is much more basic: to my mind this whole cluster is so badly documented that it's
> really hard to believe we have talked for years about how Linux can be a business-critical platform...
>
> For a normal human being like myself it has taken incredible reverse-engineering just to find all the pieces
> of information, one piece here and one there and nothing from RH, just to understand how the cluster works.
>
> Versions go on, things change, and the information gets old just when I understand it.
>
> Just an example: when I first used qdisk I learned that I had to tune deadnode_timeout. When I moved to ver5,
> /proc/cluster got lost... so I had to figure it out... aha, it's the totem token now... RH support didn't know
> this. This kind of frustrating thing happens to me all the time.
>
> Information is split across man pages, wiki, FAQs, poor RH manuals, and different txt files from the
> depths of the internet. I have had to use all my poor genetic power to try to create theories about this
> cluster as an administrator.
>
> -hjp
>

Harri,

Your complaints are valid and we are aware of them within the various projects that make up the community cluster stack. We are working towards improving the documentation we produce as open source projects and our feeding of that documentation to commercial distribution vendor products like RHEL5.

On a positive note, the various open source communities don't plan to make any significant user-interface-specific changes to any of the cluster stack anytime soon or for a very very long time. We have learned through experience that this is very painful for our open source users, distribution vendors, various third party support, etc. (the folks that add value to the software the various open source communities produce).

We have made changes to our infrastructure from previous versions of the cluster stack to the latest versions for several reasons: 1) reliability, 2) removing all unnecessary bits from the kernel, 3) downstream adoption by third parties. I know as a user these things may not be critical to "getting the thing to just work", but over time there is significant value in having _more_ people working on, supporting, and distributing the code base rather than fewer.

I'd ask that folks be patient with the communities. We are coordinating and working together for the first time since clusters were started on Linux, have widespread distribution, good adoption, and in general our development pace is accelerating, our user view is maturing, and our third party support from various distributions is improving. All of these things lead to downstream distributions with a better product, containing better documentation and support, than was ever available in the past.

Regards
-steve

> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx on behalf of Marek 'marx' Grac
> Sent: Tue 4/22/2008 18:09
> To: linux clustering
> Subject: Re: Fencing Driver API Requirements
>
> Hi,
>
> Jonathan Buzzard wrote:
> > On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
> >
> > The issue is that with such a critical component of a cluster (if the
> > fencing is not right, bad things will happen), in order to write a
> > new fencing agent one has to start reverse engineering from source to
> > work out what you need to do.
> >
> Those new agents with the Python module are available only in the developer
> branch and are not part of any distribution yet. There will be
> documentation soon. Supported fencing agents have their man pages, and
> these describe how they work, as they can take both getopt and
> stdin arguments. These options do not have to have anything in common, as
> they are taken from the cluster.conf.
> Unfortunately some of the existing
> fencing agents use different options, so there are no standard options
> [there is an attempt to have them in new fencing agents].
>
> > This is incredibly bad practice, and is bound to lead to improperly
> > implemented fencing agents that then lead to bad things happening on
> > clusters with these fencing agents.
> >
> I agree.
>
> > There are loads of potential fencing devices out there that could be
> > supported, that are currently not. From my perspective trying to
> > implement a fencing agent for Alert On Lan 2, it was easier to reverse
> > engineer the magic packets of death using tcpdump and IDA Pro, as well as
> > implementing a C-based Linux command tool to generate them, than it has
> > been to write a functioning fencing agent.
> >
> > It would take a couple of hours tops for someone to write a spec for
> > what a fencing agent needs to do.
> >

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
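
For anyone following the thread, here is a rough Python sketch of the input convention mentioned in the quoted message: an agent that accepts its options either as getopt-style command-line flags or as key=value lines on stdin (the form in which values from cluster.conf reach the agent). The flag letters and key names below (ipaddr, login, passwd, port, action) are assumptions chosen for illustration; as noted above, existing agents do not share a standard set of options, so check the man page of any real agent. This is not a spec and not the code from the developer branch.

```python
import getopt

# Hypothetical mapping from short flags to option names (illustration only;
# real fencing agents are not guaranteed to use these letters or names).
SHORT_OPTS = {"-a": "ipaddr", "-l": "login", "-p": "passwd",
              "-n": "port", "-o": "action"}


def parse_stdin(lines):
    """Parse key=value lines, skipping blanks, comments and malformed lines."""
    opts = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on the line: ignore it in this sketch
        opts[key.strip()] = value.strip()
    return opts


def parse_argv(argv):
    """Parse getopt-style flags into the same dictionary shape."""
    pairs, _rest = getopt.getopt(argv, "a:l:p:n:o:")
    return {SHORT_OPTS[flag]: value for flag, value in pairs}


def load_options(argv, lines):
    """Use command-line flags when any are given, otherwise read stdin lines."""
    return parse_argv(argv) if argv else parse_stdin(lines)
```

A real agent would call something like load_options(sys.argv[1:], sys.stdin) and then act on the resulting dictionary. Keeping both entry points behind one function means the fencing logic itself does not care where its options came from.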