On Fri, Aug 15, 2008 at 8:05 PM, Maurizio Rottin <maurizio.rottin@xxxxxxxxx> wrote:
> 2008/8/15 Chris Edwards <cedwards@xxxxxxxxxxxxxxxx>:
>> Whoops, scratch that last post. I now have it working by leaving the entry
>> in fstab without the noauto and turning GFS off with chkconfig and allowing
>> the cluster service to turn it on.
>> Thanks again!
>
> I believe that's the wrong way.
> I know it works that way, but:
> - if you have only one node, do not use gfs, it's slow!
> - if you have more than one node, use it -- and if you can, test gfs2
>   as well (it should be more and more fast) -- but do not mount it
>   in fstab (I mean, you don't need it to be listed in fstab).

Not sure what the first "more" is referring to above - stable perhaps?
After a week of gfs2, I reverted back to gfs1 and found it to be more
stable; gfs2 is still experimental. Then again, I am still getting buggy
behaviour from gfs1.

On that note - does anyone have any ideas as to why a node trying to
mount gfs (gfs1) after a hard reset shows the following connection order
in the logs?

dlm: connecting to 3
dlm: got connection from 4
dlm: connecting to 4
dlm: got connection from 4
(GFS system hangs at this point)

Or, alternatively: connecting to 3, got connection from 3, connecting
to 3, got connection from 4.

This happens quite often, and I have to restart all nodes to get gfs
back up... ideas and suggestions are much appreciated.

> gfs works only if all the nodes are "up and running", which means that if
> one node can't be reached, but is up (network or other problems
> involved), no one can use the gfs filesystem.
> You must use it as a resource, and you must have at least one fencing
> method for each node in the cluster.
> In this way, once a node becomes unreachable, it will be fenced and
> the other nodes can write happily to the filesystem.
> This is because if one node "can be considered up and maybe running",
> it may be writing to the filesystem, or it may think that it is the
> only node in the cluster (think about a switch problem, or arp spoofing).
> Then if you try a "clust" command on that node you will see all the other
> nodes down and only that one up... this is why you must have a
> fencing method! That node HAS TO be shut down or reloaded, otherwise
> the filesystem will be blocked, and no read or write can be issued by
> any of the nodes in the cluster.
>
> I am not talking about how it is in theory (never attended a RH
> session), but believe me, in practice it works like that!
>
> Create a global resource (and always create a global resource, even if
> it is a fencing or a vsftpd resource that every node has in common)
> and mount it on every node you need as a service. Do not think an
> fstab entry is the best thing you can have; it is not. It can lock
> your filesystem until all the nodes are really working and talking to
> each other.
>
> --
> mr
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster