Re: GFS as a Resource

"Brett Cave" <brettcave@xxxxxxxxx> · Fri, 15 Aug 2008 23:21:23 +0200

On Fri, Aug 15, 2008 at 8:05 PM, Maurizio Rottin
<maurizio.rottin@xxxxxxxxx> wrote:
> 2008/8/15 Chris Edwards <cedwards@xxxxxxxxxxxxxxxx>:
>> Whoops, scratch that last post. I now have it working by leaving the entry
>> in fstab without the noauto and turning GFS off with chkconfig and allowing
>> the cluster service to turn it on.
>> Thanks again!
>
> I believe that's the wrong way.
> I know it works that way, but:
> - if you have only one node, do not use gfs, it's slow!
> - if you have more than one node, use it -- and if you can, test gfs2
> as well (it should be more and more fast) -- but do not mount it from
> fstab (I mean, it does not need to be listed in fstab at all).

Not sure what the first "more" is referring to above - stable perhaps?

After a week of gfs2, I reverted to gfs1, which I found to be more
stable; gfs2 is still experimental. Then again, I am still getting
buggy behaviour from gfs1.

On that note, does anyone have any idea why a node trying to mount
gfs (gfs1) after a hard reset shows the following connection order
in the logs?

dlm: connecting to 3
dlm: got connection from 4
dlm: connecting to 4
dlm: got connection from 4

The GFS system hangs at this point.

Other times the order is: connecting to 3, got connection from 3,
connecting to 3, got connection from 4.

This happens quite often, and I have to restart all nodes to get gfs
back up. Ideas and suggestions are much appreciated.
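
To compare the two sequences above, a minimal Python sketch like the
following can tally the dlm messages per node id. It only recognises the
two message forms quoted in this post, the function names are made up for
illustration, and treating a node that was connected to but never
connected back as suspicious is an assumption, not a documented dlm rule.

```python
import re
from collections import defaultdict

# Matches only the two dlm messages quoted above; other log text is ignored.
DLM_LINE = re.compile(r"dlm: (connecting to|got connection from) (\d+)")

def summarize_dlm(lines):
    """Count outgoing ("connecting to") and incoming ("got connection
    from") dlm messages per node id."""
    counts = defaultdict(lambda: {"out": 0, "in": 0})
    for line in lines:
        m = DLM_LINE.search(line)
        if m is None:
            continue
        direction = "out" if m.group(1) == "connecting to" else "in"
        counts[int(m.group(2))][direction] += 1
    return {node: dict(c) for node, c in counts.items()}

def unanswered(summary):
    """Node ids we connected to but never got a connection from."""
    return sorted(n for n, c in summary.items()
                  if c["out"] > 0 and c["in"] == 0)

# The first sequence from this post.
log = [
    "dlm: connecting to 3",
    "dlm: got connection from 4",
    "dlm: connecting to 4",
    "dlm: got connection from 4",
]
summary = summarize_dlm(log)
print(summary)
print(unanswered(summary))
```

For the first sequence this flags node 3; for the second sequence
(3, 3, 3, 4) nothing is flagged, so the tally alone does not explain
both cases.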

> gfs works only if all the nodes are "up and running", which means
> that if one node can't be reached but is up (network or other
> problems involved), no one will be able to use the gfs filesystem.
> You must use it as a resource, and you must have at least one fencing
> method for each node in the cluster.
> In this way, once a node becomes unreachable, it will be fenced and
> the other nodes can write happily on the filesystem. This is because
> if one node "can be considered up and maybe running" it may be writing
> on the filesystem, or it may think that it is the only node in the
> cluster (think about a switch problem, or arp spoofing). Then if you
> try a "clust" command on that node you will see all the other nodes
> down and only that one up. This is why you must have a fencing
> method! That node HAS TO be shut down or reloaded, otherwise the
> filesystem will be blocked, and no read or write can be issued by
> any of the nodes in the cluster.
>
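
The rule described above can be written down as a toy Python model. This
is only a sketch of the behaviour as stated in the quoted text (I/O stays
blocked while any member is unreachable and not yet fenced); it is not
how the cluster software actually decides, and the function and node
names are hypothetical.

```python
def gfs_io_allowed(members, reachable, fenced):
    """Toy rule from the description above: I/O is allowed only when
    every member is either reachable or has already been fenced.

    Returns (allowed, nodes_still_blocking).
    """
    pending = set(members) - set(reachable) - set(fenced)
    return len(pending) == 0, sorted(pending)

# Hypothetical three-node cluster.
members = {"node1", "node2", "node3"}

# All nodes up: I/O proceeds.
print(gfs_io_allowed(members, {"node1", "node2", "node3"}, set()))

# node3 unreachable and not fenced: the filesystem is blocked for everyone.
print(gfs_io_allowed(members, {"node1", "node2"}, set()))

# node3 fenced: the remaining nodes can use the filesystem again.
print(gfs_io_allowed(members, {"node1", "node2"}, {"node3"}))
```
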
> I am not talking about how it is in theory (I never attended a RH
> session), but believe me, in practice it works like that!
>
> Create a global resource (and always create a global resource, even if
> it is a fencing or a vsftpd resource that every node has in common)
> and mount it as a service on every node that needs it. Do not think an
> fstab entry is the best thing you can have. It is not: it can lock
> your filesystem until all the nodes are really working and talking to
> each other.
>
> --
> mr
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
