On Thu, Sep 30, 2004 at 04:01:44PM -0700, micah nerren wrote:
> Hi,
>
> I have a SAN with 4 file systems on it, each GFS. These are mounted
> across various servers running GFS, 3 of which are lock_gulm servers.
> This is on RHEL WS 3 with GFS-6.0.0-7.1 on x86_64.

How many nodes?

> One of the file systems simply will not mount now. The other 3 mount
> and unmount fine. They are all part of the same cca. I have my master
> lock server running in heavy debug mode but none of the output from
> lock_gulmd tells me anything about this one bad pool. How can I figure
> out what is going on, any good debug or troubleshooting steps I should
> do? I think if I just reboot everything it will settle down, but we
> can't do that just yet, as the master lock server happens to be on a
> production box right now.

1) Are you certain that you have uniquely named all four filesystems?
   You can use gfs_tool to verify that there are no duplicate names
   (rough example commands at the bottom of this message).

2) Is there an expired node that has not been fenced holding a lock on
   that filesystem? gulm_tool will help there (also below).

3) Did you ever have all 4 filesystems mounted at the same time on the
   same node? I.e., did it "all of a sudden" stop working, or was it
   always failing?

> Also, is there a way to migrate a master lock server to a slave lock
> server? In other words, can I force the master to become a slave and a
> slave to become the new master?

Restarting lock_gulmd on the master will cause one of the slaves to take
over as master and the old master to come back up as a slave. Note that
this only works when you have a dedicated gulm server. If you have an
embedded master server (a gulm server that is also mounting GFS), bad
things will happen when that server restarts.

> Thanks!
>
> --
>
> Linux-cluster@xxxxxxxxxx
> http://www.redhat.com/mailman/listinfo/linux-cluster

--
Adam Manthei  <amanthei@xxxxxxxxxx>
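
(For reference, roughly the kind of invocations meant in 1) and 2) above;
the device and host names are placeholders, so adjust them for your setup
and check the man pages for the exact syntax on your GFS version:

   # Print the lock table name stored in a filesystem's superblock.
   # Run it against each of the four pool devices; all four names
   # should be unique.
   gfs_tool sb /dev/pool/<poolname> table

   # List the nodes known to the master lock server and their states.
   # Look for anything marked expired that has not been fenced yet.
   gulm_tool nodelist <master-lock-server>
)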