Re: Fwd: GFS volume hangs on 3 nodes after gfs_grow

Thanks again for the prompt response, Bob.

I restored the nodes to a healthy state and they can access the GFS volumes again.
node3:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node4:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node2 - luci node:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2


I will try to reproduce the problem with gfs_grow.

One more question regarding GFS: what steps would you recommend (if any) for growing and shrinking an active GFS volume?
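
For reference, the grow sequence we followed from the RHEL 5.2 manual
was roughly this (the volume group and logical volume names here are
just placeholders from our test setup):

lvextend -L +5G /dev/test_vg/lvm_test1
gfs_grow /lvm_test1

lvextend grows the clustered logical volume, and gfs_grow then expands
the mounted GFS file system into the new space. That covers growing;
I haven't found anything in the manual about shrinking, which is part
of why I'm asking.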

On Fri, Sep 26, 2008 at 12:44 PM, Bob Peterson <rpeterso@xxxxxxxxxx> wrote:
----- "Alan A" <alan.zg@xxxxxxxxx> wrote:
| Thanks again, Bob.
|
| No kernel panic on any of the nodes. I had to cold boot all 3 nodes
| in order to get the cluster going (might have been a fence issue but
| am not 100% sure, since we use only SCSI fencing until we agree on a
| secondary fencing method). What is 'scary' is that the gfs_grow
| command paralyzed that volume on all 3 nodes, and I couldn't access,
| nor unmount, nor run gfs_fsck, from any of the nodes. We will do
| more testing on this. Btw, do you have a suggested "safe" method of
| growing and shrinking the volume other than what is noted in the 5.2
| documentation (since we followed the RHEL manual)? If the GFS volume
| hangs, what is the best way to try and unmount it from a node; would
| 'gfs_freeze' have helped?

Hi Alan,

No, gfs_freeze won't help.  In these cases, it's probably best to
reboot the node that caused the problem, either with /sbin/reboot -fin
or by throwing the power switch.  I suspect that clvmd status
hung because of the earlier problem.
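
In case the flags aren't familiar, that's:

/sbin/reboot -fin   # -f: force, -i: shut down network interfaces, -n: don't sync first

The idea behind -n is to skip the sync, since a sync against the hung
GFS mount could itself hang.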

I'm not aware of any problems in your version of gfs_grow that can
cause this kind of lockup.  It's designed to be run seamlessly while
other processes are using the file system, and that's the kind of
thing we test regularly.

If you figure out how to recreate the lockup, let me know so I
can try it out.  Of course, if this is a production cluster, I
would not take it out of production for a long time to try this.
But if I can recreate the problem here, I'll file a bugzilla
record and get it fixed.

Regards,

Bob Peterson
Red Hat Clustering & GFS




--
Alan A.
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
