GFS volume locks during cluster node join/leave


Hello again,

We have a 3-node RHCS cluster with a shared GFS volume that's performing quite well after some tuning, so I couldn't be happier.

However, whenever a node leaves the cluster (whether cleanly via a reboot or after being fenced), our GFS volume is unusable for at least 30 seconds; even an 'ls' on the volume blocks. During this period I see no activity in /var/log/messages on the other nodes, apart from a single message that the node is leaving the cluster. After 30 seconds the cluster starts reconfiguring.

The same thing happens when I fence a node: it takes about 30 seconds before the other nodes try to reclaim the journal of the lost node, which itself takes over a minute. Once the missing node rejoins after a reboot, the GFS volume is again unavailable for a long period.

Is this expected behaviour? Is there anything we can do to reduce these delays? We run 10 VMs on our active nodes; it's a shame to have them all lock up because we're rebooting a passive node :)
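In case it matters: I assume the relevant knobs are the totem timeouts in cluster.conf, which we haven't touched, so we should be running the defaults. The snippet below is just a sketch of the section I mean (values illustrative, not our actual config):

```xml
<!-- Sketch only: the totem token timeout (in milliseconds) governs how
     long the cluster waits without hearing from a node before declaring
     it dead and starting recovery. Values here are illustrative. -->
<cluster name="example" config_version="1">
  <totem token="10000"/>
  <!-- ... clusternodes, fencedevices, etc. ... -->
</cluster>
```

Is lowering the token timeout the right way to shorten the stall, or does that just trade it for spurious fencing?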

Thanks!

Cheers,
Martijn Storck


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

