Rebooting qdisk master causes quorum to dissolve.

Hi,

I have a five-node cluster with a shared quorum disk and no heuristics.
Because of a hardware problem I needed to move the services off the
affected host and replace some RAM. The services moved without a
hitch, but as soon as I rebooted the node the cluster came down.

The relevant configuration is:

<cluster alias="Services" config_version="150" name="Services">
        <quorumd interval="5" tko="12" device="/dev/emcpowere" votes="3"
                 log_level="9" log_facility="local4" status_file="/qdisk_status"/>
        <fence_daemon clean_start="1" post_fail_delay="15" post_join_delay="30"/>
        <cman deadnode_timeout="90" expected_nodes="4"/>

The relevant logs from an adjacent node are below:

Dec 21 11:40:15 io2 clurgmgrd[7271]: <notice> Member 1 shutting down
Dec 21 11:40:40 io2 qdiskd[6820]: <info> Node 1 shutdown
Dec 21 11:40:47 io2 openais[6801]: [CMAN ] lost contact with quorum device
Dec 21 11:40:47 io2 openais[6801]: [CMAN ] quorum lost, blocking activity
Dec 21 11:40:47 io2 clurgmgrd[7271]: <emerg> #1: Quorum Dissolved
Dec 21 11:40:47 io2 kernel: dlm: closing connection to node 1
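
For what it is worth, the sequence in the logs (node 1 leaves, contact with the
quorum device is lost, quorum is lost) is at least consistent with simple vote
arithmetic. The sketch below is only a sanity check under stated assumptions:
one vote per node (not shown in the configuration excerpt), the qdisk votes
taken from votes="3", expected votes equal to the sum of both, and the usual
majority rule of floor(expected / 2) + 1. The function names are made up for
illustration and are not part of cman or qdiskd.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Majority rule assumed here: floor(expected / 2) + 1."""
    return expected_votes // 2 + 1


def has_quorum(node_votes: int, qdisk_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the quorum threshold."""
    return node_votes + qdisk_votes >= quorum_threshold(expected_votes)


NODES = 5            # five-node cluster, from the post
VOTES_PER_NODE = 1   # ASSUMPTION: per-node votes are not shown in the excerpt
QDISK_VOTES = 3      # from <quorumd ... votes="3"/>

# ASSUMPTION: expected votes is the sum of node votes and qdisk votes.
expected = NODES * VOTES_PER_NODE + QDISK_VOTES
threshold = quorum_threshold(expected)

# One node down, quorum disk still reachable.
one_down_with_qdisk = has_quorum(4 * VOTES_PER_NODE, QDISK_VOTES, expected)

# One node down and the quorum disk votes gone at the same time.
one_down_without_qdisk = has_quorum(4 * VOTES_PER_NODE, 0, expected)

print(expected, threshold, one_down_with_qdisk, one_down_without_qdisk)
```

Under those assumptions, four remaining nodes plus the quorum disk would stay
quorate, while four nodes without the quorum disk votes would not, which would
match "lost contact with quorum device" being followed immediately by
"quorum lost".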

Have I configured this incorrectly, or is this a known problem with
rebooting the qdisk master? It has just occurred to me that I did lock the
resource groups to prevent the moved services from returning to the
node.

Thanks in advance; I look forward to your replies,

Peter Tiggerdine
HPC & eResearch Specialist
High Performance Computing Group
Information Technology Services
University of Queensland


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
