Hi,

I have recently had a couple of situations with my cluster where both nodes were restarted simultaneously. The reasons for this are a bit beyond me, so I was wondering if anyone could clarify or point me to relevant documentation.

The following excerpts are from both nodes' logs:

Oct 2 08:32:22 node1 qdiskd[3758]: <info> Heuristic: 'ping 10.X.X.X -c1 -t2' DOWN (3/3)
Oct 2 08:32:39 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:55 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:58 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t1' DOWN (6/6)
Oct 2 08:33:01 node1 qdiskd[3758]: <notice> Score insufficient for master operation (0/4; required=1); downgrading
Oct 2 08:33:01 node1 kernel: md: stopping all md devices.

Oct 2 08:32:23 node2 qdiskd[3599]: <info> Heuristic: 'ping 10.X.X.X -c1 -t2' DOWN (3/3)
Oct 2 08:32:49 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t1' DOWN (6/6)
Oct 2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:33:03 node2 qdiskd[3599]: <notice> Score insufficient for master operation (0/4; required=1); downgrading
Oct 2 08:33:03 node2 kernel: md: stopping all md devices.

Does qdiskd reboot the node because these tests fail? The upstream routers these nodes are connected to were unavailable for at most 2 minutes, and all four ping tests require connectivity through the router (I probably need to change that!?).

What kind of tests can I use for qdiskd that will prevent router outages from killing my cluster completely?

Regards
--
Denis

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster