Hi,

I think there might be some race condition in the cman init script causing fenced to stop working correctly. I'm able to reliably reproduce the problem using a minimal cluster.conf with two nodes and fence_manual fencing.

Steps to reproduce:

1. Install cluster.conf on two nodes, enable the "cman" service and reboot both nodes.
2. The cluster boots successfully and clustat lists both nodes as online.
3. Power-cycle node prod-db1.
4. On prod-db2, openais detects the missing node, but fenced decides to do nothing about it and logs nothing to /var/log/messages (the fenced process is still running).

Output from "group_tool dump fence" after the test:

[root@prod-db2 ~]# group_tool dump fence
1190645583 our_nodeid 2 our_name prod-db2
1190645583 listen 4 member 5 groupd 7
1190645584 client 3: join default
1190645584 delay post_join 120s post_fail 0s
1190645584 added 2 nodes from ccs
1190645584 setid default 65538
1190645584 start default 1 members 2
1190645584 do_recovery stop 0 start 1 finish 0
1190645584 node "prod-db1" not a cman member, cn 1
1190645584 add first victim prod-db1
1190645585 node "prod-db1" not a cman member, cn 1
1190645586 node "prod-db1" not a cman member, cn 1
1190645587 node "prod-db1" not a cman member, cn 1
1190645588 node "prod-db1" not a cman member, cn 1
1190645589 node "prod-db1" not a cman member, cn 1
1190645590 node "prod-db1" not a cman member, cn 1
1190645591 node "prod-db1" not a cman member, cn 1
1190645592 node "prod-db1" not a cman member, cn 1
1190645593 node "prod-db1" not a cman member, cn 1
1190645594 node "prod-db1" not a cman member, cn 1
1190645595 node "prod-db1" not a cman member, cn 1
1190645596 node "prod-db1" not a cman member, cn 1
1190645597 node "prod-db1" not a cman member, cn 1
1190645598 node "prod-db1" not a cman member, cn 1
1190645599 node "prod-db1" not a cman member, cn 1
1190645600 reduce victim prod-db1
1190645600 delay of 16s leaves 0 victims
1190645600 finish default 1
1190645600 stop default
1190645600 start default 2 members 1 2
1190645600 do_recovery stop 1 start 2 finish 1
1190645954 client 3: dump    <--- Before killing prod-db1
1190645985 stop default
1190645985 start default 3 members 2
1190645985 do_recovery stop 2 start 3 finish 1
1190645985 finish default 3
1190646008 client 3: dump    <--- After killing prod-db1

The reason I suspect some kind of race condition is that I'm only able to reproduce this when cman is started on boot. If I run "service cman start" manually, or add a "sleep 30" line to the init script, fenced works as expected.

I'm running this test on two Dell 1955 blades, on which it takes quite some time for the Linux kernel to boot and initialize all drivers. So I'm guessing cman might be getting started before some crucial part of the kernel has been loaded/initialized, or something along those lines. I tried to reproduce the problem using two Xen virtual machines without success, but that kernel probably boots and initializes fast enough to avoid the race.

The scary part is that, as far as I can tell, fenced is the only cman daemon affected by this, so the cluster appears to work fine. But when a node needs to be fenced, the operation isn't carried out, and that can cause GFS filesystem corruption.

Any thoughts? Hacking the init script doesn't feel like a very solid solution, since it will be overwritten the next time the cman rpm is updated...

OS: RHEL5 Advanced Platform
cman: 2.0.64-1.0.1.el5
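For reference, the commands involved on my side are roughly the following (nothing beyond the standard RHEL5 cluster tools; the power-cycle of prod-db1 is done from the blade chassis and isn't shown):

# on both nodes, once cluster.conf is in place
[root@prod-db1 ~]# chkconfig cman on
[root@prod-db1 ~]# reboot

# after both nodes are back up, verify membership from the surviving node
[root@prod-db2 ~]# clustat

# then power-cycle prod-db1 and check what fenced is (not) doing
[root@prod-db2 ~]# group_tool dump fence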
cluster.conf:

<?xml version="1.0"?>
<cluster alias="prod-db" config_version="1" name="prod-db">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="120"/>
    <clusternodes>
        <clusternode name="prod-db1" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="human" nodename="prod-db1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="prod-db2" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="human" nodename="prod-db2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_manual" name="human"/>
    </fencedevices>
    <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>
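The only workaround I can think of that would survive rpm updates is to keep the delay outside the packaged init script altogether, something along these lines (just a sketch; the 30-second figure is a guess from my tests, and it only hides the race rather than fixing it):

# don't start cman directly from its init script at boot
[root@prod-db2 ~]# chkconfig cman off

# instead start it from rc.local after an extra delay
[root@prod-db2 ~]# tail -2 /etc/rc.d/rc.local
sleep 30
service cman start

Of course anything that depends on cman at boot (rgmanager, GFS mounts, etc.) would have to be moved after this as well, so it's not exactly pretty either.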
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster