Please see the answer given on the DRBD Users list to this question.

digimer

On 10/29/2012 04:23 AM, Zohair Raza wrote:
> Hi,
>
> I have set up a Primary/Primary cluster with GFS2.
>
> Everything works fine if I shut down either node normally, but when I
> unplug the power on either node, GFS2 freezes and I cannot access the
> device.
>
> I tried to use http://people.redhat.com/lhh/obliterate
>
> This is what I see in the logs:
>
> Oct 29 08:05:41 node1 kernel: d-con res0: PingAck did not arrive in time.
> Oct 29 08:05:41 node1 kernel: d-con res0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
> Oct 29 08:05:41 node1 kernel: d-con res0: asender terminated
> Oct 29 08:05:41 node1 kernel: d-con res0: Terminating asender thread
> Oct 29 08:05:41 node1 kernel: d-con res0: Connection closed
> Oct 29 08:05:41 node1 kernel: d-con res0: conn( NetworkFailure -> Unconnected )
> Oct 29 08:05:41 node1 kernel: d-con res0: receiver terminated
> Oct 29 08:05:41 node1 kernel: d-con res0: Restarting receiver thread
> Oct 29 08:05:41 node1 kernel: d-con res0: receiver (re)started
> Oct 29 08:05:41 node1 kernel: d-con res0: conn( Unconnected -> WFConnection )
> Oct 29 08:05:41 node1 kernel: d-con res0: helper command: /sbin/drbdadm fence-peer res0
> Oct 29 08:05:41 node1 fence_node[1912]: fence node2 failed
> Oct 29 08:05:41 node1 kernel: d-con res0: helper command: /sbin/drbdadm fence-peer res0 exit code 1 (0x100)
> Oct 29 08:05:41 node1 kernel: d-con res0: fence-peer helper broken, returned 1
> Oct 29 08:05:48 node1 corosync[1346]: [TOTEM ] A processor failed, forming new configuration.
> Oct 29 08:05:53 node1 corosync[1346]: [QUORUM] Members[1]: 1
> Oct 29 08:05:53 node1 corosync[1346]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Oct 29 08:05:53 node1 corosync[1346]: [CPG ] chosen downlist: sender r(0) ip(192.168.23.128) ; members(old:2 left:1)
> Oct 29 08:05:53 node1 corosync[1346]: [MAIN ] Completed service synchronization, ready to provide service.
> Oct 29 08:05:53 node1 kernel: dlm: closing connection to node 2
> Oct 29 08:05:53 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:53 node1 kernel: GFS2: fsid=cluster-setup:res0.0: jid=1: Trying to acquire journal lock...
> Oct 29 08:05:53 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:53 node1 fenced[1401]: fence node2 failed
> Oct 29 08:05:56 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:56 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:56 node1 fenced[1401]: fence node2 failed
> Oct 29 08:05:59 node1 fenced[1401]: fencing node node2
> Oct 29 08:05:59 node1 fenced[1401]: fence node2 dev 0.0 agent fence_ack_manual result: error from agent
> Oct 29 08:05:59 node1 fenced[1401]: fence node2 failed
>
> Regards,
> Zohair Raza

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
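
[Archive note] The "fence-peer helper broken, returned 1" and the repeated fence_ack_manual errors in the logs above are the usual signs that fencing is not fully wired up: DRBD suspends I/O on the surviving node (the susp( 0 -> 1 ) transition) until its fence-peer handler reports success, and fenced retries forever because no real fence agent is configured, so GFS2 stays frozen. A minimal sketch of the DRBD side of such a setup, assuming the rhcs_fence handler shipped with drbd-utils for cman-based clusters (the handler path varies by distribution, and the cluster itself also needs a working fence device such as IPMI or a PDU in cluster.conf instead of manual fencing):

    resource res0 {
        disk {
            # On loss of the peer, suspend I/O and call the
            # fence-peer handler before resuming.
            fencing resource-and-stonith;
        }
        handlers {
            # Delegate fencing of the peer to the cluster stack.
            # Path is an assumption; check your drbd-utils install.
            fence-peer "/usr/lib/drbd/rhcs_fence";
        }
    }

With this in place, a hard power loss of one node causes DRBD to block writes only until the cluster's fence agent confirms the peer is dead, after which GFS2 journal recovery can proceed.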