Re: Halt nodes in cluster with cable disconnect

Digimer, I'm using your manual ;)

https://alteeve.com/w/Red_Hat_Cluster_Service_2_Tutorial

In a test environment I deactivated the DRBD daemon, but the problem persists with or without the DRBD daemon running.
I use the following fencing policy and handler in DRBD:

fencing resource-and-stonith;
outdate-peer "/sbin/obliterate-peer.sh";

Digimer, when you suggest adding 'sleep 10', does that go in drbd.conf?

On Tue, Jan 24, 2012 at 4:09 PM, Digimer <linux@xxxxxxxxxxx> wrote:
On 01/24/2012 03:57 PM, Miguel Angel Guerrero wrote:
> Hi, I'm trying to set up a CentOS cluster with two nodes with cman, drbd,
> gfs2 and I'm using ipmi for fencing. DRBD is set up between the nodes
> using a dedicated interface. So, when I unplug the drbd network cable,
> both nodes power off immediately (I tried using a crossover cable and both
> nodes connected to a switch, but both scenarios fail), and the logs
> don't seem to show anything useful. In a previous thread on this
> list, it was recommended to deactivate the ACPID daemon, even at BIOS level,
> but I'm still having trouble.
>
> If I simulate a physical disconnection with the ifdown command on one node,
> that node reboots with no hassle, but unplugging the cable kills both
> nodes. I think the first scenario is correct, but the second one is not
> what I expect.
>
> Thanks for your help. My cluster.conf follows:

This is likely caused by both nodes getting their fence calls off before
one of them dies.

How do you have DRBD configured? Specifically, what fence handler are
you using? If you're interested in testing, I have rewritten Lon's
obliterate-peer.sh and added explicit delays to help resolve this exact
issue.

https://github.com/digimer/rhcs_fence

Alternatively, add a 'sleep 10' or similar to one of your existing fence
handlers and you should find that the node with the delay consistently
loses while the other node remains up.
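
To make the 'sleep 10' suggestion concrete: the delay goes in the fence handler script that DRBD calls, not in drbd.conf itself. What follows is only a minimal sketch of one way to do it, assuming a hypothetical wrapper installed as /sbin/delayed-fence.sh on one node and named in that node's drbd.conf in place of the real handler. The names delayed_fence, FENCE_DELAY and REAL_HANDLER are illustrative and are not part of DRBD or obliterate-peer.sh.

```shell
#!/bin/sh
# Hypothetical wrapper around an existing DRBD fence handler.
# It waits a fixed number of seconds, then runs the real handler,
# so the node carrying this wrapper is slower to fence its peer.

# Assumed defaults; both can be overridden from the environment.
FENCE_DELAY="${FENCE_DELAY:-10}"
REAL_HANDLER="${REAL_HANDLER:-/sbin/obliterate-peer.sh}"

delayed_fence() {
    # Give the peer a head start in the fence race.
    sleep "$FENCE_DELAY"
    # Run the real handler with the same arguments. Its exit status
    # becomes the function's exit status, unchanged.
    "$REAL_HANDLER" "$@"
}

# Run only when invoked under the assumed wrapper file name, so the
# function can also be sourced and tried out without fencing anything.
if [ "${0##*/}" = "delayed-fence.sh" ]; then
    delayed_fence "$@"
fi
```

If this is installed on one node only, that node should be the one that loses when both nodes try to fence each other at the same moment, since the other node's fence call goes out first. A dry run with `FENCE_DELAY=0 REAL_HANDLER=echo` shows what would be passed to the handler without touching the peer.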

--
Digimer
E-Mail:              digimer@xxxxxxxxxxx
Papers and Projects: https://alteeve.com



--
Regards,
------------------------------------
Miguel Angel Guerrero
Registered GNU/Linux User #353531
------------------------------------
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
