Need help setting up pacemaker-cluster with drbd & mysql

Jäger, Marcus <marcus.jaeger@xxxxxxxxxxxx> · Thu, 7 Feb 2013 20:19:29 +0000

Hello there,

I'm new to setting up Pacemaker and need some help.

My config is similar to the following howto:

http://blog.non-a.net/2011/03/27/cluster_drbd

The only modification is use of mysql instead of apache2.

On the first try, everything worked fine. Both nodes came up and the ha1 node went into service.
If ha1 failed (reboot/shutdown), ha2 took over all services.
When ha1 came online again, it took all resources back, just as expected.
BUT: if ha2 went offline and came online again, the drbd resource didn't come online again, because it detected a split-brain, and corosync kept starting and stopping drbd on ha2. (DRBD had to resync the whole disk every time it failed.)
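
Whether DRBD really reported a split-brain can be checked in the kernel log of the affected node. A minimal sketch, assuming the usual wording of the DRBD kernel message ("Split-Brain detected"); the function name has_split_brain is made up for illustration:

```shell
# Hypothetical helper: reads log text on stdin and succeeds if DRBD
# reported a split-brain. The matched wording is an assumption based on
# the message DRBD normally writes to the kernel log.
has_split_brain() {
  grep -qi 'split-brain detected'
}

# Usage on a node (dmesg may need root):
#   dmesg | has_split_brain && echo "split-brain was reported"
```

The check only says that the message appeared at some point; it does not say which node holds the newer data.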

I may have fixed that by adding "disk { fencing resource-only; }" to /etc/drbd.d/global_common.conf. I don't know exactly, but if ha2 disconnects and reconnects, no full sync happens any more.
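
For reference, a fragment of this kind would look roughly like the sketch below. Only the fencing line comes from the change described above. The handlers section is an assumption, added because the DRBD documentation pairs resource-level fencing with these scripts; the script paths and the section placement can differ between distributions and DRBD versions. The snippet writes to a scratch file and does not touch /etc/drbd.d.

```shell
# Sketch only: write a candidate fragment to a scratch file for review.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
common {
  disk {
    fencing resource-only;
  }
  handlers {
    # Assumed paths; adjust to where the DRBD utilities install them.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
EOF
cat "$fragment"
```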

But in the end I'm confused.
Currently only the ha1 node can become Master; if ha1 fails, ha2 does not take over the resources and stays Slave.
I have now spent almost two days trying to figure out the problem. Google didn't help at all.

crm_mon --one-shot -V says:

Stack: openais
Current DC: mysql-drbd-ha2 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ mysql-drbd-ha1 mysql-drbd-ha2 ]

Resource Group: lvm
     datavg     (ocf::heartbeat:LVM):   Started mysql-drbd-ha1
     fs_mysql   (ocf::heartbeat:Filesystem):    Started mysql-drbd-ha1
Resource Group: mysql_grp
     app_ip     (ocf::heartbeat:IPaddr):        Started mysql-drbd-ha1
     app_mysql  (lsb:mysql):    Started mysql-drbd-ha1
Master/Slave Set: ms_drbd
     Masters: [ mysql-drbd-ha1 ]
     Slaves: [ mysql-drbd-ha2 ]

Failed actions:
    drbd:0_promote_0 (node=mysql-drbd-ha2, call=158, rc=1, status=complete): unknown error
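
Pacemaker keeps a failed action like the one above in the cluster status until it is cleared, and the recorded failure can influence where the resource may run afterwards. A hedged sketch of clearing it with the crm shell, assuming the resource name ms_drbd from the output above; the wrapper function is hypothetical:

```shell
# Hypothetical wrapper: clears recorded failures for ms_drbd so that
# Pacemaker re-evaluates the resource. Does nothing if crm is missing.
cleanup_drbd_failures() {
  if command -v crm >/dev/null 2>&1; then
    crm resource cleanup ms_drbd
  else
    echo "crm shell not found; nothing done"
  fi
}

# Usage on a cluster node:
#   cleanup_drbd_failures
```

Clearing the failure does not explain why the promote failed; the logs from the time of the failed promote are still needed for that.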

After "crm node standby" on the ha1 node it says:

“
Node mysql-drbd-ha1: standby
Online: [ mysql-drbd-ha2 ]

Resource Group: lvm
     datavg     (ocf::heartbeat:LVM):   Started mysql-drbd-ha2
     fs_mysql   (ocf::heartbeat:Filesystem):    Started mysql-drbd-ha2
Resource Group: mysql_grp
     app_ip     (ocf::heartbeat:IPaddr):        Started mysql-drbd-ha2
     app_mysql  (lsb:mysql):    Started mysql-drbd-ha2
Master/Slave Set: ms_drbd
     Masters: [ mysql-drbd-ha2 ]
     Stopped: [ drbd:1 ]
“

After "crm node online" on ha1 it still looks like this:

“
Online: [ mysql-drbd-ha1 mysql-drbd-ha2 ]

Resource Group: lvm
     datavg     (ocf::heartbeat:LVM):   Started mysql-drbd-ha2
     fs_mysql   (ocf::heartbeat:Filesystem):    Started mysql-drbd-ha2
Resource Group: mysql_grp
     app_ip     (ocf::heartbeat:IPaddr):        Started mysql-drbd-ha2
     app_mysql  (lsb:mysql):    Started mysql-drbd-ha2
Master/Slave Set: ms_drbd
     Masters: [ mysql-drbd-ha2 ]
     Stopped: [ drbd:1 ]

“

ha1 won't become master again unless I stop both nodes, clear the crm config and reload it. That is not acceptable for production use.

If you need configs and logs, please let me know.
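
The state that usually goes with such a request can be gathered in one pass. A minimal sketch, assuming the crm shell and crm_mon are installed on the node; the function name collect_cluster_state is hypothetical, and each step is skipped when the tool or file is missing:

```shell
# Hypothetical helper: prints DRBD state, the Pacemaker configuration
# and a one-shot cluster status, skipping whatever is not available.
collect_cluster_state() {
  echo "== DRBD state =="
  if [ -r /proc/drbd ]; then
    cat /proc/drbd
  else
    echo "(no /proc/drbd on this host)"
  fi

  echo "== Pacemaker configuration =="
  if command -v crm >/dev/null 2>&1; then
    crm configure show
  else
    echo "(crm shell not installed)"
  fi

  echo "== Cluster status =="
  if command -v crm_mon >/dev/null 2>&1; then
    crm_mon --one-shot
  else
    echo "(crm_mon not installed)"
  fi
  return 0
}

# Usage, run on both nodes and attach the files:
#   collect_cluster_state > "cluster-state-$(hostname).txt"
```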

Thanks in advance and greetings from Frankfurt/Main, Germany

Marcus

mcn tele.com AG Im Galluspark 17, 60326 Frankfurt

-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster