Re: cluster log files (was: Re: pacemaker "CPG API: failed Library error")


 



Hi Alessandro

Thanks for those. I can see that CMAN was told by some external program to send that KILL to node 2 from this line here:

  daemon: client command is 8000000c

It's not clear what might be asking cman to do this. A common culprit is qdisk, but I can't see any qdisk in your cluster.conf or log files, so it must be something else. I don't think pacemaker does anything like that.

I can see you have pacemaker-managed drbd on the system so that might be worth investigating - I can't see any actual drbd messages in those log files so they might be elsewhere.
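One way to line the kill up with whatever else was logged around it is to pull the relevant lines out of the rotated logs on both nodes. This is only a rough sketch: the function name find_kill_context is made up for illustration, and the drbd pattern is simply a guess at what its messages would contain.

```shell
# Print CMAN kill decisions, the client commands cman received, and any
# drbd messages from the given log files (plain or gzip-compressed).
# find_kill_context is a hypothetical helper name, not part of any package.
find_kill_context() {
    # zgrep handles both plain and .gz files; -i ignores case, -E enables
    # the alternation between the three patterns.
    zgrep -i -E 'Sending KILL to node|client command is|drbd' "$@"
}
```

It could be run as, for example, find_kill_context /var/log/cluster/corosync.log* /var/log/messages* (the second path is an assumption about where the drbd kernel messages end up on your nodes).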

Sorry I can't be more help, but this doesn't seem to be caused (at least not directly) by either corosync or cman :/

Chrissie


On 24/02/14 13:10, Alessandro Bono wrote:
Hi Christine

attached log file from two nodes (too big for ml)

[root@ga1-ext ~]# zgrep -i kill /var/log/cluster/corosync.log-20140223.gz
Feb 22 22:51:39 corosync [CMAN  ] memb: Sending KILL to node 2

no gfs, dlm or qdisk in this cluster
configuration information below

<cluster config_version="8" name="ga-ext_cluster">
<cman two_node="1" expected_votes="1"/>
<totem token="3000" consensus="5000" />
   <logging>
    <logging_daemon name="corosync" debug="on"/>
   </logging>
   <clusternodes>
     <clusternode name="ga1-ext" nodeid="1">
       <fence>
         <method name="pcmk-redirect">
           <device name="pcmk" port="ga1-ext"/>
         </method>
       </fence>
     </clusternode>
     <clusternode name="ga2-ext" nodeid="2">
       <fence>
         <method name="pcmk-redirect">
           <device name="pcmk" port="ga2-ext"/>
         </method>
       </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
     <fencedevice agent="fence_pcmk" name="pcmk"/>
   </fencedevices>
</cluster>

pacemaker configuration

  crm configure show
node ga1-ext \
     attributes standby="off"
node ga2-ext \
     attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr \
     params ip="10.12.23.3" cidr_netmask="24" \
     op monitor interval="30s"
primitive SharedFS ocf:heartbeat:Filesystem \
     params device="/dev/drbd/by-res/r0" directory="/shared" fstype="ext4" options="noatime,nobarrier"
primitive dovecot lsb:dovecot
primitive drbd0 ocf:linbit:drbd \
     params drbd_resource="r0" \
     op monitor interval="15s"
primitive drbdlinks ocf:tummy:drbdlinks
primitive mail ocf:heartbeat:MailTo \
     params email="root@xxxxxxxxxxxxxxxxxxxx" subject="ga-ext cluster - "
primitive mysql lsb:mysqld
group service_group SharedFS drbdlinks ClusterIP mail mysql dovecot \
     meta target-role="Started"
ms ms_drbd0 drbd0 \
     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation service_on_drbd inf: service_group ms_drbd0:Master
order service_after_drbd inf: ms_drbd0:promote service_group:start
property $id="cib-bootstrap-options" \
     dc-version="1.1.10-14.el6_5.2-368c726" \
     cluster-infrastructure="cman" \
     expected-quorum-votes="2" \
     stonith-enabled="false" \
     no-quorum-policy="ignore" \
     last-lrm-refresh="1392499995" \
     maintenance-mode="false"
rsc_defaults $id="rsc-options" \
     resource-stickiness="100"






