Re: pacemaker "CPG API: failed Library error"

On 12/02/14 14:15, Jan Friesse wrote:
Hi Alessandro,
I was looking at the log file, and it looks like it starts right after
the token was lost. Do you have the log from BEFORE that happened?

Hi Honza

I have the previous log file, but something did not work correctly:

[root@ga1-ext ~]# ls -alF  /var/log/cluster/corosync.log-2014021*
-rw-rw---- 1 hacluster haclient 28963 Feb 10 03:32 /var/log/cluster/corosync.log-20140210.gz
-rw-rw---- 1 hacluster haclient 169899 Feb 11 03:43 /var/log/cluster/corosync.log-20140211.gz
-rw-rw---- 1 hacluster haclient 35449 Feb 12 03:24 /var/log/cluster/corosync.log-20140212.gz

[root@ga1-ext ~]# zcat /var/log/cluster/corosync.log-20140211.gz |tail
Feb 10 21:47:02 [2248] ga1-ext cib: info: crm_client_new: Connecting 0x280a140 for uid=0 gid=0 pid=2169 id=308fdcf4-f47b-4afb-96f9-126601ff2573
Feb 10 21:47:02 [2248] ga1-ext cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_resource/2, version=0.309.18)
Feb 10 21:47:02 [2248] ga1-ext cib: info: cib_process_request: Forwarding cib_delete operation for section constraints to master (origin=local/crm_resource/3)
Feb 10 21:47:02 [2248] ga1-ext cib: info: cib_process_request: Completed cib_apply_diff operation for section constraints: OK (rc=0, origin=ga2-ext/crm_resource/3, version=0.310.1)
Feb 10 21:47:02 [2249] ga1-ext stonith-ng: info: update_cib_stonith_devices: Updating device list from the cib: new location constraint
Feb 10 21:47:02 [2249] ga1-ext stonith-ng: notice: unpack_config: On loss of CCM Quorum: Ignore
Feb 10 21:47:02 [2248] ga1-ext cib: info: crm_client_destroy: Destroying 0 events
Feb 10 21:47:02 [2248] ga1-ext cib: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-45.raw
Feb 10 21:47:02 [2248] ga1-ext cib: info: write_cib_contents: Wrote version 0.310.0 of the CIB to disk (digest: 7d257caa36077afded22a7b5b47e27e5)
Feb 10 21:47:02 [2248] ga1-ext cib: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.nAfCBw (digest: /var/lib/heartbeat/crm/cib.Twax6l)

[root@ga1-ext ~]# zcat /var/log/cluster/corosync.log-20140212.gz |head
Feb 11 23:26:01 corosync [TOTEM ] The token was lost in the OPERATIONAL state.
Feb 11 23:26:01 corosync [TOTEM ] A processor failed, forming new configuration.
Feb 11 23:26:01 corosync [TOTEM ] Receive multicast socket recv buffer size (249856 bytes).
Feb 11 23:26:01 corosync [TOTEM ] Transmit multicast socket send buffer size (249856 bytes).
Feb 11 23:26:01 corosync [TOTEM ] Local receive multicast loop socket recv buffer size (249856 bytes).
Feb 11 23:26:01 corosync [TOTEM ] Local transmit multicast loop socket send buffer size (249856 bytes).
Feb 11 23:26:01 corosync [TOTEM ] entering GATHER state from 2.
Feb 11 23:26:03 corosync [TOTEM ] entering GATHER state from 0.
Feb 11 23:26:03 corosync [TOTEM ] Creating commit token because I am the rep.
Feb 11 23:26:03 corosync [TOTEM ] Saving state aru 14a high seq received 14a

So strange: the rotated logs have a gap right before the token loss.
Tonight I can try another full backup and resend you the log files.


Anyway, try increasing the token timeout to a value like 10 seconds. It
looks like you have 2 nodes, and by default the token timeout is 1 second
there; 10 seconds is used for 3 or more nodes. I'm also unsure whether
this changed between 6.3 and 6.4.

Just use

<totem token="X" consensus="X + 2000" />

where X is something like 10000 (the value is in milliseconds).

OK, I'll try this config change.
In my corosync.conf, which dates from the CentOS 6.3 era, I have:
token: 3000
consensus: 5000
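
For readers following along, here is a minimal sketch of what the suggested change might look like in corosync.conf syntax. It assumes the cluster actually reads its totem settings from corosync.conf (on a CMAN-based setup they would come from cluster.conf instead, using the <totem .../> element quoted above), and it takes X = 10000 from the suggestion, so consensus becomes X + 2000 = 12000. Only the two keys discussed here are shown; everything else in the totem section stays as it is.

```
totem {
    # other existing totem settings stay unchanged
    # token timeout in milliseconds (X = 10000, up from 3000)
    token: 10000
    # consensus timeout in milliseconds (X + 2000, up from 5000)
    consensus: 12000
}
```

The comments are on their own lines on purpose, since it is safer not to assume the parser strips trailing comments after a value. corosync has to be restarted on both nodes for the new timeouts to take effect.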


Regards,
   Honza

Alessandro Bono wrote:
On 10/02/14 15:55, Jan Friesse wrote:
OK, but I would still really like to see a log from 6.5 (there was a huge
number of fixes in 6.5).
Hi Honza

I found time to force a full backup.
Note that I forced the full backup on another VM with lots of data; this
is enough to stress the host server and cause the error on the cluster.
Attached is the zipped log file from CentOS 6.5.

Thank you


--
Best regards

Alessandro Bono
