Re: What is the reason which the node in which failure has not occurred carries out "lost"?

On 11/02/14 11:40 PM, yusuke iida wrote:
Hi, all

Since I did not get a reply on the Pacemaker mailing list, let me ask
the question here as well.

I am measuring the performance of Pacemaker with the following combination:
Pacemaker-1.1.11.rc1
libqb-0.16.0
corosync-2.3.2

All nodes are KVM virtual machines.

After starting 14 nodes, I forcibly stopped node vm01.
"virsh destroy vm01" was used for the stop.
After that, in addition to the forcibly stopped node, other nodes were
also separated from the cluster.

Corosync then logs a large number of "Retransmit List:" messages.

Why are nodes on which no failure has occurred marked as "lost"?

Please advise if there is a problem with my setup.

I have attached the report from when the problem occurred.
https://drive.google.com/file/d/0BwMFJItoO-fVMkFWWWlQQldsSFU/edit?usp=sharing

Regards,
Yusuke

Was the lost node fenced (stonithed) successfully? Did you change the totem token timeout or the maximum number of lost tokens allowed? Was there anything interesting in the log file(s) of the remaining healthy node(s)?
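
For reference, the token settings asked about above live in the totem section of corosync.conf. The fragment below is only a sketch of where they go; the numbers are illustrative values, not a recommendation and not taken from the attached report.

```
totem {
    version: 2

    # Time in milliseconds to wait for the token before declaring it lost
    # (illustrative value only).
    token: 1000

    # Number of token retransmits attempted before a new configuration
    # is formed (illustrative value only).
    token_retransmits_before_loss_const: 4
}
```

If either of these was changed from what the distribution shipped, it would help to post the values along with the logs.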

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss



