On 5/31/2013 7:12 AM, renayama19661014@xxxxxxxxx wrote:
> Hi All,
>
> We discovered a problem with the network used for corosync communication.
>
> We built a three-node corosync cluster on KVM.
>
> Step 1) Start the corosync service on all nodes.
>
> Step 2) Confirm that the cluster forms with all nodes and reaches the OPERATIONAL state.
>
> Step 3) Cut off the network of node1 (rh64-coro1) and node2 (rh64-coro2) from the KVM host.
>
> [root@kvm-host ~]# brctl delif virbr3 vnet5; brctl delif virbr2 vnet1
>
> Step 4) Because a problem occurred, we stop all nodes.
>
>
> The problem occurs at step 3.
>
> One node (rh64-coro1) stays in the OPERATIONAL state after reaching it.
>
> The other two nodes (rh64-coro2 and rh64-coro3) keep cycling through membership states.
> They never seem to reach the OPERATIONAL state while the first node is running.
>
> This means that the two nodes (rh64-coro2 and rh64-coro3) cannot complete cluster membership.
> When this network failure happens, in a configuration where corosync is combined with Pacemaker, corosync cannot notify Pacemaker of the change in cluster membership.
>
>
> Question 1) Are there any parameters in corosync.conf that would solve this problem?
> * We bond the interfaces and think the problem could be avoided by setting "rrp_mode: none", but we do not want to set "rrp_mode: none".
>
> Question 2) Is this a bug? Or is it expected behaviour of corosync communication?

We already checked this specific test, and it appears to be a bug in the kernel bridge code when handling multicast traffic (groups are not joined correctly and traffic is not forwarded).

Check this thread as reference:

http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html

Thanks
Fabio
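
As a rough sketch (not a confirmed fix), you can inspect how the host bridge handles multicast; the bridge name virbr2 is taken from the reproduction above, so adjust it to your setup:

  # list the multicast groups currently joined on the bridge interface
  ip maddr show dev virbr2

  # check whether IGMP snooping is enabled on the bridge (1 = enabled)
  cat /sys/class/net/virbr2/bridge/multicast_snooping

  # experiment only: disable snooping so the bridge floods multicast to all ports
  echo 0 > /sys/class/net/virbr2/bridge/multicast_snooping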