Hi Honza,

Thank you for your comments.

> can you please tell me exact reproducer for physical hw? (because brctl
> delif is I believe not valid in hw at all).

The environment in which I reported the problem the second time is the following physical environment:

-------------------------
Enclosure : BladeSystem c7000 Enclosure
node1, node2, node3 : HP ProLiant BL460c G6 (CPU: Xeon E5540, Mem: 16G)
 --- Blade
 NIC : Flex-10 Embedded Ethernet x 1 (2 ports)
 NIC : NC325m Quad Port 1Gb NIC for c-Class BladeSystem (4 ports)
SW : GbE2c Ethernet Blade Switch x 6
-------------------------

In addition, I cut off the interfaces at the network switch.
* In the second report, I did not execute the brctl command.

Is more detailed HW information necessary? If you need anything further, I will send it.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2013/6/12, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:

> Hideo,
> can you please tell me exact reproducer for physical hw? (because brctl
> delif is I believe not valid in hw at all).
>
> Thanks,
>   Honza
>
> renayama19661014@xxxxxxxxx napsal(a):
> > Hi Fabio,
> >
> > Thank you for your comment.
> >
> >> I'll let Honza look at it, I don't have enough physical hardware to
> >> reproduce.
> >
> > All right.
> >
> > Many Thanks!
> > Hideo Yamauchi.
> >
> >
> > --- On Tue, 2013/6/11, Fabio M. Di Nitto <fdinitto@xxxxxxxxxx> wrote:
> >
> >> Hi Yamauchi-san,
> >>
> >> I'll let Honza look at it, I don't have enough physical hardware to
> >> reproduce.
> >>
> >> Fabio
> >>
> >> On 06/11/2013 01:15 AM, renayama19661014@xxxxxxxxx wrote:
> >>> Hi Fabio,
> >>>
> >>> Thank you for your comments.
> >>>
> >>> We confirmed this problem in the physical environment.
> >>> The corosync communication goes through eth1 and eth2.
> >>>
> >>> -------------------------------------------------------
> >>> [root@bl460g6a ~]# ip addr show
> >>> (snip)
> >>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> >>>     link/ether f4:ce:46:b3:fe:3c brd ff:ff:ff:ff:ff:ff
> >>>     inet 192.168.101.9/24 brd 192.168.101.255 scope global eth1
> >>>     inet6 fe80::f6ce:46ff:feb3:fe3c/64 scope link
> >>>        valid_lft forever preferred_lft forever
> >>> 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> >>>     link/ether 18:a9:05:78:6c:f0 brd ff:ff:ff:ff:ff:ff
> >>>     inet 192.168.102.9/24 brd 192.168.102.255 scope global eth2
> >>>     inet6 fe80::1aa9:5ff:fe78:6cf0/64 scope link
> >>>        valid_lft forever preferred_lft forever
> >>> (snip)
> >>> 8: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
> >>>     link/ether 52:54:00:7f:f3:0a brd ff:ff:ff:ff:ff:ff
> >>>     inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
> >>> 9: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500
> >>>     link/ether 52:54:00:7f:f3:0a brd ff:ff:ff:ff:ff:ff
> >>> -----------------------------------------------
> >>>
> >>> I do not think this is a problem specific to the virtual environment.
> >>>
> >>> Just to make sure, I attach the logs that I collected on the three blades (RHEL 6.4).
> >>> * I cut off the communication at a network switch.
> >>>
> >>> The phenomenon is similar: one node loops and reaches the OPERATIONAL state, while the other two nodes never change to the OPERATIONAL state.
> >>>
> >>> After all, is this the same problem as the bug you pointed to?
> >>>> Check this thread as reference:
> >>>> http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html
> >>>
> >>>
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>>
> >>>
> >>> --- On Fri, 2013/5/31, Fabio M. Di Nitto <fdinitto@xxxxxxxxxx> wrote:
> >>>
> >>>> On 5/31/2013 7:12 AM, renayama19661014@xxxxxxxxx wrote:
> >>>>> Hi All,
> >>>>>
> >>>>> We discovered a problem with the network communication of corosync.
> >>>>>
> >>>>> We built a corosync cluster of three nodes on KVM.
> >>>>>
> >>>>> Step 1) Start the corosync service on all nodes.
> >>>>>
> >>>>> Step 2) Confirm that the cluster is formed by all nodes and has reached the OPERATIONAL state.
> >>>>>
> >>>>> Step 3) Cut off the network of node1 (rh64-coro1) and node2 (rh64-coro2) from the KVM host.
> >>>>>
> >>>>> [root@kvm-host ~]# brctl delif virbr3 vnet5; brctl delif virbr2 vnet1
> >>>>>
> >>>>> Step 4) Because a problem occurred, we stopped all nodes.
> >>>>>
> >>>>>
> >>>>> The problem occurs at step 3.
> >>>>>
> >>>>> One node (rh64-coro1) continues operating after reaching the OPERATIONAL state.
> >>>>>
> >>>>> The other two nodes (rh64-coro2 and rh64-coro3) keep changing state.
> >>>>> They never seem to change to the OPERATIONAL state while the first node is operating.
> >>>>>
> >>>>> This means that the two nodes (rh64-coro2 and rh64-coro3) cannot complete cluster membership.
> >>>>> When this network failure happens in a setup where corosync is combined with Pacemaker, corosync cannot notify Pacemaker of the change in cluster membership.
> >>>>>
> >>>>>
> >>>>> Question 1) Is there any parameter in corosync.conf to solve this problem?
> >>>>> * We think it could be worked around by bonding the interfaces and setting "rrp_mode: none", but we do not want to set "rrp_mode: none".
> >>>>>
> >>>>> Question 2) Is this a bug? Or is it the specified behavior of corosync communication?
> >>>>
> >>>> We already checked this specific test, and it appears to be a bug in
> >>>> the kernel bridge code when handling multicast traffic (groups are not
> >>>> joined correctly and traffic is not forwarded).
> >>>>
> >>>> Check this thread as reference:
> >>>> http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html
> >>>>
> >>>> Thanks
> >>>> Fabio
> >>>>
> >>
> >
>

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
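
Regarding Question 1 in the quoted report, the rrp_mode discussion maps onto the totem interface stanzas in corosync.conf. Below is a minimal sketch of a redundant-ring configuration over the two networks visible in the ip addr output (192.168.101.0/24 and 192.168.102.0/24). The rrp_mode value and the multicast addresses/ports are illustrative assumptions, not taken from the reported configuration, and this is not claimed to resolve the membership problem described in the thread.

-------------------------
totem {
        version: 2
        # "passive" or "active" enables redundant ring; "none" uses a single ring
        rrp_mode: passive
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.101.0
                # illustrative multicast group and port
                mcastaddr: 239.255.1.1
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 192.168.102.0
                # illustrative multicast group and port
                mcastaddr: 239.255.2.1
                mcastport: 5405
        }
}
-------------------------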
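
As a side note on the kernel-bridge multicast issue mentioned by Fabio, here is a rough diagnostic sketch for the KVM host, assuming the bridge names virbr2/virbr3 from the brctl reproducer above. It only inspects whether IGMP snooping is enabled on the bridges and lists the learned forwarding entries; disabling snooping is a way to test whether snooping is involved, not a confirmed fix.

-------------------------
# List the bridges and their attached ports
[root@kvm-host ~]# brctl show

# Show the forwarding entries learned on each bridge
[root@kvm-host ~]# brctl showmacs virbr2
[root@kvm-host ~]# brctl showmacs virbr3

# Check whether IGMP/multicast snooping is enabled (1) or disabled (0)
[root@kvm-host ~]# cat /sys/class/net/virbr2/bridge/multicast_snooping
[root@kvm-host ~]# cat /sys/class/net/virbr3/bridge/multicast_snooping

# For testing only: disable snooping so multicast is flooded to all ports
[root@kvm-host ~]# echo 0 > /sys/class/net/virbr2/bridge/multicast_snooping
[root@kvm-host ~]# echo 0 > /sys/class/net/virbr3/bridge/multicast_snooping
-------------------------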