Hello,
I have a CentOS 7.3 (+ updates) server whose configuration arises from the need to connect via iSCSI to a Dell PS Series storage array, and from the fact that Dell does not support bonding for iSCSI. So I need to use 2 NICs on the same VLAN to connect to the iSCSI portal IP and then use multipath. The iSCSI LAN is on a dedicated VLAN.
I have only these 2 x 10Gbit adapters, and I also need to put other VLANs on them through bonding. I plan to use active-backup as the bonding mode.

My first tests were with only iSCSI in place, to verify that the connection is OK. I followed these Dell guidelines for both the multipath and sysctl configs of the network adapters:
http://en.community.dell.com/techcenter/extras/m/white_papers/20442422

The iSCSI VLAN is 100, so the configured devices are p1p1.100 and p1p2.100, and my sysctl config is (the '/' in place of the '.' of the VLAN interface name is the sysctl syntax for interface names that contain a dot):

net.ipv4.conf.p1p1/100.arp_announce=2
net.ipv4.conf.p1p2/100.arp_announce=2
net.ipv4.conf.p1p1/100.arp_ignore=1
net.ipv4.conf.p1p2/100.arp_ignore=1
#
net.ipv4.conf.p1p1/100.rp_filter=2
net.ipv4.conf.p1p2/100.rp_filter=2

So far so good, and my LUN is seen this way through multipath:

[root@ov300 etc]# multipath -l
364817197b5dfd0e5538d959702249b1c dm-3 EQLOGIC ,100E-00
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  |- 7:0:0:0 sde 8:64 active undef running
  `- 8:0:0:0 sdf 8:80 active undef running

I tried a test workload and both paths are used in a balanced way.

Now I temporarily stop the iSCSI layer, ifdown the p1p1.100 and p1p2.100 devices, and put bonding in place on the plain p1p1 and p1p2 interfaces, with VLANs on top of the bond and bridges on top of the VLANs (they are for the VMs):

[root@ov300 ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: p1p1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: p1p1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:36:9f:2e:4d:80
Slave queue ID: 0

Slave Interface: p1p2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: a0:36:9f:2e:4d:82
Slave queue ID: 0
[root@ov300 ~]#

- ifcfg-bond1.65

DEVICE=bond1.65
VLAN=yes
BRIDGE=vlan65
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

- ifcfg-vlan65

DEVICE=vlan65
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

[root@ov300 network-scripts]# brctl show vlan65
bridge name     bridge id               STP enabled     interfaces
vlan65          8000.a0369f2e4d80       no              bond1.65
[root@ov300 network-scripts]#

The same applies to VLAN 162, while for VLAN 187 I only have bond1.187, without a bridge on top of it.
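(Not pasted above are the bond and slave definitions themselves; a minimal sketch of the standard CentOS 7 network-scripts form, with mode and miimon taken from the /proc/net/bonding output above and the rest only assumed, would be:)

- ifcfg-bond1

DEVICE=bond1
TYPE=Bond
BONDING_MASTER=yes
# active-backup with 100 ms MII polling, as reported by /proc/net/bonding/bond1
BONDING_OPTS="mode=active-backup miimon=100"
BOOTPROTO=none
ONBOOT=yes
MTU=1500
NM_CONTROLLED=no
IPV6INIT=no

- ifcfg-p1p1 (and the same for p1p2)

DEVICE=p1p1
TYPE=Ethernet
BOOTPROTO=none
MASTER=bond1
SLAVE=yes
ONBOOT=yes
NM_CONTROLLED=no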
At the end I try to restart the iSCSI components, but now the iSCSI connection seems to be interrupted and unstable. During this phase the command

[root@ov300 network-scripts]# iscsiadm -m session -P1

shows this in its output for the 2 interfaces:

Iface Netdev: p1p2.100
SID: 5
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: FAILED
Internal iscsid Session State: REOPEN
...
SID: 6
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: FREE
Internal iscsid Session State: REOPEN

and then:

Iface Netdev: p1p2.100
SID: 5
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: FREE
Internal iscsid Session State: REOPEN
...
Iface Netdev: p1p1.100
SID: 6
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

and from the multipath point of view a sequence of these:

[root@ov300 network-scripts]# multipath -l
364817197b5dfd0e5538d959702249b1c dm-2 EQLOGIC ,100E-00
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  |- 14:0:0:0 sde 8:64 active undef running
  `- 13:0:0:0 sdf 8:80 failed faulty running
[root@ov300 network-scripts]# multipath -l
364817197b5dfd0e5538d959702249b1c dm-2 EQLOGIC ,100E-00
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  |- 14:0:0:0 sde 8:64 active undef running
  `- 13:0:0:0 sdf 8:80 failed faulty running
[root@ov300 network-scripts]# multipath -l
364817197b5dfd0e5538d959702249b1c dm-2 EQLOGIC ,100E-00
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  |- 14:0:0:0 sde 8:64 active undef running
  `- 13:0:0:0 sdf 8:80 failed undef running

Should my config work, or do you see any intrinsic physical/network problems in it?
From the first tests it seems the more problematic adapter for the iSCSI connection is the one that was the active slave in the active-backup bond at the time... but I have only done preliminary tests.
Tomorrow I'm going to run further tests and dig deeper, but any comment/suggestion would be appreciated in advance.

Thanks,
Gianluca
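P.S.: for tomorrow's tests, something along these lines should let me watch both layers at once while the bond is active (10.10.100.10 below is only a placeholder for the real group/portal IP):

# confirm that the per-VLAN-interface sysctl values really applied
grep . /proc/sys/net/ipv4/conf/p1p{1,2}.100/{rp_filter,arp_ignore,arp_announce}

# check that the portal is reachable from each iSCSI interface separately
ping -c 3 -I p1p1.100 10.10.100.10
ping -c 3 -I p1p2.100 10.10.100.10

# follow iSCSI session state, multipath paths and the bond's active slave together
watch -n 5 'iscsiadm -m session -P 1 | grep -E "Iface Netdev|Connection State|Session State"; multipath -l; grep "Currently Active Slave" /proc/net/bonding/bond1'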