Re: IPoIB child interfaces not working with mlx5

On Mon, Mar 22, 2021 at 7:56 AM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Mon, Mar 22, 2021 at 07:08:01AM +0100, Jinpu Wang wrote:
> > On Sun, Mar 21, 2021 at 2:07 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 02:09:50PM +0100, Jack Wang wrote:
> > > > Leon Romanovsky <leon@xxxxxxxxxx> wrote on Sat, Mar 20, 2021 at 12:17:
> > > >
> > > > > On Fri, Mar 19, 2021 at 08:44:29AM +0100, Jinpu Wang wrote:
> > > > > > Hi Jason and Leon,
> > > > > >
> > > > > > We recently switched from MLNX-OFED to upstream OFED, and we noticed
> > > > > > that IPoIB stopped working on upstream kernel 5.4.102 with a Mellanox
> > > > > > CX-5 HCA; it works fine on CX-2/CX-3. I also tested on a 5.11 kernel,
> > > > > > and it behaves the same.
> > > > >
> > > > > Are you using "enhanced IPoIB" for CX-5 devices? MLX5_CORE_IPOIB?
> > > > >
> > > > > Thanks
> > > >
> > > >  Yes.
> > >
> > > > Is this expected behavior?
> > >
> > > Yes, we wanted to make IPoIB behave like any other netdev interface:
> > > if the parent interface isn't enabled, no traffic should pass. Moreover,
> > > in our internal implementation of enhanced IPoIB, we reuse the same
> > > resources for both parent and child, which requires us to wait for the "UP"
> > > event before allowing traffic.
> > >
> > > Thanks
> > Hi Leon,
> >
> > Thanks for the clarification, is this behavior documented somewhere?
> > is it specific to "enhanced IPoIB" for CX-5?
>
> It is specific to "enhanced IPoIB" and not to device. I don't know where
> we can document it.
>
> > Will it work differently if without MLX5_CORE_IPOIB enabled?
>
> Yes, without MLX5_CORE_IPOIB, the devices will work in "legacy IPoIB",
> exactly as cx-3. The best thing will be to change IPoIB ULP to behave
> like netdev, but we were not comfortable to do it back then due to
> user visible nature of such change.
>
Hi Leon,

More testing reveals a new problem with MLX5_CORE_IPOIB.
With MLX5_CORE_IPOIB enabled, ping works on both hosts, but iperf3 doesn't send any data.
I'm running "iperf3 -s" on A
and "sudo iperf3 -t 30000 -c ip6_of_A" on B.
Example output:

[  5] local 2a02:247f:401:1:2:0:a:391 port 41288 connected to 2a02:247f:401:1:2:0:a:392 port 5201

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd

[  5]   0.00-1.00   sec   165 KBytes  1.35 Mbits/sec    2   3.93 KBytes

[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   3.93 KBytes


[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   3.93 KBytes

When I disable MLX5_CORE_IPOIB and run the same test, iperf3 runs
without problems:

[  5] local 2a02:247f:401:1:2:0:a:391 port 51866 connected to 2a02:247f:401:1:2:0:a:392 port 5201

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd

[  5]   0.00-1.00   sec   293 MBytes  2.46 Gbits/sec    0   1.50 MBytes

[  5]   1.00-2.00   sec   290 MBytes  2.43 Gbits/sec    0   1.50 MBytes

[  5]   2.00-3.00   sec   289 MBytes  2.42 Gbits/sec    0   1.50 MBytes

[  5]   3.00-4.00   sec   290 MBytes  2.43 Gbits/sec    0   1.50 MBytes

On both sides we have:
jwang@xxxxxxxxxxxxxx:/mnt/jwang$ ibstat
CA 'mlx5_0'
    CA type: MT4119
    Number of ports: 1
    Firmware version: 16.27.2008
    Hardware version: 0
    Node GUID: 0x98039b03006c7912
    System image GUID: 0x98039b03006c7912
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 14
        LMC: 0
        SM lid: 19
        Capability mask: 0x2651e848
        Port GUID: 0x98039b03006c7912
        Link layer: InfiniBand
CA 'mlx5_1'
    CA type: MT4119
    Number of ports: 1
    Firmware version: 16.27.2008
    Hardware version: 0
    Node GUID: 0x98039b03006c7913
    System image GUID: 0x98039b03006c7912
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 15
        LMC: 0
        SM lid: 45
        Capability mask: 0x2651e848
        Port GUID: 0x98039b03006c7913
        Link layer: InfiniBand

The initial tests were done on 5.4.102.
I also did a brief test on ~linux-5.12-rc4 with MLX5_CORE_IPOIB enabled;
iperf3 fails there in the same way as on 5.4.102.
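
For anyone trying to reproduce this, here is a minimal sketch of how one
might check which IPoIB mode a kernel was built with and whether the parent
interface is up before testing a child. The helper names ipoib_mode and
iface_is_up are made up for illustration, and the /boot/config-$(uname -r)
path in the usage comment is an assumption that holds on Debian-style
systems but not everywhere.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.

# Print "enhanced" if the given kernel config file has
# CONFIG_MLX5_CORE_IPOIB=y, otherwise print "legacy".
ipoib_mode() {
    cfg="$1"
    if grep -q '^CONFIG_MLX5_CORE_IPOIB=y' "$cfg" 2>/dev/null; then
        echo enhanced
    else
        echo legacy
    fi
}

# Read one line of `ip -o link show <iface>` output on stdin and succeed
# only if the UP flag is present, e.g. <BROADCAST,MULTICAST,UP,LOWER_UP>.
# LOWER_UP alone does not count, since it is preceded by "_".
iface_is_up() {
    grep -q '[<,]UP[,>]'
}

# Example usage (assumed paths and interface names):
#   ipoib_mode "/boot/config-$(uname -r)"
#   ip -o link show ib0 | iface_is_up || echo "parent ib0 is down"
```

If the first helper prints "enhanced" and the parent is reported down, that
would match the behavior Leon described for child interfaces; it does not
explain the iperf3 stall above, where the parent is brought up via pre-up.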

cat /etc/network/interfaces.d/infiniband
auto ib0.beef
iface ib0.beef inet static
    address 10.42.3.145
    netmask 20
    up sysctl -w net.ipv4.conf.ib0/beef.forwarding=1
    up ethtool -K $IFACE gro off
    pre-up ip link set ib0 up
    dad-attempts 600

auto ib0.dddd
iface ib0.dddd inet6 static
    address 2a02:247f:401:1:2:0:a:391
    netmask 64
    pre-up ip link set ib0 up
    up sysctl -w net.ipv6.conf.ib0/dddd.forwarding=1 net.ipv6.conf.ib0/dddd.proxy_ndp=1
    up ip -6 route add fd57:1:0:4::/64 dev $IFACE
    up ethtool -K $IFACE gro off
    dad-attempts 600

auto ib1.beef
iface ib1.beef inet static
    address 10.43.3.145
    netmask 20
    up sysctl -w net.ipv4.conf.ib1/beef.forwarding=1
    up ethtool -K $IFACE gro off
    pre-up ip link set ib1 up
    dad-attempts 600

auto ib1.dddd
iface ib1.dddd inet6 static
    address 2a02:247f:402:1:2:0:a:391
    netmask 64
    pre-up ip link set ib1 up
    up sysctl -w net.ipv6.conf.ib1/dddd.forwarding=1 net.ipv6.conf.ib1/dddd.proxy_ndp=1
    up ip -6 route add fd57:2:0:4::/64 dev $IFACE
    up ethtool -K $IFACE gro off
    dad-attempts 600

Thanks!



