On 2024/5/6 20:27, Zhu Yanjun wrote:
On 06.05.24 13:33, shaozhengchao wrote:
Hi Yanjun:
Thank you for your reply. Are there any other restrictions on using
RoCE on the CX5?
https://docs.nvidia.com/networking/display/mlnxofedv571020
The above link can answer all your questions ^_^
Enjoy it.
Zhu Yanjun
Thank you.
Zhengchao Shao
On 2024/5/6 18:58, Zhu Yanjun wrote:
On 06.05.24 12:45, shaozhengchao wrote:
Hi Yanjun:
The following is the output of the cat /proc/net/bonding/bond0
command:
If I remember correctly, this looks like an RDMA LAG and bonding
problem. I am not sure whether it is a known problem or not. Please
contact your local support.
Zhu Yanjun
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.10.0+
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: f4:1d:6b:6f:3b:97
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 1
Actor Key: 23
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: enp145s0f0
MII Status: up
Speed: 40000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: f4:1d:6b:6f:3b:97
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 2
details actor lacp pdu:
system priority: 65535
system mac address: f4:1d:6b:6f:3b:97
port key: 23
port priority: 255
port number: 1
port state: 69
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1
Slave Interface: enp145s0f1
MII Status: up
Speed: 40000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: f4:1d:6b:6f:3b:98
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
system priority: 65535
system mac address: f4:1d:6b:6f:3b:97
port key: 23
port priority: 255
port number: 2
port state: 77
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1
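In case it helps, a couple of extra checks on the RDMA side of the
LAG (assuming the rdma tool from iproute2 is available; the device
and netdev names are the ones from this setup):
----------------------------------
# show RDMA devices and the netdevs they are bound to
rdma link show
# kernel messages from mlx5 about LAG activation, if any
dmesg | grep -i mlx5 | grep -i lag
----------------------------------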
Thank you
Zhengchao Shao
On 2024/5/6 16:26, Zhu Yanjun wrote:
On 06.05.24 06:46, shaozhengchao wrote:
When using the 5.10 kernel, I can find two IB devices using the
ibv_devinfo command.
----------------------------------
[root@localhost ~]# lspci
91:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
91:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
----------------------------------
[root@localhost ~]# ibv_devinfo
hca_id: mlx5_0
transport: InfiniBand (0)
fw_ver: 16.31.1014
node_guid: f41d:6b03:006f:4743
sys_image_guid: f41d:6b03:006f:4743
vendor_id: 0x02c9
vendor_part_id: 4119
hw_ver: 0x0
board_id: HUA0000000004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
hca_id: mlx5_1
transport: InfiniBand (0)
fw_ver: 16.31.1014
node_guid: f41d:6b03:006f:4744
sys_image_guid: f41d:6b03:006f:4743
vendor_id: 0x02c9
vendor_part_id: 4119
hw_ver: 0x0
board_id: HUA0000000004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
----------------------------------
But after the two network ports are bonded, only one IB device is
available, and only PF0 can be used.
[root@localhost shaozhengchao]# ibv_devinfo
hca_id: mlx5_bond_0
transport: InfiniBand (0)
fw_ver: 16.31.1014
node_guid: f41d:6b03:006f:4743
sys_image_guid: f41d:6b03:006f:4743
vendor_id: 0x02c9
vendor_part_id: 4119
hw_ver: 0x0
board_id: HUA0000000004
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
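A shorter way to compare the device list before and after bonding
(ibv_devices ships with rdma-core; mlx5_bond_0 is the name taken
from the output above):
----------------------------------
# list verbs device names only
ibv_devices
# the same list as seen through sysfs
ls /sys/class/infiniband/
# dump a single device instead of all of them
ibv_devinfo -d mlx5_bond_0
----------------------------------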
The current Linux mainline driver behaves the same way.
I found the comment ("If bonded, we do not add an IB device for
PF1.")
in the mlx5_lag_intf_add function of the 5.10 branch driver code.
I am not sure whether RDMA LAG is enabled here or not.
/proc/net/bonding normally provides more details.
Zhu Yanjun
Does this indicate that when the same NIC is used, only PF0 supports
bonding?
Are there any other constraints when enabling bonding with CX5?
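In case it is useful, one way to check which PCI function backs the
bonded device (the mlx5_bond_0 name and paths assume the setup shown
above; the device symlink points at the owning PCI function):
----------------------------------
# resolve the IB device back to its PCI address
readlink -f /sys/class/infiniband/mlx5_bond_0/device
# or read it from the uevent file
grep PCI_SLOT_NAME /sys/class/infiniband/mlx5_bond_0/device/uevent
----------------------------------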
Thank you
Zhengchao Shao