On 5/18/2021 8:44 PM, WANG Chao wrote:
On 05/18/21 at 08:30P, Mark Zhang wrote:
On 5/18/2021 5:25 PM, WANG Chao wrote:
Hi All
I'm running tests from https://github.com/linux-rdma/rdma-core/tree/master and
got the following errors from all tests.test_mlx5_dc.DCTest tests:
build/bin/run_tests.py --dev mlx5_2 --port 1 tests.test_mlx5_dc.DCTest.test_dc_rdma_write
E
======================================================================
ERROR: test_dc_rdma_write (tests.test_mlx5_dc.DCTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/rdma-core.master/tests/test_mlx5_dc.py", line 62, in test_dc_rdma_write
    send_ops_flags=e.IBV_QP_EX_WITH_RDMA_WRITE)
  File "/data/rdma-core.master/tests/test_mlx5_dc.py", line 53, in create_players
    self.client.pre_run(self.server.psns, self.server.qps_num)
  File "/data/rdma-core.master/tests/mlx5_base.py", line 36, in pre_run
    self.to_rts()
  File "/data/rdma-core.master/tests/mlx5_base.py", line 31, in to_rts
    self.dct_qp.to_rtr(attr)
  File "qp.pyx", line 1113, in pyverbs.qp.QP.to_rtr
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to modify QP state to RTR. Errno: 22, Invalid argument
----------------------------------------------------------------------
Ran 1 test in 0.051s
FAILED (errors=1)
===
Additional information:
- VF LAG is enabled and the VF is bound to the host.
- DC tests fail when the NIC is in switchdev mode; legacy mode works fine.
- Tested with both the 5.12 inbox driver and OFED 5.3; neither works.
- 5f:00.0 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
- firmware-version: 22.30.1004 (MT_0000000536)
I did a bit of tracing with tracepoints on the 5.12 inbox driver. It looks like
the CREATE_DCT firmware command is failing.
I can provide more information if you ask.
Thanks
WANG Chao
Is there any syndrome in the kernel log? Try to reproduce with debug logging
enabled:
echo -n "func mlx5_cmd_check +p" > /sys/kernel/debug/dynamic_debug/control
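For anyone scripting the reproduction, the echo above can be wrapped in a small Python helper. This is only a sketch: the function name set_dynamic_debug and its control_path parameter are made up for illustration, it assumes debugfs is mounted at /sys/kernel/debug, and it needs root to write the real control file.

```python
# Hypothetical helper, not part of rdma-core or the kernel tree.
DEFAULT_CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def set_dynamic_debug(func, enable=True, control_path=DEFAULT_CONTROL):
    """Write a 'func <name> +p' (or '-p') query to the dynamic debug control file.

    Returns the query string that was written, so callers can log it.
    """
    flag = "+p" if enable else "-p"
    query = f"func {func} {flag}"
    # Like `echo -n`, write the query without a trailing newline.
    with open(control_path, "w") as control:
        control.write(query)
    return query
```

Calling set_dynamic_debug("mlx5_cmd_check") before the test run and set_dynamic_debug("mlx5_cmd_check", enable=False) afterwards keeps the extra logging limited to the reproduction.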
[26538.391991] mlx5_core 0000:5f:00.2: mlx5_cmd_check:820:(pid 27332): CREATE_DCT(0x710) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xa22b82)
This syndrome indicates DCT is not supported in VF LAG mode here.
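
When collecting such failures from several hosts, it can help to pull the command name, status, and syndrome out of the log line automatically. The following is a minimal sketch; the regular expression is modeled only on the single mlx5_cmd_check line quoted above, so other kernel versions may need a different pattern, and the name parse_cmd_failure is hypothetical. It does not decode the syndrome itself, since that mapping comes from the firmware.

```python
import re

# Pattern derived from the one log line in this thread; treat it as an
# assumption rather than a stable kernel log format.
CMD_FAIL_RE = re.compile(
    r"mlx5_core (?P<pci>[0-9a-fA-F:.]+): mlx5_cmd_check:\d+:\(pid (?P<pid>\d+)\): "
    r"(?P<cmd>\w+)\((?P<opcode>0x[0-9a-fA-F]+)\) "
    r"op_mod\((?P<op_mod>0x[0-9a-fA-F]+)\) failed, "
    r"status (?P<status>.+?)\((?P<status_code>0x[0-9a-fA-F]+)\), "
    r"syndrome \((?P<syndrome>0x[0-9a-fA-F]+)\)"
)


def parse_cmd_failure(line):
    """Return a dict describing a failed mlx5 command, or None if no match."""
    match = CMD_FAIL_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["status"] = info["status"].strip()
    # Convert the hexadecimal fields to integers for easier comparison.
    for key in ("opcode", "op_mod", "status_code", "syndrome"):
        info[key] = int(info[key], 16)
    return info


if __name__ == "__main__":
    sample = (
        "[26538.391991] mlx5_core 0000:5f:00.2: mlx5_cmd_check:820:(pid 27332): "
        "CREATE_DCT(0x710) op_mod(0x0) failed, status bad parameter(0x3), "
        "syndrome (0xa22b82)"
    )
    info = parse_cmd_failure(sample)
    print(info["cmd"], hex(info["syndrome"]))
```

Feeding it dmesg output line by line and skipping the None results gives a quick list of which firmware commands failed and with which syndrome.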