On 12/19/19 1:07 PM, Jason Gunthorpe wrote:
...
3. It would be nice if I could reproduce this. I have a two-node mlx5 Infiniband
test setup, but I have done only the tiniest bit of user space IB coding, so
if you have any test programs that aren't too hard to deal with that could
possibly hit this, or be tweaked to hit it, I'd be grateful. Keeping in mind
that I'm not an advanced IB programmer. At all. :)
Clone this:
https://github.com/linux-rdma/rdma-core.git
Install all the required deps to build it (notably cython), see the README.md
$ ./build.sh
$ build/bin/run_tests.py
If you get things that far I think Leon can get a reproduction for you
Cool, it's up and running (1 failure, 3 skipped, out of 67 tests).
This is a great test suite to have running, I'll add it to my scripts. Here's the
full output in case the failure or skip cases are a problem:
$ sudo ./build/bin/run_tests.py --verbose
test_create_ah (tests.test_addr.AHTest) ... ok
test_create_ah_roce (tests.test_addr.AHTest) ... skipped "Can't run RoCE tests on IB link layer"
test_destroy_ah (tests.test_addr.AHTest) ... ok
test_create_comp_channel (tests.test_cq.CCTest) ... ok
test_destroy_comp_channel (tests.test_cq.CCTest) ... ok
test_create_cq_ex (tests.test_cq.CQEXTest) ... ok
test_create_cq_ex_bad_flow (tests.test_cq.CQEXTest) ... ok
test_destroy_cq_ex (tests.test_cq.CQEXTest) ... ok
test_create_cq (tests.test_cq.CQTest) ... ok
test_create_cq_bad_flow (tests.test_cq.CQTest) ... ok
test_destroy_cq (tests.test_cq.CQTest) ... ok
test_rc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_ud_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_xrc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_create_dm (tests.test_device.DMTest) ... ok
test_create_dm_bad_flow (tests.test_device.DMTest) ... ok
test_destroy_dm (tests.test_device.DMTest) ... ok
test_destroy_dm_bad_flow (tests.test_device.DMTest) ... ok
test_dm_read (tests.test_device.DMTest) ... ok
test_dm_write (tests.test_device.DMTest) ... ok
test_dm_write_bad_flow (tests.test_device.DMTest) ... ok
test_dev_list (tests.test_device.DeviceTest) ... ok
test_open_dev (tests.test_device.DeviceTest) ... ok
test_query_device (tests.test_device.DeviceTest) ... ok
test_query_device_ex (tests.test_device.DeviceTest) ... ok
test_query_gid (tests.test_device.DeviceTest) ... ok
test_query_port (tests.test_device.DeviceTest) ... FAIL
test_query_port_bad_flow (tests.test_device.DeviceTest) ... ok
test_create_dm_mr (tests.test_mr.DMMRTest) ... ok
test_destroy_dm_mr (tests.test_mr.DMMRTest) ... ok
test_buffer (tests.test_mr.MRTest) ... ok
test_dereg_mr (tests.test_mr.MRTest) ... ok
test_dereg_mr_twice (tests.test_mr.MRTest) ... ok
test_lkey (tests.test_mr.MRTest) ... ok
test_read (tests.test_mr.MRTest) ... ok
test_reg_mr (tests.test_mr.MRTest) ... ok
test_reg_mr_bad_flags (tests.test_mr.MRTest) ... ok
test_reg_mr_bad_flow (tests.test_mr.MRTest) ... ok
test_rkey (tests.test_mr.MRTest) ... ok
test_write (tests.test_mr.MRTest) ... ok
test_dereg_mw_type1 (tests.test_mr.MWTest) ... ok
test_dereg_mw_type2 (tests.test_mr.MWTest) ... ok
test_reg_mw_type1 (tests.test_mr.MWTest) ... ok
test_reg_mw_type2 (tests.test_mr.MWTest) ... ok
test_reg_mw_wrong_type (tests.test_mr.MWTest) ... ok
test_odp_rc_traffic (tests.test_odp.OdpTestCase) ... ok
test_odp_ud_traffic (tests.test_odp.OdpTestCase) ... skipped 'ODP is not supported - ODP recv not supported'
test_odp_xrc_traffic (tests.test_odp.OdpTestCase) ... ok
test_default_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_mem_align_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_without_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_alloc_pd (tests.test_pd.PDTest) ... ok
test_create_pd_none_ctx (tests.test_pd.PDTest) ... ok
test_dealloc_pd (tests.test_pd.PDTest) ... ok
test_destroy_pd_twice (tests.test_pd.PDTest) ... ok
test_multiple_pd_creation (tests.test_pd.PDTest) ... ok
test_create_qp_ex_no_attr (tests.test_qp.QPTest) ... ok
test_create_qp_ex_no_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_ex_with_attr (tests.test_qp.QPTest) ... ok
test_create_qp_ex_with_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_no_attr (tests.test_qp.QPTest) ... ok
test_create_qp_no_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_with_attr (tests.test_qp.QPTest) ... ok
test_create_qp_with_attr_connected (tests.test_qp.QPTest) ... ok
test_modify_qp (tests.test_qp.QPTest) ... ok
test_query_qp (tests.test_qp.QPTest) ... ok
test_rdmacm_sync_traffic (tests.test_rdmacm.CMTestCase) ... skipped 'No devices with net interface'
======================================================================
FAIL: test_query_port (tests.test_device.DeviceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/kernel_work/rdma-core/tests/test_device.py", line 129, in test_query_port
self.verify_port_attr(port_attr)
File "/kernel_work/rdma-core/tests/test_device.py", line 113, in verify_port_attr
assert 'Invalid' not in d.speed_to_str(attr.active_speed)
AssertionError
----------------------------------------------------------------------
Ran 67 tests in 10.058s
FAILED (failures=1, skipped=3)
thanks,
--
John Hubbard
NVIDIA