On Thu, Dec 19, 2019 at 02:58:43PM -0800, John Hubbard wrote:
> On 12/19/19 1:07 PM, Jason Gunthorpe wrote:
> ...
> > > 3. It would be nice if I could reproduce this. I have a two-node mlx5 Infiniband
> > > test setup, but I have done only the tiniest bit of user space IB coding, so
> > > if you have any test programs that aren't too hard to deal with that could
> > > possibly hit this, or be tweaked to hit it, I'd be grateful. Keeping in mind
> > > that I'm not an advanced IB programmer. At all. :)
> >
> > Clone this:
> >
> > https://github.com/linux-rdma/rdma-core.git
> >
> > Install all the required deps to build it (notably cython), see the README.md
> >
> > $ ./build.sh
> > $ build/bin/run_tests.py
> >
> > If you get things that far I think Leon can get a reproduction for you
> >
>
> Cool, it's up and running (1 failure, 3 skipped, out of 67 tests).
>
> This is a great test suite to have running, I'll add it to my scripts. Here's the
> full output in case the failure or skip cases are a problem:
>
> $ sudo ./build/bin/run_tests.py --verbose
>
> test_create_ah (tests.test_addr.AHTest) ... ok
> test_create_ah_roce (tests.test_addr.AHTest) ... skipped "Can't run RoCE tests on IB link layer"
> test_destroy_ah (tests.test_addr.AHTest) ... ok
> test_create_comp_channel (tests.test_cq.CCTest) ... ok
> test_destroy_comp_channel (tests.test_cq.CCTest) ... ok
> test_create_cq_ex (tests.test_cq.CQEXTest) ... ok
> test_create_cq_ex_bad_flow (tests.test_cq.CQEXTest) ... ok
> test_destroy_cq_ex (tests.test_cq.CQEXTest) ... ok
> test_create_cq (tests.test_cq.CQTest) ... ok
> test_create_cq_bad_flow (tests.test_cq.CQTest) ... ok
> test_destroy_cq (tests.test_cq.CQTest) ... ok
> test_rc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
> test_ud_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
> test_xrc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
> test_create_dm (tests.test_device.DMTest) ... ok
> test_create_dm_bad_flow (tests.test_device.DMTest) ... ok
> test_destroy_dm (tests.test_device.DMTest) ... ok
> test_destroy_dm_bad_flow (tests.test_device.DMTest) ... ok
> test_dm_read (tests.test_device.DMTest) ... ok
> test_dm_write (tests.test_device.DMTest) ... ok
> test_dm_write_bad_flow (tests.test_device.DMTest) ... ok
> test_dev_list (tests.test_device.DeviceTest) ... ok
> test_open_dev (tests.test_device.DeviceTest) ... ok
> test_query_device (tests.test_device.DeviceTest) ... ok
> test_query_device_ex (tests.test_device.DeviceTest) ... ok
> test_query_gid (tests.test_device.DeviceTest) ... ok
> test_query_port (tests.test_device.DeviceTest) ... FAIL
> test_query_port_bad_flow (tests.test_device.DeviceTest) ... ok
> test_create_dm_mr (tests.test_mr.DMMRTest) ... ok
> test_destroy_dm_mr (tests.test_mr.DMMRTest) ... ok
> test_buffer (tests.test_mr.MRTest) ... ok
> test_dereg_mr (tests.test_mr.MRTest) ... ok
> test_dereg_mr_twice (tests.test_mr.MRTest) ... ok
> test_lkey (tests.test_mr.MRTest) ... ok
> test_read (tests.test_mr.MRTest) ... ok
> test_reg_mr (tests.test_mr.MRTest) ... ok
> test_reg_mr_bad_flags (tests.test_mr.MRTest) ... ok
> test_reg_mr_bad_flow (tests.test_mr.MRTest) ... ok
> test_rkey (tests.test_mr.MRTest) ... ok
> test_write (tests.test_mr.MRTest) ... ok
> test_dereg_mw_type1 (tests.test_mr.MWTest) ... ok
> test_dereg_mw_type2 (tests.test_mr.MWTest) ... ok
> test_reg_mw_type1 (tests.test_mr.MWTest) ... ok
> test_reg_mw_type2 (tests.test_mr.MWTest) ... ok
> test_reg_mw_wrong_type (tests.test_mr.MWTest) ... ok
> test_odp_rc_traffic (tests.test_odp.OdpTestCase) ... ok
> test_odp_ud_traffic (tests.test_odp.OdpTestCase) ... skipped 'ODP is not supported - ODP recv not supported'
> test_odp_xrc_traffic (tests.test_odp.OdpTestCase) ... ok
> test_default_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
> test_mem_align_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
> test_without_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
> test_alloc_pd (tests.test_pd.PDTest) ... ok
> test_create_pd_none_ctx (tests.test_pd.PDTest) ... ok
> test_dealloc_pd (tests.test_pd.PDTest) ... ok
> test_destroy_pd_twice (tests.test_pd.PDTest) ... ok
> test_multiple_pd_creation (tests.test_pd.PDTest) ... ok
> test_create_qp_ex_no_attr (tests.test_qp.QPTest) ... ok
> test_create_qp_ex_no_attr_connected (tests.test_qp.QPTest) ... ok
> test_create_qp_ex_with_attr (tests.test_qp.QPTest) ... ok
> test_create_qp_ex_with_attr_connected (tests.test_qp.QPTest) ... ok
> test_create_qp_no_attr (tests.test_qp.QPTest) ... ok
> test_create_qp_no_attr_connected (tests.test_qp.QPTest) ... ok
> test_create_qp_with_attr (tests.test_qp.QPTest) ... ok
> test_create_qp_with_attr_connected (tests.test_qp.QPTest) ... ok
> test_modify_qp (tests.test_qp.QPTest) ... ok
> test_query_qp (tests.test_qp.QPTest) ... ok
> test_rdmacm_sync_traffic (tests.test_rdmacm.CMTestCase) ... skipped 'No devices with net interface'
>
> ======================================================================
> FAIL: test_query_port (tests.test_device.DeviceTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/kernel_work/rdma-core/tests/test_device.py", line 129, in test_query_port
>     self.verify_port_attr(port_attr)
>   File "/kernel_work/rdma-core/tests/test_device.py", line 113, in verify_port_attr
>     assert 'Invalid' not in d.speed_to_str(attr.active_speed)
> AssertionError

I'm very curious how you got this assert: "d.speed_to_str" covers all known
speeds according to the IBTA.

Thanks

>
> ----------------------------------------------------------------------
> Ran 67 tests in 10.058s
>
> FAILED (failures=1, skipped=3)
>
>
> thanks,
> --
> John Hubbard
> NVIDIA
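
For context on the failing check in test_query_port: the assertion fails when
the speed-to-string helper returns a string containing 'Invalid' for the
reported active_speed. A minimal sketch of how such a lookup could behave is
below. This is not the pyverbs source. The function name, the table, and the
fallback string are hypothetical, and the table is only an illustrative subset
of active_speed codes.

```python
# Hypothetical sketch, NOT the pyverbs implementation. The names and the
# fallback text are assumptions made for illustration only.
KNOWN_SPEEDS = {
    1: '2.5 Gbps',
    2: '5.0 Gbps',
    4: '10.0 Gbps',
    16: '14.0 Gbps',
    32: '25.0 Gbps',
}


def speed_to_str_sketch(active_speed):
    """Map an active_speed code to a readable string.

    Any code missing from KNOWN_SPEEDS yields a string containing
    'Invalid', which is what an assertion like the one in the
    traceback would trip on.
    """
    try:
        return '{} ({})'.format(KNOWN_SPEEDS[active_speed], active_speed)
    except KeyError:
        return 'Invalid speed ({})'.format(active_speed)


def verify_speed(active_speed):
    # Same shape as the failing check in the quoted traceback.
    assert 'Invalid' not in speed_to_str_sketch(active_speed)
```

With this sketch, verify_speed(1) passes, while a code absent from the table
(for example 3) raises AssertionError. Printing the raw attr.active_speed
value on the failing machine would show which code the device reported.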