On 6/8/2021 7:22 PM, Pearson, Robert B wrote:
On 6/8/2021 11:12 AM, Pearson, Robert B wrote:
On 6/8/2021 10:54 AM, Pearson, Robert B wrote:
On 6/8/2021 6:53 AM, Edward Srouji wrote:
On 6/8/2021 9:47 AM, Leon Romanovsky wrote:
On Mon, Jun 07, 2021 at 11:54:29PM -0500, Pearson, Robert B wrote:
On 6/7/2021 11:41 PM, Leon Romanovsky wrote:
On Mon, Jun 07, 2021 at 04:50:20PM -0500, Pearson, Robert B wrote:
Sorry, this time without the HTML.
======================================================================
ERROR: test_qp_ex_rc_bind_mw (tests.test_qpex.QpExTestCase)
Verify bind memory window operation using the new post_send API.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rpearson/src/rdma-core/tests/test_qpex.py", line 292, in test_qp_ex_rc_bind_mw
    u.poll_cq(server.cq)
  File "/home/rpearson/src/rdma-core/tests/utils.py", line 538, in poll_cq
    raise PyverbsRDMAError('Completion status is {s}'.
pyverbs.pyverbs_error.PyverbsRDMAError: Completion status is Memory window bind error. Errno: 6, No such device or address
This test attempts to bind a type 2 MW to an MR that does not have bind MW
access set and expects the test to succeed.
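As a rough illustration of the rule described above, here is a minimal Python sketch of a bind check. It is a toy model, not the rxe or mlx5 code: the `Access` class and `bind_mw_allowed()` are hypothetical names, and the flag values are assumed to mirror the `IBV_ACCESS_*` bits in verbs.h.

```python
from enum import IntFlag


class Access(IntFlag):
    """Stand-ins for the IBV_ACCESS_* flags (values assumed to match verbs.h)."""
    LOCAL_WRITE = 1 << 0
    REMOTE_WRITE = 1 << 1
    REMOTE_READ = 1 << 2
    REMOTE_ATOMIC = 1 << 3
    MW_BIND = 1 << 4


def bind_mw_allowed(mr_access: Access) -> bool:
    """Hypothetical check: accept a bind only if the MR was created with MW_BIND."""
    return bool(mr_access & Access.MW_BIND)


# MR as the test originally created it: the bind would be rejected.
print(bind_mw_allowed(Access.REMOTE_WRITE))
# MR created with the extra flag: the bind would be accepted.
print(bind_mw_allowed(Access.REMOTE_WRITE | Access.MW_BIND))
```

A provider that performs this kind of check would complete the bind work request with a memory window bind error for the first MR, which matches the completion status in the traceback.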
You're right, looks like a test bug. I'll send a fix upstream.
Can you please confirm that this solves your issue:
Well I get further. I am hitting a seg fault in python at
client.qp.wr_rdma_write(new_key, server.mr.buf)
in test_qp_ex_rc_bind_mw.
I'm trying to track it down. I'm not very familiar with Python and
don't know how to run the test under gdb.
Thanks for the fix.
Bob
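One way to narrow down a crash like this without gdb is Python's standard `faulthandler` module, which prints the Python-level stack when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is only a suggestion; the helper name `enable_crash_traces()` is hypothetical and is not part of the rdma-core tests.

```python
import faulthandler
import sys


def enable_crash_traces(stream=sys.stderr) -> bool:
    """Ask CPython to dump the Python stack of every thread on a fatal signal."""
    # The stream must stay open for as long as the handler is enabled.
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this once at the top of the test module, or running the interpreter with `python3 -X faulthandler` or with `PYTHONFAULTHANDLER=1` set in the environment, should show which Python line was executing when the native code faulted.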
OK got it. In the setup for the test you write

    class QpExRCBindMw(RCResources):
        def create_qps(self):
            create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)

        def create_mr(self):
            self.mr = u.create_custom_mr(self,
                                         e.IBV_ACCESS_REMOTE_WRITE |
                                         e.IBV_ACCESS_MW_BIND)

which asks for qp_ex->wr_bind_mw() to be set, but later in the test you
write

    client.qp.wr_rdma_write(new_key, server.mr.buf)

which calls qp_ex->wr_rdma_write(), which is not set, causing the seg
fault. I think you should have written

    create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW
                 | e.IBV_QP_EX_WITH_RDMA_WRITE)

since you need both extended QP operations.
Bob
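The failure mode described above can be modelled with a small Python toy. It only illustrates the idea that the work-request builders are installed according to the requested ops flags; `SendOps` and `FakeQpEx` are hypothetical names, the flag values are illustrative, and none of this is the actual provider code.

```python
from enum import IntFlag


class SendOps(IntFlag):
    """Illustrative stand-ins for IBV_QP_EX_WITH_* (values are not authoritative)."""
    RDMA_WRITE = 1 << 0
    BIND_MW = 1 << 8


class FakeQpEx:
    """Toy model: only the requested work-request builders are installed."""

    def __init__(self, send_ops_flags: SendOps):
        # An op that was not requested stays unset, like a NULL function pointer.
        self.wr_rdma_write = (
            self._rdma_write if send_ops_flags & SendOps.RDMA_WRITE else None)
        self.wr_bind_mw = (
            self._bind_mw if send_ops_flags & SendOps.BIND_MW else None)

    def _rdma_write(self, rkey, remote_addr):
        return ("rdma_write", rkey, remote_addr)

    def _bind_mw(self, mw, rkey):
        return ("bind_mw", mw, rkey)


# QP created as in the original test: wr_rdma_write is never installed.
broken = FakeQpEx(SendOps.BIND_MW)
# QP created with both flags: both builders are available.
fixed = FakeQpEx(SendOps.BIND_MW | SendOps.RDMA_WRITE)
```

In this model, calling `broken.wr_rdma_write(...)` raises a `TypeError` because the attribute is `None`; in C the equivalent call through an unset pointer is what produces the seg fault.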
With this patch the test is now running correctly
diff --git a/tests/test_qpex.py b/tests/test_qpex.py
index 20288d45..0316bfcb 100644
--- a/tests/test_qpex.py
+++ b/tests/test_qpex.py
@@ -146,10 +146,12 @@ class QpExRCAtomicFetchAdd(RCResources):
 class QpExRCBindMw(RCResources):
     def create_qps(self):
-        create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)
+        create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW |
+                     e.IBV_QP_EX_WITH_RDMA_WRITE)
     def create_mr(self):
-        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE)
+        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE |
+                                     e.IBV_ACCESS_MW_BIND)
 class QpExTestCase(RDMATestCase):
I've sent a fix patch upstream (you can see it on GitHub).
diff --git a/tests/test_qpex.py b/tests/test_qpex.py
index 4b58260f..c2d67ee8 100644
--- a/tests/test_qpex.py
+++ b/tests/test_qpex.py
@@ -149,7 +149,7 @@ class QpExRCBindMw(RCResources):
         create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)
     def create_mr(self):
-        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE)
+        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_MW_BIND)
Does the test break after your MW series? Or does it only break with
code that is not merged yet?
Generally speaking, we expect developers to run the rdma-core tests and
fix/extend them prior to submission.
Thanks
Bob Pearson
Nope. I don't have real RNICs at home to test. But (see my note to Zhu)
the non-extended APIs do set the access flags correctly and the extended
test case does not. The wr_bind_mw() function can't fix this for the
test case. The test has to set the access flags when it creates the MR,
and it didn't. It is possible that mlx5 doesn't check the bind access
flag, but that seems unlikely.
mlx5 devices support MW types 1 and 2, and the kernel checks that only
these types are accepted from user space. This is why mlx5 doesn't need
to check the access flags again.
static int ib_uverbs_alloc_mw(struct uverbs_attr_bundle *attrs)
{
....
        if (cmd.mw_type != IB_MW_TYPE_1 && cmd.mw_type != IB_MW_TYPE_2) {
                ret = -EINVAL;
                goto err_put;
        }
Thanks
I see that mlx5 checks the access flags in userspace only if
MW_DEBUG is turned on (in set_bind_wr()).
I guess that's for the sake of performance, as it's part of the data
path.
Bob