On 10/12/21 17:30, Logan Gunthorpe wrote:
> Best I can see from the code is that someone is passing an sg_cnt of
> zero. Previously that would have returned -ENOMEM, but now it might be
> ignored, in which case it would hit that WARNING and return -EIO.
That is not what is happening. The debug patch shown below taught me
the following:
* The sg_cnt argument of rdma_rw_ctx_init() is not zero.
* After the rdma_rw_map_sgtable() call, sgt.nents is zero.
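To make the failure condition explicit, here is a minimal userspace sketch of
the two observations above. It is only a model: struct sgt_model,
map_sgtable_model() and mapping_lost_entries() are hypothetical stand-ins that
I made up for illustration, not kernel APIs, and the stub hard-codes the
"success with zero entries" outcome instead of explaining why it occurs.

```c
/* Hypothetical stand-in for the kernel's struct sg_table; only the two
 * counters relevant to this discussion are modelled. */
struct sgt_model {
	unsigned int orig_nents;	/* entries passed in by the caller */
	unsigned int nents;		/* entries reported after mapping */
};

/* Hypothetical mapping stub: reports success (0) yet leaves nents at
 * zero, mimicking what the second WARN_ON_ONCE() caught. */
static int map_sgtable_model(struct sgt_model *sgt)
{
	sgt->nents = 0;
	return 0;
}

/* Returns 1 if a nonzero sg_cnt went in and the mapping "succeeded"
 * with zero mapped entries; returns 0 otherwise. */
static int mapping_lost_entries(unsigned int sg_cnt)
{
	struct sgt_model sgt = { .orig_nents = sg_cnt, .nents = 0 };

	if (map_sgtable_model(&sgt))
		return 0;	/* a mapping error is a different case */

	return sg_cnt != 0 && sgt.nents == 0;
}
```

With this model, mapping_lost_entries(4) reports the case seen here (nonzero
count in, zero entries out), while mapping_lost_entries(0) corresponds to the
zero sg_cnt case that was suspected and does not trigger.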
The debug patch that I used is as follows:
diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c
index 5a3bd41b331c..a6dabea37958 100644
--- a/drivers/infiniband/core/rw.c
+++ b/drivers/infiniband/core/rw.c
@@ -326,11 +326,15 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num,
};
int ret;
+ WARN_ON_ONCE(!sg_cnt);
+
ret = rdma_rw_map_sgtable(dev, &sgt, dir);
if (ret)
return ret;
sg_cnt = sgt.nents;
+ WARN_ON_ONCE(!sg_cnt);
+
/*
* Skip to the S/G entry that sg_offset falls into:
*/
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 3cadf1295417..d9e3d52eb952 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -911,11 +911,16 @@ static int srpt_alloc_rw_ctxs(struct srpt_send_ioctx *ioctx,
u32 size = be32_to_cpu(db->len);
u32 rkey = be32_to_cpu(db->key);
+ WARN_ON_ONCE(!size);
+
ret = target_alloc_sgl(&ctx->sg, &ctx->nents, size,
false, i < nbufs - 1);
if (ret)
goto unwind;
+ WARN_ONCE(ctx->nents <= 0, "%u bytes -> %d entries\n",
+ size, ctx->nents);
+
ret = rdma_rw_ctx_init(&ctx->rw, ch->qp,
ch->sport->port,
ctx->sg, ctx->nents, 0, remote_addr,
rkey, dir);
if (ret < 0) {