If an application creates its cq for DIRECT poll mode using ib_create_cq()
instead of ib_alloc_cq(), and then uses ib_drain_qp() to drain its qp,
ib_drain_sq/rq() will always hang forever because cq->wc is NULL. IE
ib_create_cq() doesn't allocate cq->wc, and ib_alloc_cq() does. Yet the
__ib_process_cq() requires cq->wc to actually complete any completions
and
calling the cqe_done function.
Is this a bug in the CQ core code or the application?
Take a look in __ib_drain_rq/__ib_drain_sq for
cq->poll_ctx == IB_POLL_DIRECT. The drain routine polls
the completion queue from time to time...
Yes, but it ends up calling __ib_process_cq() which doesn't actually poll the CQ because cq->wc is NULL.
Do you mean that the CQ allocation wasn't done with ib_alloc_cq? That
indeed would be a bug. We can WARN on it as well so the application
will know to allocate its CQ with ib_alloc_cq.
Does something like this makes sense?
--
diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index f2ae75fa3128..90eac56b5f1a 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -69,7 +69,7 @@ static int __ib_process_cq(struct ib_cq *cq, int budget)
*/
int ib_process_cq_direct(struct ib_cq *cq, int budget)
{
- WARN_ON_ONCE(cq->poll_ctx != IB_POLL_DIRECT);
+ WARN_ON_ONCE(cq->poll_ctx != IB_POLL_DIRECT || !cq->wc);
return __ib_process_cq(cq, budget);
}
--
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html