On 8/15/2018 12:55 AM, Jason Gunthorpe wrote:
On Tue, Aug 14, 2018 at 06:09:19PM +0300, Yishai Hadas wrote:
From: Artemy Kovalyov <artemyko@xxxxxxxxxxxx>
When a regular CQ attempts to generate a CQE and the CQ is already full,
an overflow occurs and an async error is generated.
On a CQ created with the IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN flag, the overflow
check is disabled, the error is never generated, and the CQE is always
written to the next entry.
Shortening the fast-path handling of received messages allows low-latency
applications to achieve better performance.
Signed-off-by: Artemy Kovalyov <artemyko@xxxxxxxxxxxx>
Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
libibverbs/cmd_cq.c | 3 +++
libibverbs/man/ibv_create_cq_ex.3 | 1 +
libibverbs/verbs.h | 1 +
3 files changed, 5 insertions(+)
diff --git a/libibverbs/cmd_cq.c b/libibverbs/cmd_cq.c
index 73cd2f2..7669518 100644
+++ b/libibverbs/cmd_cq.c
@@ -142,6 +142,9 @@ int ibv_cmd_create_cq_ex(struct ibv_context *context,
if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)
flags |= IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION;
+ if (cq_attr->flags & IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN)
+ flags |= IB_UVERBS_CQ_FLAGS_IGNORE_OVERRUN;
+
return ibv_icmd_create_cq(context, cq_attr->cqe, cq_attr->channel,
cq_attr->comp_vector, flags,
ibv_cq_ex_to_cq(cq), cmdb);
diff --git a/libibverbs/man/ibv_create_cq_ex.3 b/libibverbs/man/ibv_create_cq_ex.3
index 2abdbe4..5d39457 100644
+++ b/libibverbs/man/ibv_create_cq_ex.3
@@ -52,6 +52,7 @@ enum ibv_cq_init_attr_mask {
enum ibv_create_cq_attr_flags {
IBV_CREATE_CQ_ATTR_SINGLE_THREADED = 1 << 0, /* This CQ is used from a single threaded, thus no locking is required */
+ IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN = 1 << 1, /* This CQ will not pass to error state if overrun, CQE always will be written to next entry */
};
This really needs a longer man page discussion. What exactly happens
at overflow? Does the CQ continue to work? Does it become corrupted?
Are CQEs just lost?
How should an application recover?
Probably stress that an application must be designed to avoid ever
overflowing the CQ.
Yes, this is correct: the application must be designed to avoid ever
overflowing the CQ; otherwise CQEs might be lost.
The PR [1] was updated, in both the man page and the commit log for this
patch, to clearly stress this point.
[1] https://github.com/linux-rdma/rdma-core/pull/369
Yishai