> On Nov 19, 2019, at 12:01 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Mon, Nov 18, 2019 at 04:49:09PM -0500, Chuck Lever wrote:
>> @@ -65,11 +68,35 @@ static void rdma_dim_init(struct ib_cq *cq)
>>         INIT_WORK(&dim->work, ib_cq_rdma_dim_work);
>> }
>>
>> +/**
>> + * ib_poll_cq - poll a CQ for completion(s)
>> + * @cq: the CQ being polled
>> + * @num_entries: maximum number of completions to return
>> + * @wc: array of at least @num_entries &struct ib_wc where completions
>> + *      will be returned
>> + *
>> + * Poll a CQ for (possibly multiple) completions. If the return value
>> + * is < 0, an error occurred. If the return value is >= 0, it is the
>> + * number of completions returned. If the return value is
>> + * non-negative and < num_entries, then the CQ was emptied.
>> + */
>> +int ib_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc)
>> +{
>> +        int rc;
>> +
>> +        rc = cq->device->ops.poll_cq(cq, num_entries, wc);
>> +        trace_cq_poll(cq, num_entries, rc);
>> +        return rc;
>> +}
>> +EXPORT_SYMBOL(ib_poll_cq);
>
> Back to the non-inlined function?

I never got a clear answer about your preference either way.

IMO making this into a non-inline function is necessary to support
either a static trace point here, or to have a place to put a
convenient dynamic trace point via eBPF.

I don't believe it will add noticeable overhead -- in particular,
under heavy load, poll_cq is invoked once every 16 completions.

On the other hand, it's not clear to me that the latency calculation
will work correctly with callers outside of cq.c ...

--
Chuck Lever
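
For illustration only, here is a minimal user-space sketch of the return-value
contract described in the quoted kernel-doc and of the batching remark above.
Everything in it is an assumption made for the example: struct mock_cq,
struct mock_wc, mock_poll_cq() and drain_cq() are hypothetical stand-ins, not
kernel types or functions, and the batch size of 16 is taken only from the
"once every 16 completions" statement.

```c
/* Batch size assumed from the "once every 16 completions" remark. */
#define POLL_BATCH 16

/* Hypothetical stand-ins for struct ib_wc and struct ib_cq. */
struct mock_wc { int id; };
struct mock_cq { int pending; int next_id; };

/*
 * Mimics the documented contract: returns the number of completions
 * written to wc (0..num_entries), or a negative value on error.
 */
static int mock_poll_cq(struct mock_cq *cq, int num_entries,
                        struct mock_wc *wc)
{
        int n = 0;

        if (!cq || !wc || num_entries < 0)
                return -1;

        while (n < num_entries && cq->pending > 0) {
                wc[n].id = cq->next_id++;
                cq->pending--;
                n++;
        }
        return n;
}

/*
 * Drain the mock CQ in batches. Returns the total number of completions
 * reaped (or a negative value on error) and stores the number of poll
 * calls made in *calls.
 */
static int drain_cq(struct mock_cq *cq, int *calls)
{
        struct mock_wc wcs[POLL_BATCH];
        int total = 0;
        int rc;

        *calls = 0;
        do {
                rc = mock_poll_cq(cq, POLL_BATCH, wcs);
                (*calls)++;
                if (rc < 0)
                        return rc;
                total += rc;
                /* A short batch (rc < POLL_BATCH) means the CQ was emptied. */
        } while (rc == POLL_BATCH);

        return total;
}
```

With 40 pending completions, drain_cq() reaps them in three polls (16 + 16 + 8),
so the per-call cost of a non-inlined wrapper is spread over up to 16
completions whenever the queue is busy. With an exact multiple of the batch
size, one extra poll returning 0 is needed to observe that the CQ is empty.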