On 22/02/2021 15:46, Jason Gunthorpe wrote:
> On Sun, Feb 21, 2021 at 11:25:02AM +0200, Gal Pressman wrote:
>> On 18/02/2021 18:23, Jason Gunthorpe wrote:
>>> On Thu, Feb 18, 2021 at 05:52:16PM +0200, Gal Pressman wrote:
>>>> On 18/02/2021 14:53, Jason Gunthorpe wrote:
>>>>> On Thu, Feb 18, 2021 at 11:13:43AM +0200, Gal Pressman wrote:
>>>>>> I'm a bit confused about the meaning of the ibv_req_notify_cq() verb:
>>>>>> "Upon the addition of a new CQ entry (CQE) to cq, a completion event will be
>>>>>> added to the completion channel associated with the CQ."
>>>>>>
>>>>>> What is considered a new CQE in this case?
>>>>>> The next CQE from the user's perspective, i.e. any new CQE that wasn't consumed
>>>>>> by the user's poll cq?
>>>>>> Or any new CQE from the device's perspective?
>>>>>
>>>>> new CQE from the device perspective.
>>>>>
>>>>>> For example, if at the time of ibv_req_notify_cq() call the CQ has received 100
>>>>>> completions, but the user hasn't polled his CQ yet, when should he be notified?
>>>>>> On the 101 completion or immediately (since there are completions waiting on the
>>>>>> CQ)?
>>>>>
>>>>> 101 completion
>>>>>
>>>>> It is only meaningful to call it when the CQ is empty.
>>>>
>>>> Thanks, so there's an inherent race between the user's CQ poll and the next arm?
>>>
>>> I think the specs or man pages talk about this, the application has to
>>> observe empty, do arm, then poll again then sleep on the cq if empty.
>>>
>>>> Do you know what's the purpose of the consumer index in the arm doorbell that's
>>>> implemented by many providers?
>>>
>>> The consumer index is needed by HW to prevent CQ overflow, presumably
>>> the drivers push to reduce the cases where the HW has to read it from
>>> PCI
>>
>> Thanks, that makes sense.
>>
>> I found the following sentence in CX PRM:
>> "If new CQEs are posted to the CQ after the reporting of a completion event and
>> these CQEs are not yet consumed, then an event will be generated immediately
>> after the request for notification is executed."
>>
>> Doesn't that contradict the expected behavior?
>
> I read it as confirming it?
>
> Only *new* CQEs trigger an event, and new CQE's always trigger an
> event regardless of the full/empty state of the queue.
>
> This paragraph is an obtuse way of warning of the race I described.

Hmm, yea this sentence is a bit confusing :)..

"Mellanox HCAs keep track of the last index for which the user received an
event. Using this index, it is guaranteed that an event is generated immediately
when a request completion notification is performed and a CQE has already been
reported."

This also sounds weird, why is an event generated for a completion that has
already been reported?

So from my understanding of how this should work, the following code in perftest
(ib_send_bw test) is buggy?:
https://github.com/linux-rdma/perftest/blob/master/src/perftest_resources.c#L2955

Running this with 32 iterations, the client does something like:
- arm cq
- post send x 32
- wait for cq event
- arm cq
- poll cq (once, with batch size of 16)
- no more post send (reached tot_iters)
- wait for cq event (but an event has already been generated?)

And gets stuck?
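
To make the sequence above concrete, here is a toy model of it in C. This is
only a sketch and not the perftest code or the libibverbs API: struct sim_cq
and all sim_*/run_* names are made up for illustration, and it assumes the
strict semantics described earlier in the thread (only a CQE that arrives
while the CQ is armed generates an event, and arming a non-empty CQ generates
nothing). A device that generates an event immediately on arm, as the PRM
text suggests, would behave differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CQ model; every name here is hypothetical, not a verbs API. */
struct sim_cq {
    int produced;        /* CQEs written by the "device" */
    int consumed;        /* CQEs polled by the application */
    bool armed;          /* notification requested and not fired yet */
    int pending_events;  /* events queued on the completion channel */
};

/* Stands in for ibv_req_notify_cq(): assumed not to fire for old CQEs. */
static void sim_arm(struct sim_cq *cq)
{
    cq->armed = true;
}

/* Device adds n CQEs; the first one landing on an armed CQ fires one
 * event and disarms the CQ, the rest arrive silently. */
static void sim_complete(struct sim_cq *cq, int n)
{
    for (int i = 0; i < n; i++) {
        cq->produced++;
        if (cq->armed) {
            cq->armed = false;
            cq->pending_events++;
        }
    }
}

/* Stands in for ibv_poll_cq(): consume up to batch CQEs. */
static int sim_poll(struct sim_cq *cq, int batch)
{
    assert(batch > 0);
    int avail = cq->produced - cq->consumed;
    int n = avail < batch ? avail : batch;
    cq->consumed += n;
    return n;
}

/* Stands in for ibv_get_cq_event(), but returns false where the real
 * call would block because no event is queued. */
static bool sim_wait_event(struct sim_cq *cq)
{
    if (cq->pending_events == 0)
        return false;
    cq->pending_events--;
    return true;
}

/* The sequence from the list above: one poll per event.
 * Returns how many CQEs were consumed before the client would block. */
int run_single_poll(int iters, int batch)
{
    struct sim_cq cq = {0};
    int done = 0;

    sim_arm(&cq);
    sim_complete(&cq, iters);      /* all sends complete */
    while (done < iters) {
        if (!sim_wait_event(&cq))
            break;                 /* a real client blocks here */
        sim_arm(&cq);
        done += sim_poll(&cq, batch);
    }
    return done;
}

/* Same, but after re-arming keep polling until the CQ is observed empty. */
int run_drain(int iters, int batch)
{
    struct sim_cq cq = {0};
    int done = 0;
    int n;

    sim_arm(&cq);
    sim_complete(&cq, iters);
    while (done < iters) {
        if (!sim_wait_event(&cq))
            break;
        sim_arm(&cq);
        while ((n = sim_poll(&cq, batch)) > 0)
            done += n;
    }
    return done;
}
```

With 32 iterations and a batch of 16, run_single_poll() stops after 16
completions because the second wait finds no event queued, while run_drain()
consumes all 32. That matches the "observe empty, do arm, then poll again"
rule quoted above, under the stated assumption about the device.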