Re: ibv_req_notify_cq clarification

Gal Pressman <galpress@xxxxxxxxxx> · Tue, 23 Feb 2021 14:18:31 +0200

On 22/02/2021 21:37, Jason Gunthorpe wrote:
> On Mon, Feb 22, 2021 at 09:24:18PM +0200, Gal Pressman wrote:
>> On 22/02/2021 17:55, Jason Gunthorpe wrote:
>>> On Mon, Feb 22, 2021 at 05:36:17PM +0200, Gal Pressman wrote:
>>>
>>>> "Mellanox HCAs keep track of the last index for which the user received an
>>>> event. Using this index, it is guaranteed that an event is generated immediately
>>>> when a request completion notification is performed and a CQE has already been
>>>> reported."
>>>
>>> I don't think verbs exposes this behavior.
>>
>> So in theory this hardware could generate events that the user app
>> doesn't expect?
> 
> Not really, it shuffles around how race conditions are resolved.
> 
>> But looking at ibv_ud_pingpong for example, I don't understand how
>> that could even work.  The test arms the CQ on creation (consumder
>> index 0), calls ibv_get_cq_event(), wakes up and immediately arms
>> the CQ again (before polling, consumer index is still 0).
> 
> By IBTA spec this is wrong and racey.

Where does it say that?

>> This means that the next ibv_get_cq_event() will wake up immediately, as the CQ
>> was armed twice with the same consumer index and the first completion already
>> exists. Surely that's not what's meant to happen?
> 
> If the driver for this HW stuffs the consumer index then yes, you'd
> get a wakeup as though the new entries were delivered instantly after
> the arm. If it stuffs the current producer index then you get behavior
> like IBTA describes.
> 
> I'd say they are both compatible ways to approach this, the app can't
> tell the state of the CQ, so it can't know if the new CQEs were
> delievered before or after it ARM'd it.

That's not accurate though.
A simple ping app that sends one packet at a time knows the CQ state when it
arms it. The CQ event is for that single sent packet, and the arm should notify
about the next packet it's about to send.

In this case, notifying twice for the same completion is not compatible with the
spec - there are no races in this case.

>> Do you have a way to verify whether this test gets stuck? Maybe I am
>> missing something?
> 
> If the mlx5 implementation is doing like you say then it will not get
> stuck on that HW, but it is not to spec
> 
>> What do you mean by arming a non-empty CQ?
> 
> The arm only has meaning if you know the CQ is empty because then you
> can reliably catch the empty->!empty transition, which is the whole
> point.

Well, you never really know the CQ is empty though. A completion can always
arrive just after you finish polling.

> If the CQ is non-empty then, by spec, if no new events arrive it will
> never be signaled - the app must re-poll it on its own so why arm it?
> 
>> The man pages suggest a scheme where the app should call ibv_get_cq_event()
>> followed by an ibv_req_notify_cq(), the CQ polling/emptying comes after these,
>> so at the time of arm the CQ isn't empty.
> 
> It doesn't show how to re-arm it seems
> 
> I'm repeating from memory here, I haven't checked the specs, but that
> Sean agrees seems like I'm remembering it right :)
> 
> The question you seem to be asking is what happens if you re-arm a
> non-empty CQ, do you immediately get an event or not? It should be
> easy enough to test on siw, rxe and mlx5 and see

rxe seems to get stuck, had some issues running siw and don't have an mlx5 nic
at hand.

> I expect by spec arming a non-empty CQ will not generate an event. The
> spec was written to support very simple hardware that would simply
> generate an event the next time it writes a CQE then atomically clear
> the arm.
> 
> Does look like a documentation update is in-order though!

Looking at libibverbs examples, pyverbs tests and perftest, they all act roughly
the same:

1. arm CQ
2. post send/recv
3. get event
4. arm CQ
5. poll
6. goto step 2

All of these apps always use the same construct of get event followed by an
immediate arm. Did all of these apps get it wrong?

Do you have an example of where it's done right :)?