Re: ibv_req_notify_cq clarification

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 23/02/2021 0:41, Hefty, Sean wrote:
>>> It eliminates races, if the consumer says 'I read up to X send me an
>>> interrupt if X+1 exists' when X+1 already exists if there is a race
>>> producer has already written it. So send an interrupt.
>>
>> Right, that's what I was getting at in my first question, this isn't the next
>> completion from the device's perspective.
> 
> From the spec:
> 
> Requests the CQ event handler be called when the next completion
> entry of the specified type is added to the specified CQ. The handler
> is called at most once per Request Completion Notification call for a
> particular CQ. Any CQ entries that existed before the notify is enabled
> will not result in a call to the handler.
> 
> It may help to think of it in terms of edge versus level triggered.  The IB spec defines the CQ as behaving similar to an edge triggered object.  I would guess that 99% of developers would successfully write an edge-triggered based application that hangs.  A level triggered wait is much easier to get correct.

Agree, but a level triggered wait contradicts the
"Any CQ entries that existed before the notify is enabled will not result in a
call to the handler."

>> So in such case the consumer index in the arm doorbell is used to indicate what
>> should be considered a "new" completion? This is new from the app perspective.
> 
> The verbs API only guarantees that the CQ will be signaled when a new completion relative to the HW has been written.  If there's a driver that converts the behavior into a level triggered operation, props to them!  That implementation is far more likely to make incorrectly written applications work than ever break a correctly written one.

I agree that this behavior is most likely better, but it contradicts the api
agreement.
It seems like you should write your app according to the provider you're using
in order to get it right.

>> But looking at ibv_ud_pingpong for example, I don't understand how that could
>> even work.
>> The test arms the CQ on creation (consumder index 0), calls ibv_get_cq_event(),
>> wakes up and immediately arms the CQ again (before polling, consumer index is
>> still 0).
> 
> In the worst case, arming the CQ just results in select/poll/epoll returning immediately the next time it is called.  The thread finds the CQ empty, rearms the CQ again, and select/poll/epoll finally block.
> 
>>>> Running this with 32 iterations, the client does something like:
>>>> - arm cq
>>>> - post send x 32
>>>> - wait for cq event
>>>> - arm cq
>>>> - poll cq (once, with batch size of 16)
>>>> - no more post send (reached tot_iters)
>>>> - wait for cq event (but an event has already been generated?)
>>>
>>> I don't know much about perf-test, but in verbs arming a non-empty CQ
>>> is asking for trouble
>>
>> Do you have a way to verify whether this test gets stuck? Maybe I am missing
>> something?
> 
> Looking at the code, I agree that it looks racy in a way that could cause it to hang.  My guess is most people running perftests are looking for the best performance numbers, so blocking completions are rarely enabled.  And even if they were, it's still a race condition as to whether the test would hang.
> 
>> What do you mean by arming a non-empty CQ?
>> The man pages suggest a scheme where the app should call ibv_get_cq_event()
>> followed by an ibv_req_notify_cq(), the CQ polling/emptying comes after these,
>> so at the time of arm the CQ isn't empty.
> 
> The app can do:
> 
> Wakeup
> While read cqe succeeds
> 	Process cqe
> Read cq event
> Arm cq
> /* cqe's may have been written between the last read and arming */
> While read cqe succeeds
> 	Process cqe
> Sleep
> 
> Or shorten this to:
> 
> Wakeup
> Read cq event
> Arm cq
> While read cqe succeeds
> 	Process cqe
> Sleep
> 
> In both cases, all cqe's must be processed after calling arm, and it's possible to read a cq event only to find the cq empty.  One could argue which is more efficient, but we're talking about a sleeping thread in either case.

Yea, this seems correct.
But as I said in my reply to Jason, libibverbs examples, pyverbs tests and
perftest all go from "Read cq event" to "Arm cq" immediately.



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux