On 9/24/19 3:33 AM, Pavel Begunkov wrote:
>
>
> On 24/09/2019 11:36, Jens Axboe wrote:
>> On 9/24/19 2:27 AM, Jens Axboe wrote:
>>> On 9/24/19 2:02 AM, Jens Axboe wrote:
>>>> On 9/24/19 1:06 AM, Pavel Begunkov wrote:
>>>>> On 24/09/2019 02:00, Jens Axboe wrote:
>>>>>>> I think we can do the same thing, just wrapping the waitqueue in a
>>>>>>> structure with a count in it, on the stack. Got some flight time
>>>>>>> coming up later today, let me try and cook up a patch.
>>>>>>
>>>>>> Totally untested, and sent out 5 min before departure... But something
>>>>>> like this.
>>>>> Hmm, reminds me of my first version. Basically that's the same thing,
>>>>> but with the macros inlined. I wanted to make it reusable and
>>>>> self-contained, though.
>>>>>
>>>>> If you don't think it could be useful in other places, sure, we could
>>>>> do something like that. Is that so?
>>>>
>>>> I totally agree it could be useful in other places. Maybe formalized and
>>>> used with wake_up_nr() instead of adding a new primitive? Haven't looked
>>>> into that, I may be talking nonsense.
>>>>
>>>> In any case, I did get a chance to test it and it works for me. Here's
>>>> the "finished" version, slightly cleaned up and with a comment added
>>>> for good measure.
>>>
>>> Notes:
>>>
>>> This version gets the ordering right: you need exclusive waits to get
>>> FIFO ordering on the waitqueue.
>>>
>>> Both versions (yours and mine) suffer from the problem of potentially
>>> waking too many tasks. I don't think this is a real issue, as generally
>>> we don't do threaded access to the io_urings. But if you had the
>>> following tasks wait on the cqring:
>>>
>>> [min_events = 32], [min_events = 8], [min_events = 8]
>>>
>>> and we reach the io_cqring_events() == threshold, we'll wake all three.
>>> I don't see a good solution to this, so I suspect we just live with it
>>> until it's proven to be an issue. Both versions are much better than
>>> what we have now.
>>
>> Forgot an issue around signal handling; the version below adds the
>> right check for that too.
>
> It seems like a good reason to not keep reimplementing the
> "prepare_to_wait*() + wait loop" pattern every time, but to keep it in
> sched :)

I think if we do the ->private cleanup that Peter mentioned, then
there's not much left in terms of consolidation. Not convinced the
case is interesting enough to warrant a special helper. If others show
up, it's easy enough to consolidate the use cases and unify them.

If you look at wake_up_nr(), I would have thought that would be more
widespread. But it really isn't.

>> Curious what your test case was for this?
> You mean a performance test case? It's briefly described in a comment
> for the second patch. That's just a rewritten io_uring-bench, with
> 1. a thread generating 1 request per call in a loop,
> 2. and the second thread waiting for ~128 events.
> Both are pinned to the same core.

Gotcha, thanks.

--
Jens Axboe
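
[For context, a minimal sketch of the wake side of the approach being
discussed: a wait_queue_entry wrapped in an on-stack structure carrying
the caller's event threshold, with a custom wake function that rejects
the wakeup until enough completions have been posted. The identifiers
(io_wait_queue, io_wake_function) and the io_uring pieces (struct
io_ring_ctx, io_cqring_events()) are assumed for illustration; this is
not the actual patch from the thread.]

/*
 * Sketch only -- assumes io_uring's struct io_ring_ctx and its
 * io_cqring_events() helper; names are illustrative.
 */
#include <linux/wait.h>

struct io_wait_queue {
	struct wait_queue_entry wq;
	struct io_ring_ctx *ctx;
	unsigned to_wait;	/* the min_events the caller asked for */
};

static inline bool io_should_wake(struct io_wait_queue *iowq)
{
	/* only wake once enough completions have been posted */
	return io_cqring_events(iowq->ctx) >= iowq->to_wait;
}

static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
			    int wake_flags, void *key)
{
	struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
						  wq);

	/*
	 * A negative return makes __wake_up_common() stop walking the
	 * waitqueue, so the first waiter (in FIFO order, given exclusive
	 * waits) whose threshold isn't met ends the scan.
	 */
	if (!io_should_wake(iowq))
		return -1;

	return autoremove_wake_function(curr, mode, wake_flags, key);
}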
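
[And the matching wait side, same caveats: a sketch only, with ctx->wait
assumed to be the cqring waitqueue head. The exclusive wait is what
provides the FIFO ordering noted above, and the signal_pending() check
is the fix mentioned in the later revision.]

#include <linux/sched/signal.h>

static int io_cqring_wait(struct io_ring_ctx *ctx, unsigned min_events)
{
	struct io_wait_queue iowq = {
		.wq = {
			.private	= current,
			.func		= io_wake_function,
			.entry		= LIST_HEAD_INIT(iowq.wq.entry),
		},
		.ctx		= ctx,
		.to_wait	= min_events,
	};
	int ret = 0;

	do {
		/* exclusive wait gives FIFO ordering on the waitqueue */
		prepare_to_wait_exclusive(&ctx->wait, &iowq.wq,
					  TASK_INTERRUPTIBLE);
		if (io_should_wake(&iowq))
			break;
		schedule();
		/* the signal handling check added in the later revision */
		if (signal_pending(current)) {
			ret = -ERESTARTSYS;
			break;
		}
	} while (1);

	finish_wait(&ctx->wait, &iowq.wq);
	return ret;
}

[With plain non-exclusive entries, __wake_up_common() would run every
waiter's wake function on each wake_up(), which is the "waking too many"
behaviour discussed in the thread; exclusive entries plus the early -1
return keep that bounded.]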