On 12/10/22 6:58 PM, Jens Axboe wrote:
> On 12/10/22 11:51 AM, Linus Torvalds wrote:
>> On Sat, Dec 10, 2022 at 7:36 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>> This adds an epoll_ctl method for setting the minimum wait time for
>>> retrieving events.
>>
>> So this is something very close to what the TTY layer has had forever,
>> and is useful (well... *was* useful) for pretty much the same reason.
>>
>> However, let's learn from successful past interfaces: the tty layer
>> doesn't have just VTIME, it has VMIN too.
>>
>> And I think they very much go hand in hand: you wait for at least VMIN
>> events or for at most VTIME after the last event.
>
> It has been suggested before too. A more modern example is how IRQ
> coalescing works on eg nvme or nics. Those generally are of the nature
> of "wait for X time, or until Y events are available". We can certainly
> do something like that here too, it's just adding a minevents and
> passing them in together.
>
> I'll add that, really should be trivial, and resend later in the merge
> window once we're happy with that.

Took a quick look, and it's not that trivial. The problem is you have to
wake the task to reap events anyway; this cannot be checked at wakeup
time. And now you lose the nice benefit of reducing the context switch
rate, which was a good chunk of the win here...

This can obviously very easily be done with io_uring, since that's how
it already works in terms of waiting. The min-wait part was done
separately there, though it hasn't been posted or included upstream yet.

So now we're a bit stuck...

-- 
Jens Axboe
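
For illustration only, the "wait for X time, or until Y events are
available" semantics can be emulated in user space on top of plain
epoll. The sketch below is Python rather than kernel C, and
wait_min_events is a hypothetical helper, not part of any posted patch.
It assumes the fds were registered with EPOLLET so that each readiness
change is reported once, and it uses an overall deadline rather than
"VTIME after the last event". It also shows the cost discussed above:
every batch of arriving events still wakes the task, so the loop meets
the minimum count without reducing wakeups.

```python
import os
import select
import time


def wait_min_events(ep, min_events, timeout):
    """Collect at least `min_events` events or give up after `timeout` seconds.

    `ep` is a select.epoll object whose fds are assumed to be registered
    with EPOLLET. Returns (events, wakeups), where `wakeups` counts the
    poll() calls that returned at least one event.
    """
    deadline = time.monotonic() + timeout
    events = []
    wakeups = 0
    while len(events) < min_events:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall timeout expired: return whatever was gathered
        batch = ep.poll(remaining)
        if batch:
            # The task had to wake up just to learn that events arrived;
            # the count can only be checked here, after the wakeup.
            wakeups += 1
            events.extend(batch)
    return events, wakeups
```

If the events trickle in one at a time, the wakeups value returned here
approaches the number of events, which is the context switch rate an
in-kernel minimum wait was meant to avoid.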