Jens Axboe <axboe@xxxxxxxxx> writes:

> On 6/12/23 10:06 AM, Gabriel Krisman Bertazi wrote:
>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>
>>> Add support for FUTEX_WAKE/WAIT primitives.
>>
>> This is great.  I was so sure io_uring had this support already for some
>> reason.  I might have dreamed it.
>
> I think you did :-)

Premonitory!  Still, there should be better things to dream about than
kernel code.

>> Even with an asynchronous model, it might make sense to halt execution
>> of further queued operations until futex completes.  I think
>> IOSQE_IO_DRAIN is a barrier only against the submission part, so it
>> wouldn't help.  Is there a way to ensure this ordering?
>
> You'd use link for that - link whatever depends on the wake to the futex
> wait. Or just queue it up once you reap the wait completion, when that
> is posted because we got woken.

The challenge of linked requests, in my opinion, is that once a link
chain starts, everything needs to be linked together, and a single error
fails everything, which is ok when operations are related, but not so
much when doing IO to different files from the same ring.

>>> Cancelations are supported, both from the application point-of-view,
>>> but also to be able to cancel pending waits if the ring exits before
>>> all events have occurred.
>>>
>>> This is just the barebones wait/wake support. Features to be added
>>> later:
>>
>> One item high on my wishlist would be the futexv semantics (wait on any
>> of a set of futexes).  It cannot be implemented by issuing several
>> FUTEX_WAIT.
>
> Yep, I do think that one is interesting enough to consider upfront.
> Unfortunately the internal implementation of that does not look that
> great, though I'm sure we can make that work. But would likely
> require some futexv refactoring to make it work. I can take a look at
> it.

No disagreement here.  To be fair, the main challenge was making the new
interface compatible with a futex being waited on/woken through the
original interface.
At some point, we had a really nice design for a single object, but we
spent two years bikeshedding over the interface and ended up merging
something pretty much similar to the proposal from two years prior.

> You could obviously do futexv with this patchset, just posting N futex
> waits and canceling N-1 when you get woken by one. Though that's of
> course not very pretty or nice to use, but design wise it would totally
> work as you don't actually block on these with io_uring.

Yes, but at that point, I guess it'd make more sense to implement the
same semantics by polling over a set of eventfds or having a single
futex and doing dispatch in userspace.

thanks,

-- 
Gabriel Krisman Bertazi