On 8/15/23 11:45 AM, Pavel Begunkov wrote:
> On 8/11/23 18:12, Jens Axboe wrote:
>> io_uring currently uses percpu refcounts for the ring reference. This
>> works fine, but exiting a ring requires an RCU grace period to lapse
>> and this slows down ring exit quite a lot.
>>
>> Add a basic per-cpu counter for our references instead, and use that.
>> This is in preparation for doing a sync wait on any request (notably
>> file) references on ring exit. As we're going to be waiting on ctx refs
>> going away as well with that, the RCU grace period wait becomes a
>> noticeable slowdown.
>
> How does it work?
>
> - What prevents io_ring_ref_maybe_done() from miscalculating and either
>   1) firing while there are refs or
>   2) not triggering when we put down all refs?
>   E.g. percpu_ref relies on atomic counting after switching from
>   percpu mode.

I'm open to critique of it; do you have any specific worries? The
counters are per-cpu, and whenever REF_DEAD_BIT is set, we sum them on
that drop. We should not be grabbing references after that point, and
any drop will just sum the counters (see the sketch below).

> - What contexts can it be used from? Task context only? I'll argue we
>   want to use it in [soft]irq for likes of *task_work_add().

We don't manipulate ctx refs from non-task context right now, or from
hard/soft IRQ. On the task_work side, the request already has a
reference to the ctx, so I'm not sure why you'd want to add more. In
any case, I prefer not to deal with hypotheticals, just the code we
have now.

--
Jens Axboe
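
For readers following the thread, here is a minimal sketch of the
per-cpu reference scheme being discussed. It is an illustration under
assumptions, not the posted patch: the struct and helper names
(io_ring_ref, io_ring_ref_get/put/sum, io_ring_ref_kill_and_wait, the
completion) are made up for the sketch, only the REF_DEAD_BIT name
comes from the mail above, and the memory ordering the real code would
need around the dead transition is elided.

#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/bitops.h>
#include <linux/completion.h>

/* bit number of the "ring is exiting" flag; illustrative name */
#define REF_DEAD_BIT	0

struct io_ring_ref {
	unsigned long flags;		/* REF_DEAD_BIT set once on ring exit */
	long __percpu *counters;	/* per-cpu get/put deltas, from alloc_percpu(long) */
	struct completion done;		/* exit path waits for refs to reach zero */
};

static void io_ring_ref_get(struct io_ring_ref *ref)
{
	/* only legal while holding an existing reference, i.e. before DEAD is set */
	this_cpu_inc(*ref->counters);
}

static long io_ring_ref_sum(struct io_ring_ref *ref)
{
	long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		sum += *per_cpu_ptr(ref->counters, cpu);
	return sum;
}

static void io_ring_ref_put(struct io_ring_ref *ref)
{
	this_cpu_dec(*ref->counters);
	/*
	 * Once the ring is marked dead no new references are taken, so
	 * every drop re-sums the per-cpu counters and the drop that
	 * brings the sum to zero wakes the exiting task. The real code
	 * additionally has to order the DEAD transition against the
	 * summing; that is elided here.
	 */
	if (test_bit(REF_DEAD_BIT, &ref->flags) && !io_ring_ref_sum(ref))
		complete(&ref->done);
}

/* exit path: mark dead, drop the ring's own reference, wait for the rest */
static void io_ring_ref_kill_and_wait(struct io_ring_ref *ref)
{
	set_bit(REF_DEAD_BIT, &ref->flags);
	io_ring_ref_put(ref);
	wait_for_completion(&ref->done);
}

Because no new references are taken once REF_DEAD_BIT is set, the
summed value can only move toward zero; this is what lets the drop
path sum plain per-cpu counters instead of switching to atomic
counting the way percpu_ref does, though the real code still has to
order the dead-bit transition against the summing so the final drop
reliably observes zero.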