On 8/11/23 18:12, Jens Axboe wrote:
io_uring currently uses percpu refcounts for the ring reference. This
works fine, but exiting a ring requires an RCU grace period to lapse
and this slows down ring exit quite a lot.

Add a basic per-cpu counter for our references instead, and use that.
This is in preparation for doing a sync wait on any request (notably
file) references on ring exit. As we're going to be waiting on ctx refs
going away as well with that, the RCU grace period wait becomes a
noticeable slowdown.

How does it work?

- What prevents io_ring_ref_maybe_done() from miscalculating and either
1) firing while there are refs or
2) not triggering when we put down all refs?
E.g. percpu_ref relies on atomic counting after switching from
percpu mode.

- What contexts can it be used from? Task context only? I'll argue we
want to use it in [soft]irq for the likes of *task_work_add().
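
To make the first question concrete, here is a userspace sketch of the
kind of interleaving I'm worried about. It is not the code from the
patch: struct demo_ref and the demo_*() helpers are hypothetical names,
and the "CPUs" are just array slots replayed in a fixed order. It only
shows that summing per-cpu slots without excluding concurrent get/put
can observe zero while a reference is still held.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 2	/* stand-in for the number of CPUs */

/* Hypothetical per-cpu counter: one signed slot per "CPU". */
struct demo_ref {
	long slot[NR_SLOTS];
};

static void demo_get(struct demo_ref *r, int cpu) { r->slot[cpu]++; }
static void demo_put(struct demo_ref *r, int cpu) { r->slot[cpu]--; }

/* Exact sum; only meaningful if nothing changes while we read. */
static long demo_sum(const struct demo_ref *r)
{
	long sum = 0;

	for (size_t i = 0; i < NR_SLOTS; i++)
		sum += r->slot[i];
	return sum;
}

/*
 * Replay one interleaving: the summer reads slot 0, a get on CPU 0 and
 * a put on CPU 1 slip in, then the summer reads slot 1. Returns true
 * if the summer saw zero although one reference is still held.
 */
static bool demo_false_zero(void)
{
	struct demo_ref r = { .slot = { 0, 1 } };	/* one ref, taken on CPU 1 */
	long seen;

	seen = r.slot[0];	/* summer reads slot 0 */
	demo_get(&r, 0);	/* new ref taken on CPU 0 */
	demo_put(&r, 1);	/* old ref dropped on CPU 1 */
	seen += r.slot[1];	/* summer reads slot 1 */

	return seen == 0 && demo_sum(&r) == 1;
}
```

In this replay demo_false_zero() returns true, i.e. case 1) above.
percpu_ref avoids this by switching to a single atomic count before it
checks for zero, so I'd like to understand what plays that role here.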

Pavel Begunkov

