On 8/15/23 11:45 AM, Pavel Begunkov wrote:
> On 8/11/23 18:12, Jens Axboe wrote:
>> io_uring currently uses percpu refcounts for the ring reference. This
>> works fine, but exiting a ring requires an RCU grace period to lapse
>> and this slows down ring exit quite a lot.
>>
>> Add a basic per-cpu counter for our references instead, and use that.
>> This is in preparation for doing a sync wait on any request (notably
>> file) references on ring exit. As we're going to be waiting on ctx refs
>> going away as well with that, the RCU grace period wait becomes a
>> noticeable slowdown.
>
> How does it work?
>
> - What prevents io_ring_ref_maybe_done() from miscalculating and either
>   1) firing while there are refs or
>   2) not triggering when we put down all refs?
>   E.g. percpu_ref relies on atomic counting after switching from
>   percpu mode.

I'm open to critique of it; do you have any specific worries? The
counters are per-cpu, and whenever REF_DEAD_BIT is set, we sum them on
that drop. We should not be grabbing references after that point, and
any drop will just sum the counters (see the sketch below).

> - What contexts can it be used from? Task context only? I'll argue we
>   want to use it in [soft]irq for likes of *task_work_add().

We don't manipulate ctx refs from non-task context right now, or from
hard/soft IRQ. On the task_work side, the request already has a
reference to the ctx, so I'm not sure why you'd want to add more. In
any case, I prefer not to deal with hypotheticals, just the code we
have now.

--
Jens Axboe
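
For readers following the thread, here is a minimal sketch of the
per-cpu reference scheme being discussed. It is an illustration under
assumptions, not the posted patch: the struct and helper names
(io_ring_ref, io_ring_ref_get/put/sum, io_ring_ref_kill_and_wait, the
completion) are made up for the sketch, only the REF_DEAD_BIT name
comes from the mail above, and the memory ordering the real code would
need around the dead transition is elided.

#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/bitops.h>
#include <linux/completion.h>

/* bit number of the "ring is exiting" flag; illustrative name */
#define REF_DEAD_BIT	0

struct io_ring_ref {
	unsigned long flags;		/* REF_DEAD_BIT set once on ring exit */
	long __percpu *counters;	/* per-cpu get/put deltas, from alloc_percpu(long) */
	struct completion done;		/* exit path waits for refs to reach zero */
};

static void io_ring_ref_get(struct io_ring_ref *ref)
{
	/* only legal while holding an existing reference, i.e. before DEAD is set */
	this_cpu_inc(*ref->counters);
}

static long io_ring_ref_sum(struct io_ring_ref *ref)
{
	long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		sum += *per_cpu_ptr(ref->counters, cpu);
	return sum;
}

static void io_ring_ref_put(struct io_ring_ref *ref)
{
	this_cpu_dec(*ref->counters);
	/*
	 * Once the ring is marked dead no new references are taken, so
	 * every drop re-sums the per-cpu counters and the drop that
	 * brings the sum to zero wakes the exiting task. The real code
	 * additionally has to order the DEAD transition against the
	 * summing; that is elided here.
	 */
	if (test_bit(REF_DEAD_BIT, &ref->flags) && !io_ring_ref_sum(ref))
		complete(&ref->done);
}

/* exit path: mark dead, drop the ring's own reference, wait for the rest */
static void io_ring_ref_kill_and_wait(struct io_ring_ref *ref)
{
	set_bit(REF_DEAD_BIT, &ref->flags);
	io_ring_ref_put(ref);
	wait_for_completion(&ref->done);
}

Because no new references are taken once REF_DEAD_BIT is set, the
summed value can only move toward zero; this is what lets the drop
path sum plain per-cpu counters instead of switching to atomic
counting the way percpu_ref does, though the real code still has to
order the dead-bit transition against the summing so the final drop
reliably observes zero.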