On 6/5/24 16:11, Pavel Begunkov wrote:
On 6/4/24 20:01, Jens Axboe wrote:
io_uring currently uses percpu refcounts for the ring reference. This
works fine, but exiting a ring requires an RCU grace period to lapse
and this slows down ring exit quite a lot.
Add a basic per-cpu counter for our references instead, and use that.
All the synchronisation heavy lifting is done by RCU. What
makes it safe to read other CPUs' counters in
io_ring_ref_maybe_done()?
Other options are expedited RCU (Paul saying it's an order of
magnitude faster), or to switch to plain atomics since it's cached,
but it's only good if submitter and waiter are the same task. Paul
also mentioned more elaborate approaches like per-CPU atomics (to
reduce contention).
Let's say you have 1 ref, then:
CPU1: fallback: get_ref();
CPU2: put_ref(); io_ring_ref_maybe_done();
There should be 1 ref left, but without extra sync
io_ring_ref_maybe_done() can read the old value from CPU1,
from before the get, and conclude there are no refs left => UAF.
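
To make the interleaving concrete, here is a minimal userspace sketch that replays it deterministically. It is not the patch's code: struct ring_refs, sum_refs(), ref_maybe_done() and replay() are hypothetical names, and it assumes the "done" check is a plain sum over all per-CPU counters. The stale read is modelled by substituting the value CPU1's counter held before the get.

```c
#include <stdbool.h>

#define NR_CPUS 3

/* Hypothetical stand-in for the per-CPU ring reference counters. */
struct ring_refs {
	long count[NR_CPUS];
};

static void get_ref(struct ring_refs *r, int cpu) { r->count[cpu]++; }
static void put_ref(struct ring_refs *r, int cpu) { r->count[cpu]--; }

/*
 * Sum all counters. If stale_cpu >= 0, use stale_val in place of that
 * CPU's counter, modelling a reader that has not yet observed that
 * CPU's latest update (assumption: nothing orders the two accesses).
 */
static long sum_refs(const struct ring_refs *r, int stale_cpu, long stale_val)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += (cpu == stale_cpu) ? stale_val : r->count[cpu];
	return sum;
}

/* Assumed shape of the check: the ring is "done" when the sum is zero. */
static bool ref_maybe_done(const struct ring_refs *r, int stale_cpu,
			   long stale_val)
{
	return sum_refs(r, stale_cpu, stale_val) == 0;
}

/*
 * Replay the scenario. Returns true if the ring would be treated as
 * having no references left.
 */
static bool replay(bool stale_read)
{
	struct ring_refs r = { .count = { 1, 0, 0 } };	/* 1 ref, held via CPU0 */
	long cpu1_before_get = r.count[1];

	get_ref(&r, 1);		/* CPU1: fallback: get_ref()  -> 2 refs */
	put_ref(&r, 2);		/* CPU2: put_ref()            -> 1 ref  */

	/* CPU2: io_ring_ref_maybe_done() */
	if (stale_read)
		return ref_maybe_done(&r, 1, cpu1_before_get);
	return ref_maybe_done(&r, -1, 0);
}
```

With replay(true) the sum is 1 + 0 - 1 = 0, so the ring is wrongly considered done while one reference is still live. With replay(false) the sum is 1 + 1 - 1 = 1 and the ring stays alive.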
--
Pavel Begunkov