On 2/2/22 8:59 AM, Usama Arif wrote:
> Acquire completion_lock at the start of __io_uring_register before
> registering/unregistering eventfd and release it at the end. Hence
> all calls to io_cqring_ev_posted which adds to the eventfd counter
> will finish before acquiring the spin_lock in io_uring_register, and
> all new calls will wait till the eventfd is registered. This avoids
> ring quiesce which is much more expensive than acquiring the spin_lock.
>
> On the system tested with this patch, io_uring_register with
> IORING_REGISTER_EVENTFD takes less than 1ms, compared to 15ms before.

This seems like optimizing for the wrong thing, so I've got a few
questions. Are you doing a lot of eventfd registrations (and
unregistrations) in your workload? Or is it just the initial pain of
registering one?

In talking to Pavel, he suggested that RCU might be a good use case
here, and I think so too. That would still remove the need to quiesce,
and the posted side just needs a fairly cheap rcu read lock/unlock
around it.

-- 
Jens Axboe
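
To make the RCU idea above concrete, here is a rough userspace model of
the pattern, not kernel code and not a proposed patch. Every name in it
(struct ev_fd, struct ring_ctx, the ring_* functions and the model_rcu_*
stubs) is hypothetical and exists only for illustration. The stubs are
empty placeholders for where the kernel's rcu_read_lock(),
rcu_read_unlock() and synchronize_rcu() would go, and C11 atomics stand
in for rcu_dereference() and rcu_assign_pointer(). It only shows the
shape: the posted side reads the pointer inside a read-side section, and
register/unregister swap the pointer without quiescing the ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the eventfd context; count models its counter. */
struct ev_fd {
    uint64_t count;
};

/* Hypothetical stand-in for the ring context, reduced to the one field used. */
struct ring_ctx {
    _Atomic(struct ev_fd *) ev_fd;  /* would be an RCU-protected pointer */
};

/*
 * Empty userspace placeholders (assumption: single-threaded model).
 * They mark where the real RCU primitives would be called in the kernel.
 */
static inline void model_rcu_read_lock(void) {}
static inline void model_rcu_read_unlock(void) {}
static inline void model_synchronize_rcu(void) {}

/* Posted side: cheap read-side section around the pointer load and signal. */
static int ring_ev_posted(struct ring_ctx *ctx)
{
    int signalled = 0;

    model_rcu_read_lock();
    /* Acquire load models rcu_dereference(). */
    struct ev_fd *ev = atomic_load_explicit(&ctx->ev_fd, memory_order_acquire);
    if (ev) {
        ev->count++;    /* models adding to the eventfd counter */
        signalled = 1;
    }
    model_rcu_read_unlock();
    return signalled;
}

/* Register: publish the new pointer; returns -1 if one is already set. */
static int ring_register_eventfd(struct ring_ctx *ctx, struct ev_fd *ev)
{
    if (atomic_load_explicit(&ctx->ev_fd, memory_order_relaxed))
        return -1;
    /* Release store models rcu_assign_pointer(). */
    atomic_store_explicit(&ctx->ev_fd, ev, memory_order_release);
    return 0;
}

/* Unregister: unpublish, wait for readers, then hand back the old object. */
static struct ev_fd *ring_unregister_eventfd(struct ring_ctx *ctx)
{
    struct ev_fd *old =
        atomic_exchange_explicit(&ctx->ev_fd, NULL, memory_order_acq_rel);

    if (old)
        model_synchronize_rcu();  /* readers that saw 'old' have finished */
    return old;                   /* caller may release it only after this */
}
```

In this model the only cost on the posted side is the read-side section
plus one pointer load, which is the "fairly cheap" part. The
unregistering side is the one that waits for readers, and only when an
eventfd is actually removed.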