On Tue, 2024-07-30 at 21:25 +0100, Pavel Begunkov wrote:
> 
> Removing an entry or two once every minute is definitely not
> going to take 50% CPU, RCU machinery is running in background
> regardless of whether io_uring uses it or not, and it's pretty
> cheap considering ammortisation.
> 
> If anything it more sounds from your explanation like the
> scheduler makes a wrong decision and schedules out the sqpoll
> thread even though it could continue to run, but that's need
> a confirmation. Does the CPU your SQPOLL is pinned to stays
> 100% utilised?

Here are the facts as they are documented in the GitHub issue.

1. Despite thinking that I was doing NAPI busy polling, I was not, because my ring was not receiving any SQE after its initial setup. This is what the patch developed with your input,
https://lore.kernel.org/io-uring/382791dc97d208d88ee31e5ebb5b661a0453fb79.1722374371.git.olivier@xxxxxxxxxxxxxx/T/#u
is addressing. (BTW, I should check if there is such a thing, but I would love to know if the net code exposes a tracepoint when napi_busy_poll is called, because it is very tricky to know whether it is really happening or not. A quick way to check is sketched at the end of this mail.)

2. The moment a second ring that was receiving a lot of SQEs was attached to the sqpoll thread, the NAPI busy loop started to run for real and the sqpoll CPU usage inexplicably dropped from 99% to 55%.

3. Here is my kernel cmdline:

hugepages=72 isolcpus=0,1,2 nohz_full=0,1,2 rcu_nocbs=0,1,2 rcu_nocb_poll irqaffinity=3 idle=nomwait processor.max_cstate=1 intel_idle.max_cstate=1 nmi_watchdog=0

There is absolutely nothing else on CPU0, where the sqpoll thread affinity is set to run.

4. I got the idea of doing this:

echo common_pid == sqpoll_pid > /sys/kernel/tracing/events/sched/filter
echo 1 > /sys/kernel/tracing/events/sched/sched_switch/enable

and I have recorded over 1,000 context switches in 23 seconds, all with RCU-related kernel threads (a short recipe for reading the trace back is sketched at the end of this mail).

5. Just for the fun of checking it out, I disabled NAPI polling on my io_uring rings and the sqpoll thread magically returned to 99% CPU usage from 55%...

I am open to other explanations for what I have observed, but my current conclusion is based on what I am able to see... the evidence appears very convincing to me.

> > 
> > So is this a good idea in your opinion?
> 
> I believe that's a good thing, I've been prototyping a similar
> if not the same approach just today, i.e. user [un]registers
> napi instance by id you can get with SO_INCOMING_NAPI_ID.
> 

This is fantastic! I am super happy to see all this interest and activity around the NAPI busy polling feature, a feature that I am very fond of (along with io_uring).

I am looking forward to collaborating with you, Pavel, to make io_uring the best go-to solution for NAPI busy polling!

Greetings,
Olivier
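
P.S. Here is the quick check I had in mind for the tracepoint question in point 1. This is only a sketch, assuming tracefs is mounted at /sys/kernel/tracing and that the napi:napi_poll event (the only NAPI tracepoint I am aware of) is also fired from the busy-poll path, which I have not verified:

# list whatever NAPI trace events the running kernel exposes
ls /sys/kernel/tracing/events/napi/

# if napi_poll is listed, enable it and watch the live trace;
# seeing events while the sqpoll thread is the only consumer of the
# socket would confirm that busy polling is really happening
echo 1 > /sys/kernel/tracing/events/napi/napi_poll/enable
cat /sys/kernel/tracing/trace_pipe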
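
P.P.S. For completeness, a sketch of how the sched_switch recording from point 4 can be read back, assuming the same tracefs mount point; the grep pattern is only an illustration of how I spot the RCU kernel threads in the output:

# make sure the ring buffer is actually recording
echo 1 > /sys/kernel/tracing/tracing_on

# dump what was captured and look for RCU-related switch targets
grep -i rcu /sys/kernel/tracing/trace | head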