> There's no need to protect reading the per-cpu variable with the spinlock,
> only walking the list needs to be protected. E.g. the code can be compacted
> to:

Thanks

> > > > > > What's the difference in the generated code?
> > > > > > > This reduces the asm code by one fifth
> > ...
> > > there is a similar patch: 031e3bd8986fffe31e1ddbf5264cccfe30c9abd7
> >
> > Is there a measurable performance improvement though? I don't dislike the
> > patch, but it probably falls into the "technically an optimization but no
> > one will ever notice" category.

There is a small performance improvement when "perf bench sched pipe" is run
on an IPI-virtualization-capable CPU with halt/mwait not passed through to
the VM (with
https://patchwork.kernel.org/project/kvm/patch/20220304080725.18135-9-guang.zeng@xxxxxxxxx/
included).

Thanks
-Li