On Sat, Apr 02, 2022, Li, Rongqing wrote:
> > From: Paolo Bonzini <paolo.bonzini@xxxxxxxxx> On Behalf Of Paolo Bonzini
> >
> > On 4/2/22 06:01, Li RongQing wrote:
> > > pi_wakeup_handler() is used to wake up sleeping vCPUs via posted
> > > interrupts.  It uses list_for_each_entry(), whose list-head argument
> > > is the result of per_cpu(); that causes per_cpu() to be evaluated at
> > > least twice when there is one sleeping vCPU.
> > >
> > > So optimize pi_wakeup_handler() by reading the per-cpu variable once,
> > > which is safe under spinlock protection.

There's no need to protect reading the per-cpu variable with the spinlock,
only walking the list needs to be protected.  E.g. the code can be
compacted to:

	int cpu = smp_processor_id();
	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);
	struct list_head *wakeup_list = &per_cpu(wakeup_vcpus_on_cpu, cpu);
	struct vcpu_vmx *vmx;

	raw_spin_lock(spinlock);
	list_for_each_entry(vmx, wakeup_list, pi_wakeup_list) {
		if (pi_test_on(&vmx->pi_desc))
			kvm_vcpu_wake_up(&vmx->vcpu);
	}
	raw_spin_unlock(spinlock);

> > > and do the same for the per-CPU spinlock.
> >
> > What's the difference in the generated code?
> >
> This reduces the generated asm code by about one fifth ...
>
> There is a similar patch: 031e3bd8986fffe31e1ddbf5264cccfe30c9abd7

Is there a measurable performance improvement though?  I don't dislike the
patch, but it probably falls into the "technically an optimization but no
one will ever notice" category.