On Fri, Jul 9, 2021 at 10:47 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Wed, Jun 30, 2021, David Matlack wrote: > > On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote: > > > vcpu->cpu contains the last cpu a vcpu run/is running on; > > > kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI > > > to the last CPU; vcpu->cpu is updated on sched_in when a vcpu > > > moves to a new CPU, so it possible for the vcpu->cpu field to > > > be stale. > > This fails to document the actual bug being fixed, and why the fix is correct. > The fact that vcpu->cpu may be stale is itself not a bug, e.g. even with this > patch, the IPI can be sent to the "wrong", i.e. smp_send_reschedule() can still > consume a stale vcpu->cpu. > > The bug that's being fixed is that grabbing the potentially-stale vcpu->cpu > _before_ kvm_arch_vcpu_should_kick() can cause KVM to send an IPI to the wrong > CPU _and_ let the vCPU run longer than intended. The fix is correct because KVM > doesn't truly care about sending an IPI to the correct pCPU, it only cares about > about kicking the pCPU out of the guest. If the vCPU has exited and been loaded > on a different pCPU after kvm_arch_vcpu_should_kick(), then its mission has been > accomplished and the IPI (to the wrong pCPU) is truly spurious. Will improve the comment for V2. > > > kvm_vcpu_kick also read vcpu->cpu early with a plain read, > > > while vcpu->cpu could be concurrently written. This caused > > > a data race, found by kcsan: > > > > > > write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0: > > > kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018 > > > kvm_sched_in virt/kvm/kvm_main.c:4864 > > > > > > vs > > > kvm_vcpu_kick virt/kvm/kvm_main.c:2909 > > > kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287 > > > async_pf_execute virt/kvm/async_pf.c:79 > > > ... > > > > > > Use a READ_ONCE to atomically read vcpu->cpu and avoid the > > > data race. > > > > > > Found by kcsan & syzbot. > > > > > > Signed-off-by: Venkatesh Srinivas <venkateshs@xxxxxxxxxxxx> > > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > > Reviewed-by: David Matlack <dmatlack@xxxxxxxxxx> > > > > > --- > > > virt/kvm/kvm_main.c | 10 ++++++---- > > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > > index 46fb042837d2..0525f42afb7d 100644 > > > --- a/virt/kvm/kvm_main.c > > > +++ b/virt/kvm/kvm_main.c > > > @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up); > > > */ > > > void kvm_vcpu_kick(struct kvm_vcpu *vcpu) > > > { > > > - int me; > > > - int cpu = vcpu->cpu; > > > + int me, cpu; > > > > > > if (kvm_vcpu_wake_up(vcpu)) > > > return; > > > > > > me = get_cpu(); > > > - if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) > > > - if (kvm_arch_vcpu_should_kick(vcpu)) > > > + if (kvm_arch_vcpu_should_kick(vcpu)) { > > > + cpu = READ_ONCE(vcpu->cpu); > > > + WARN_ON_ONCE(cpu == me); > > > > nit: A comment here may be useful to explain the interaction with > > kvm_arch_vcpu_should_kick(). Took me a minute to understand why you > > added the warning. > > Agreed. Will do for V2 > > > + if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) > > > smp_send_reschedule(cpu); > > > + } > > > put_cpu(); > > > } > > > EXPORT_SYMBOL_GPL(kvm_vcpu_kick); > > > -- > > > 2.30.2 > > >