Re: [PATCH 3/3] KVM: ARM: Atomically read guest instruction

On Thu, Aug 23, 2012 at 3:20 AM, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
> Ben Herrenschmidt pointed out a nasty race:
> (1) A complicated guest mmio instruction traps.
> (2) The hardware doesn't tell us enough, so we need to read the actual
>     instruction which was being executed.
> (3) KVM maps the instruction virtual address to a physical address.
> (4) The guest (SMP) swaps out that page, and fills it with something else.
> (5) We read the physical address, but now that's the wrong thing.
>
> This is so rare that we can stop all the vcpus from running when it
> happens (if no VCPUs are running, this just means taking and
> releasing a spinlock).
>
> Signed-off-by: Rusty Russell <rusty.russell@xxxxxxxxxx>
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 79b5318..1c0fa75 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -144,6 +144,9 @@ struct kvm_vcpu_arch {
>         int last_pcpu;
>         cpumask_t require_dcache_flush;
>
> +       /* Don't run the guest: see copy_current_insn() */
> +       bool pause;
> +
>         /* IO related fields */
>         bool mmio_sign_extend;  /* for byte/halfword loads */
>         u32 mmio_rd;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 09a6800..404f4d4 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -614,7 +614,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>                 kvm_guest_enter();
>                 vcpu->mode = IN_GUEST_MODE;
>
> -               ret = __kvm_vcpu_run(vcpu);
> +               smp_mb(); /* set mode before reading vcpu->arch.pause */
> +               if (unlikely(vcpu->arch.pause)) {
> +                       /* This means ignore, try again. */
> +                       ret = ARM_EXCEPTION_IRQ;
> +               } else {
> +                       ret = __kvm_vcpu_run(vcpu);
> +               }

It makes me a little sad to wrap this important "I'm going to change
the world and run a VM" function call in the else-clause of a super
rare corner case, but I see it gets the job done nicely.
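
For anyone else following along: what makes this safe is the
store/load pairing between the two smp_mb() calls. The vcpu thread
stores ->mode and then reads ->pause, while copy_current_insn() stores
->pause and then reads ->mode. Here is a userspace mock-up of the
handshake, with C11 atomics standing in for smp_mb(); all the names
are invented, it is just to spell out the ordering argument:

    #include <stdatomic.h>

    static atomic_int mode;        /* 1 == IN_GUEST_MODE */
    static atomic_int pause_flag;  /* stands in for vcpu->arch.pause */

    static void vcpu_thread(void)
    {
        atomic_store_explicit(&mode, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst); /* the smp_mb() in arm.c */
        if (!atomic_load_explicit(&pause_flag, memory_order_relaxed)) {
            /* Enter the guest: if the pauser raced with us, it
             * saw mode == 1 and will IPI us out before copying. */
        }
    }

    static void pauser_thread(void)
    {
        atomic_store_explicit(&pause_flag, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst); /* the smp_mb() in mmu.c */
        if (atomic_load_explicit(&mode, memory_order_relaxed) == 1) {
            /* vcpu may already be in the guest: kick it with
             * the do_nothing IPI before copying the insn. */
        }
    }

The fences make it impossible for both loads to miss the other side's
store: either the vcpu thread sees pause and bails out, or the pauser
sees IN_GUEST_MODE and kicks it, so no vcpu can be running while the
instruction is copied.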

>
>                 vcpu->mode = OUTSIDE_GUEST_MODE;
>                 vcpu->arch.last_pcpu = smp_processor_id();
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 0043dbb..16256e9 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -636,6 +636,58 @@ static bool copy_from_guest_va(struct kvm_vcpu *vcpu,
>         return true;
>  }
>
> +/* Just ensure we're not running the guest. */
> +static void do_nothing(void *info)
> +{
> +}
> +
> +/*
> + * We have to be very careful copying memory from a running (i.e. SMP) guest.
> + * Another CPU may remap the page (e.g. swap out a userspace text page) as we
> + * read the instruction.  Unlike normal hardware operation, emulating an
> + * instruction means we translate the virtual address to a physical one
> + * and then read that memory as two separate steps, so the pair is not
> + * atomic.
> + *
> + * Fortunately this is so rare (we don't usually need the instruction), we
> + * can go very slowly and no one will mind.
> + */
> +static bool copy_current_insn(struct kvm_vcpu *vcpu,
> +                            unsigned long *instr, size_t instr_len)
> +{
> +       int i;
> +       bool ret;
> +       struct kvm_vcpu *v;
> +
> +       /* Don't cross with IPIs in kvm_main.c */
> +       spin_lock(&vcpu->kvm->mmu_lock);
> +
> +       /* Tell them all to pause, so no more will enter guest. */
> +       kvm_for_each_vcpu(i, v, vcpu->kvm)
> +               v->arch.pause = true;
> +
> +       /* Set ->pause before we read ->mode */
> +       smp_mb();
> +
> +       /* Kick out any which are still running. */
> +       kvm_for_each_vcpu(i, v, vcpu->kvm) {
> +               /* Guest could exit now, making cpu wrong. That's OK. */
> +               if (kvm_vcpu_exiting_guest_mode(v) == IN_GUEST_MODE)
> +                       smp_call_function_single(v->cpu, do_nothing, NULL, 1);
> +       }
> +
> +       /* Now no guest is running, so the va->pa map and copy are atomic. */
> +       ret = copy_from_guest_va(vcpu, instr, vcpu->arch.regs.pc, instr_len,
> +                                vcpu_mode_priv(vcpu));
> +
> +       /* Release them all. */
> +       kvm_for_each_vcpu(i, v, vcpu->kvm)
> +               v->arch.pause = false;
> +
> +       spin_unlock(&vcpu->kvm->mmu_lock);
> +
> +       return ret;
> +}
> +
>  /**
>   * invalid_io_mem_abort -- Handle I/O aborts ISV bit is clear
>   *
> @@ -660,8 +712,7 @@ static int invalid_io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>         }
>
>         /* If it fails (SMP race?), we reenter guest for it to retry. */
> -       if (!copy_from_guest_va(vcpu, &instr, vcpu->arch.regs.pc, sizeof(instr),
> -                               vcpu_mode_priv(vcpu)))
> +       if (!copy_current_insn(vcpu, &instr, sizeof(instr)))
>                 return 1;
>
>         return kvm_emulate_mmio_ls(vcpu, fault_ipa, instr);

Otherwise this looks good. We should add a trace event for when this
race actually happens, and throw a big party when it fires!
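
Something like this in arch/arm/kvm/trace.h would do; untested sketch,
and the event name and hook point are just my suggestion:

    TRACE_EVENT(kvm_insn_copy_race,
            TP_PROTO(unsigned long pc),
            TP_ARGS(pc),

            TP_STRUCT__entry(
                    __field(unsigned long, pc)
            ),

            TP_fast_assign(
                    __entry->pc = pc;
            ),

            TP_printk("insn copy raced with guest remap at pc 0x%08lx",
                      __entry->pc)
    );

with a trace_kvm_insn_copy_race(vcpu->arch.regs.pc) call in
copy_current_insn() when copy_from_guest_va() fails.  That gives us
the guest PC for the party invitation.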

-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm

