On 02/17/2012 11:13 AM, Alexander Graf wrote: > From: Scott Wood <scottwood@xxxxxxxxxxxxx> > > Chips such as e500mc that implement category E.HV in Power ISA 2.06 > provide hardware virtualization features, including a new MSR mode for > guest state. The guest OS can perform many operations without trapping > into the hypervisor, including transitions to and from guest userspace. > > Since we can use SRR1[GS] to reliably tell whether an exception came from > guest state, instead of messing around with IVPR, we use DO_KVM similarly > to book3s. > > Current issues include: > - Machine checks from guest state are not routed to the host handler. > - The guest can cause a host oops by executing an emulated instruction > in a page that lacks read permission. Existing e500/4xx support has > the same problem. > > Includes work by Ashish Kalra <Ashish.Kalra@xxxxxxxxxxxxx>, > Varun Sethi <Varun.Sethi@xxxxxxxxxxxxx>, and > Liu Yu <yu.liu@xxxxxxxxxxxxx>. > > Signed-off-by: Scott Wood <scottwood@xxxxxxxxxxxxx> > [agraf: remove pt_regs usage] > Signed-off-by: Alexander Graf <agraf@xxxxxxx> > --- Thanks for picking this up! > +static unsigned long get_guest_esr(struct kvm_vcpu *vcpu) > +{ > +#ifdef CONFIG_KVM_BOOKE_HV > + return mfspr(SPRN_ESR); > +#else > + return vcpu->arch.shared->esr; > +#endif > +} s/SPRN_ESR/SPRN_GESR/ > int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > unsigned int exit_nr) > { > - enum emulation_result er; > int r = RESUME_HOST; > > /* update before a new last_exit_type is rewritten */ > kvmppc_update_timing_stats(vcpu); > > + switch (exit_nr) { > + case BOOKE_INTERRUPT_EXTERNAL: > + do_IRQ(current->thread.regs); > + break; What will current->thread.regs point to here? Something on the stack from the last normal host exception entry? We probably want to create a pt_regs on the stack and at least provide PC, LR, and r1 for perfmon interrupts and such. > @@ -384,30 +558,56 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > switch (exit_nr) { > case BOOKE_INTERRUPT_MACHINE_CHECK: > - printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR)); > - kvmppc_dump_vcpu(vcpu); > - r = RESUME_HOST; > + kvm_resched(vcpu); > + r = RESUME_GUEST; > break; Leave this bit out (proper machine check handling will come later). > case BOOKE_INTERRUPT_PROGRAM: > - if (vcpu->arch.shared->msr & MSR_PR) { > + if (vcpu->arch.shared->msr & (MSR_PR | MSR_GS)) { > /* Program traps generated by user-level software must be handled > * by the guest kernel. */ > kvmppc_core_queue_program(vcpu, vcpu->arch.fault_esr); Should update the comment for why we're checking GS (i.e. we get a different trap for emulation with GS-mode). > +#define SET_VCPU(vcpu) \ > + PPC_STL vcpu, (THREAD + THREAD_KVM_VCPU)(r2) Change spaces to tab before PPC_STL > +#define LONGBYTES (BITS_PER_LONG / 8) > + > +#define VCPU_GPR(n) (VCPU_GPRS + (n * LONGBYTES)) > +#define VCPU_GUEST_SPRG(n) (VCPU_GUEST_SPRGS + (n * LONGBYTES)) > + > +/* The host stack layout: */ > +#define HOST_R1 (0 * LONGBYTES) /* Implied by stwu. */ > +#define HOST_CALLEE_LR (1 * LONGBYTES) > +#define HOST_RUN (2 * LONGBYTES) /* struct kvm_run */ > +/* > + * r2 is special: it holds 'current', and it made nonvolatile in the > + * kernel with the -ffixed-r2 gcc option. > + */ > +#define HOST_R2 (3 * LONGBYTES) > +#define HOST_NV_GPRS (4 * LONGBYTES) > +#define HOST_NV_GPR(n) (HOST_NV_GPRS + ((n - 14) * LONGBYTES)) > +#define HOST_MIN_STACK_SIZE (HOST_NV_GPR(31) + LONGBYTES) > +#define HOST_STACK_SIZE ((HOST_MIN_STACK_SIZE + 15) & ~15) /* Align. */ > +#define HOST_STACK_LR (HOST_STACK_SIZE + LONGBYTES) /* In caller stack frame. */ > + > +#define NEED_EMU 0x00000001 /* emulation -- save nv regs */ > +#define NEED_DEAR 0x00000002 /* save faulting DEAR */ > +#define NEED_ESR 0x00000004 /* save faulting ESR */ > + > +/* > + * On entry: > + * r4 = vcpu, r5 = srr0, r6 = srr1 > + * saved in vcpu: cr, ctr, r3-r13 > + */ > +.macro kvm_handler_common intno, srr0, flags > + mfspr r10, SPRN_PID > + lwz r8, VCPU_HOST_PID(r4) > + PPC_LL r11, VCPU_SHARED(r4) > + PPC_STL r14, VCPU_GPR(r14)(r4) /* We need a non-volatile GPR. */ > + li r14, \intno > + > + stw r10, VCPU_GUEST_PID(r4) > + mtspr SPRN_PID, r8 > + > + .if \flags & NEED_EMU > + lwz r9, VCPU_KVM(r4) > + .endif > + > +#ifdef CONFIG_KVM_EXIT_TIMING > + /* save exit time */ > +1: mfspr r7, SPRN_TBRU > + mfspr r8, SPRN_TBRL > + mfspr r9, SPRN_TBRU > + cmpw r9, r7 > + PPC_STL r8, VCPU_TIMING_EXIT_TBL(r4) > + bne- 1b > + PPC_STL r9, VCPU_TIMING_EXIT_TBU(r4) > +#endif As you pointed out to me last time, r9 is clobbered if exit timing is enabled (but see below, the load of VCPU_KVM can be removed along with the subsequent load of LVM_LPID(r9)). > + oris r8, r6, MSR_CE@h > +#ifndef CONFIG_64BIT > + stw r6, (VCPU_SHARED_MSR + 4)(r11) > +#else > + std r6, (VCPU_SHARED_MSR)(r11) > +#endif > + ori r8, r8, MSR_ME | MSR_RI > + PPC_STL r5, VCPU_PC(r4) > + > + /* > + * Make sure CE/ME/RI are set (if appropriate for exception type) > + * whether or not the guest had it set. Since mfmsr/mtmsr are > + * somewhat expensive, skip in the common case where the guest > + * had all these bits set (and thus they're still set if > + * appropriate for the exception type). > + */ > + cmpw r6, r8 > + .if \flags & NEED_EMU > + lwz r9, KVM_LPID(r9) > + .endif Where do we use r9? This is probably left over from something old. -Scott -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html