On 18.06.2011, at 10:35, Paul Mackerras wrote:

> This adds support for KVM running on 64-bit Book 3S processors,
> specifically POWER7, in hypervisor mode. Using hypervisor mode means
> that the guest can use the processor's supervisor mode. That means
> that the guest can execute privileged instructions and access privileged
> registers itself without trapping to the host. This gives excellent
> performance, but does mean that KVM cannot emulate a processor
> architecture other than the one that the hardware implements.
>
> This code assumes that the guest is running paravirtualized using the
> PAPR (Power Architecture Platform Requirements) interface, which is the
> interface that IBM's PowerVM hypervisor uses. That means that existing
> Linux distributions that run on IBM pSeries machines will also run
> under KVM without modification. In order to communicate the PAPR
> hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
> to include/linux/kvm.h.
>
> Currently the choice between book3s_hv support and book3s_pr support
> (i.e. the existing code, which runs the guest in user mode) has to be
> made at kernel configuration time, so a given kernel binary can only
> do one or the other.
>
> This new book3s_hv code doesn't support MMIO emulation at present.
> Since we are running paravirtualized guests, this isn't a serious
> restriction.
>
> With the guest running in supervisor mode, most exceptions go straight
> to the guest. We will never get data or instruction storage or segment
> interrupts, alignment interrupts, decrementer interrupts, program
> interrupts, single-step interrupts, etc., coming to the hypervisor from
> the guest. Therefore this introduces a new KVMTEST_NONHV macro for the
> exception entry path so that we don't have to do the KVM test on entry
> to those exception handlers.
>
> We do however get hypervisor decrementer, hypervisor data storage,
> hypervisor instruction storage, and hypervisor emulation assist
> interrupts, so we have to handle those.
>
> In hypervisor mode, real-mode accesses can access all of RAM, not just
> a limited amount. Therefore we put all the guest state in the vcpu.arch
> and use the shadow_vcpu in the PACA only for temporary scratch space.
> We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
> anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
> We don't have a shared page with the guest, but we still need a
> kvm_vcpu_arch_shared struct to store the values of various registers,
> so we include one in the vcpu_arch struct.
>
> The POWER7 processor has a restriction that all threads in a core have
> to be in the same partition. MMU-on kernel code counts as a partition
> (partition 0), so we have to do a partition switch on every entry to and
> exit from the guest. At present we require the host and guest to run
> in single-thread mode because of this hardware restriction.
>
> This code allocates a hashed page table for the guest and initializes
> it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We
> require that the guest memory is allocated using 16MB huge pages, in
> order to simplify the low-level memory management. This also means that
> we can get away without tracking paging activity in the host for now,
> since huge pages can't be paged or swapped.
>
> Signed-off-by: Paul Mackerras <paulus@xxxxxxxxx>
>
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index b7baff7..3662ecc 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -20,7 +20,6 @@ config KVM
>  	bool
>  	select PREEMPT_NOTIFIERS
>  	select ANON_INODES
> -	select KVM_MMIO
>
>  config KVM_BOOK3S_HANDLER
>  	bool
> @@ -28,16 +27,22 @@ config KVM_BOOK3S_HANDLER
>  config KVM_BOOK3S_32_HANDLER
>  	bool
>  	select KVM_BOOK3S_HANDLER
> +	select KVM_MMIO
>
>  config KVM_BOOK3S_64_HANDLER
>  	bool
>  	select KVM_BOOK3S_HANDLER
>
> +config KVM_BOOK3S_PR
> +	bool
> +	select KVM_MMIO
> +
>  config KVM_BOOK3S_32
>  	tristate "KVM support for PowerPC book3s_32 processors"
>  	depends on EXPERIMENTAL && PPC_BOOK3S_32 && !SMP && !PTE_64BIT
>  	select KVM
>  	select KVM_BOOK3S_32_HANDLER
> +	select KVM_BOOK3S_PR

KVM_BOOK3S_32 is the equivalent of KVM_BOOK3S_64_PR, right? We should rename both or neither to stay consistent.

>  	---help---
>  	  Support running unmodified book3s_32 guest kernels
>  	  in virtual machines on book3s_32 host processors.
> @@ -48,10 +53,38 @@ config KVM_BOOK3S_32
>  	  If unsure, say N.
>
>  config KVM_BOOK3S_64
> -	tristate "KVM support for PowerPC book3s_64 processors"
> +	bool

This means that if a user had KVM selected in his config before, it ends up unset after applying this patch. Is there any good way to keep users from running into that? Maybe we could define KVM_BOOK3S_64 to be the PR type and introduce a new option that fulfills the role of KVM_BOOK3S_64 as you intended it here?

Alex
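
A minimal sketch of the second suggestion above, assuming KVM_BOOK3S_64 keeps its name and tristate prompt and defaults to the PR code, while a separate option opts in to hypervisor mode. The option name KVM_BOOK3S_64_HV, its prompt text and every depends/select line here are illustrative assumptions, not taken from the patch:

```
# Hypothetical layout; not from the patch under review.
config KVM_BOOK3S_64
	tristate "KVM support for PowerPC book3s_64 processors"
	select KVM
	select KVM_BOOK3S_64_HANDLER
	# An existing .config keeps this symbol set and gets the PR flavour.
	select KVM_BOOK3S_PR if !KVM_BOOK3S_64_HV

# Assumed name for the new hypervisor-mode option.
config KVM_BOOK3S_64_HV
	bool "KVM support for POWER7 using hypervisor mode in host"
	depends on KVM_BOOK3S_64
```

With a layout along these lines, an old .config that already sets KVM_BOOK3S_64 would still resolve to the PR implementation, and hypervisor mode would be an explicit opt-in.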