Re: [PATCH v8 07/15] ARM: KVM: Hypervisor initialization

On Thu, Jun 28, 2012 at 6:35 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Fri, Jun 15, 2012 at 03:07:59PM -0400, Christoffer Dall wrote:
>> Sets up the required registers to run code in Hyp-mode from the kernel.
>>
>> By setting the HVBAR the kernel can execute code in Hyp-mode with
>> the MMU disabled. The HVBAR initially points to initialization code,
>> which initializes other Hyp-mode registers and enables the MMU
>> for Hyp-mode. Afterwards, the HVBAR is changed to point to KVM
>> Hyp vectors used to catch guest faults and to switch to Hyp mode
>> to perform a world-switch into a KVM guest.
>>
>> Also provides memory mapping code to map required code pages and data
>> structures accessed in Hyp mode at the same virtual address as the
>> host kernel virtual addresses, but which conforms to the architectural
>> requirements for translations in Hyp mode. This interface is added in
>> arch/arm/kvm/mmu.c and is comprised of:
>>  - create_hyp_mappings(from, to);
>>  - free_hyp_pmds();
>>
>> Note: The initialization mechanism currently relies on an SMC #0 call
>> to the secure monitor, which was merely a fast way of getting to the
>> hypervisor. Dave Martin and Rusty Russell have patches out to make the
>> boot-wrapper and the kernel boot in Hyp-mode and set up a generic way for
>> hypervisors to get access to Hyp-mode if the boot-loader allows such
>> access.
>>
>> Signed-off-by: Christoffer Dall <c.dall@xxxxxxxxxxxxxxxxxxxxxx>
>> ---
>>  arch/arm/include/asm/kvm_arm.h              |  117 +++++++++++++++++++
>>  arch/arm/include/asm/kvm_asm.h              |   22 +++
>>  arch/arm/include/asm/kvm_mmu.h              |   37 ++++++
>>  arch/arm/include/asm/pgtable-3level-hwdef.h |    4 +
>>  arch/arm/include/asm/pgtable-3level.h       |    4 +
>>  arch/arm/include/asm/pgtable.h              |    1
>>  arch/arm/kvm/arm.c                          |  167 +++++++++++++++++++++++++++
>>  arch/arm/kvm/exports.c                      |   15 ++
>>  arch/arm/kvm/init.S                         |   99 ++++++++++++++++
>>  arch/arm/kvm/interrupts.S                   |   47 +++++++
>>  arch/arm/kvm/mmu.c                          |  170 +++++++++++++++++++++++++++
>>  mm/memory.c                                 |    2
>>  12 files changed, 685 insertions(+)
>>  create mode 100644 arch/arm/include/asm/kvm_arm.h
>>  create mode 100644 arch/arm/include/asm/kvm_mmu.h
>>
>> diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
>> new file mode 100644
>> index 0000000..7f30cbd
>> --- /dev/null
>> +++ b/arch/arm/include/asm/kvm_arm.h
>> @@ -0,0 +1,117 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License, version 2, as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>> + *
>> + */
>> +
>> +#ifndef __KVM_ARM_H__
>> +#define __KVM_ARM_H__
>> +
>> +#include <asm/types.h>
>> +
>> +/* Hyp Configuration Register (HCR) bits */
>> +#define HCR_TGE              (1 << 27)
>> +#define HCR_TVM              (1 << 26)
>> +#define HCR_TTLB     (1 << 25)
>> +#define HCR_TPU              (1 << 24)
>> +#define HCR_TPC              (1 << 23)
>> +#define HCR_TSW              (1 << 22)
>> +#define HCR_TAC              (1 << 21)
>> +#define HCR_TIDCP    (1 << 20)
>> +#define HCR_TSC              (1 << 19)
>> +#define HCR_TID3     (1 << 18)
>> +#define HCR_TID2     (1 << 17)
>> +#define HCR_TID1     (1 << 16)
>> +#define HCR_TID0     (1 << 15)
>> +#define HCR_TWE              (1 << 14)
>> +#define HCR_TWI              (1 << 13)
>> +#define HCR_DC               (1 << 12)
>> +#define HCR_BSU              (3 << 10)
>> +#define HCR_BSU_IS   (1 << 10)
>> +#define HCR_FB               (1 << 9)
>> +#define HCR_VA               (1 << 8)
>> +#define HCR_VI               (1 << 7)
>> +#define HCR_VF               (1 << 6)
>> +#define HCR_AMO              (1 << 5)
>> +#define HCR_IMO              (1 << 4)
>> +#define HCR_FMO              (1 << 3)
>> +#define HCR_PTW              (1 << 2)
>> +#define HCR_SWIO     (1 << 1)
>> +#define HCR_VM               1
>> +
>> +/*
>> + * The bits we set in HCR:
>> + * TAC:              Trap ACTLR
>> + * TSC:              Trap SMC
>> + * TWI:              Trap WFI
>> + * BSU_IS:   Upgrade barriers to the inner shareable domain
>> + * FB:               Force broadcast of all maintenance operations
>> + * AMO:              Override CPSR.A and enable signaling with VA
>> + * IMO:              Override CPSR.I and enable signaling with VI
>> + * FMO:              Override CPSR.F and enable signaling with VF
>> + * SWIO:     Turn set/way invalidates into set/way clean+invalidate
>> + */
>> +#define HCR_GUEST_MASK (HCR_TSC | HCR_TWI | HCR_VM | HCR_BSU_IS | HCR_FB | \
>> +                     HCR_AMO | HCR_IMO | HCR_FMO | HCR_SWIO)
>> +
>> +/* Hyp System Control Register (HSCTLR) bits */
>> +#define HSCTLR_TE    (1 << 30)
>> +#define HSCTLR_EE    (1 << 25)
>> +#define HSCTLR_FI    (1 << 21)
>> +#define HSCTLR_WXN   (1 << 19)
>> +#define HSCTLR_I     (1 << 12)
>> +#define HSCTLR_C     (1 << 2)
>> +#define HSCTLR_A     (1 << 1)
>> +#define HSCTLR_M     1
>> +#define HSCTLR_MASK  (HSCTLR_M | HSCTLR_A | HSCTLR_C | HSCTLR_I | \
>> +                      HSCTLR_WXN | HSCTLR_FI | HSCTLR_EE | HSCTLR_TE)
>> +
>> +/* TTBCR and HTCR Registers bits */
>> +#define TTBCR_EAE    (1 << 31)
>> +#define TTBCR_IMP    (1 << 30)
>> +#define TTBCR_SH1    (3 << 28)
>> +#define TTBCR_ORGN1  (3 << 26)
>> +#define TTBCR_IRGN1  (3 << 24)
>> +#define TTBCR_EPD1   (1 << 23)
>> +#define TTBCR_A1     (1 << 22)
>> +#define TTBCR_T1SZ   (3 << 16)
>> +#define TTBCR_SH0    (3 << 12)
>> +#define TTBCR_ORGN0  (3 << 10)
>> +#define TTBCR_IRGN0  (3 << 8)
>> +#define TTBCR_EPD0   (1 << 7)
>> +#define TTBCR_T0SZ   3
>> +#define HTCR_MASK    (TTBCR_T0SZ | TTBCR_IRGN0 | TTBCR_ORGN0 | TTBCR_SH0)
>> +
>> +
>> +/* Virtualization Translation Control Register (VTCR) bits */
>> +#define VTCR_SH0     (3 << 12)
>> +#define VTCR_ORGN0   (3 << 10)
>> +#define VTCR_IRGN0   (3 << 8)
>> +#define VTCR_SL0     (3 << 6)
>> +#define VTCR_S               (1 << 4)
>> +#define VTCR_T0SZ    3
>> +#define VTCR_MASK    (VTCR_SH0 | VTCR_ORGN0 | VTCR_IRGN0 | VTCR_SL0 | \
>> +                      VTCR_S | VTCR_T0SZ)
>> +#define VTCR_HTCR_SH (VTCR_SH0 | VTCR_ORGN0 | VTCR_IRGN0)
>> +#define VTCR_SL_L2   0               /* Starting-level: 2 */
>> +#define VTCR_SL_L1   (1 << 6)        /* Starting-level: 1 */
>> +#define VTCR_GUEST_SL        VTCR_SL_L1
>> +#define VTCR_GUEST_T0SZ      0
>> +#if VTCR_GUEST_SL == 0
>> +#define VTTBR_X              (14 - VTCR_GUEST_T0SZ)
>> +#else
>> +#define VTTBR_X              (5 - VTCR_GUEST_T0SZ)
>> +#endif
>> +
>> +
>> +#endif /* __KVM_ARM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index c3d4458..69afdf3 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -24,5 +24,27 @@
>>  #define ARM_EXCEPTION_DATA_ABORT  4
>>  #define ARM_EXCEPTION_IRQ      5
>>  #define ARM_EXCEPTION_FIQ      6
>> +#define ARM_EXCEPTION_HVC      7
>> +
>> +/*
>> + * SMC Hypervisor API call number
>> + */
>> +#define SMCHYP_HVBAR_W 0xfffffff0
>> +
>> +#ifndef __ASSEMBLY__
>> +struct kvm_vcpu;
>> +
>> +extern char __kvm_hyp_init[];
>> +extern char __kvm_hyp_init_end[];
>> +
>> +extern char __kvm_hyp_vector[];
>> +
>> +extern char __kvm_hyp_code_start[];
>> +extern char __kvm_hyp_code_end[];
>> +
>> +extern void __kvm_flush_vm_context(void);
>> +
>> +extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +#endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> new file mode 100644
>> index 0000000..1aa1af4
>> --- /dev/null
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -0,0 +1,37 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License, version 2, as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>> + *
>> + */
>> +
>> +#ifndef __ARM_KVM_MMU_H__
>> +#define __ARM_KVM_MMU_H__
>> +
>> +/*
>> + * The architecture supports 40-bit IPA as input to the 2nd stage translations
>> + * and PTRS_PER_PGD2 could therefore be 1024.
>> + *
>> + * To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
>> + * for now, but remember that the level-1 table must be aligned to its size.
>> + */
>> +#define PTRS_PER_PGD2        512
>> +#define PGD2_ORDER   get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
>> +
>> +int create_hyp_mappings(void *from, void *to);
>> +void free_hyp_pmds(void);
>> +
>> +int kvm_hyp_pgd_alloc(void);
>> +pgd_t *kvm_hyp_pgd_get(void);
>> +void kvm_hyp_pgd_free(void);
>> +
>> +#endif /* __ARM_KVM_MMU_H__ */
>> diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
>> index a2d404e..18f5cef 100644
>> --- a/arch/arm/include/asm/pgtable-3level-hwdef.h
>> +++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
>> @@ -32,6 +32,9 @@
>>  #define PMD_TYPE_SECT                (_AT(pmdval_t, 1) << 0)
>>  #define PMD_BIT4             (_AT(pmdval_t, 0))
>>  #define PMD_DOMAIN(x)                (_AT(pmdval_t, 0))
>> +#define PMD_APTABLE_SHIFT    (61)
>> +#define PMD_APTABLE          (_AT(pgdval_t, 3) << PMD_APTABLE_SHIFT)
>> +#define PMD_PXNTABLE         (_AT(pgdval_t, 1) << 59)
>>
>>  /*
>>   *   - section
>> @@ -41,6 +44,7 @@
>>  #define PMD_SECT_S           (_AT(pmdval_t, 3) << 8)
>>  #define PMD_SECT_AF          (_AT(pmdval_t, 1) << 10)
>>  #define PMD_SECT_nG          (_AT(pmdval_t, 1) << 11)
>> +#define PMD_SECT_PXN         (_AT(pmdval_t, 1) << 53)
>>  #define PMD_SECT_XN          (_AT(pmdval_t, 1) << 54)
>>  #define PMD_SECT_AP_WRITE    (_AT(pmdval_t, 0))
>>  #define PMD_SECT_AP_READ     (_AT(pmdval_t, 0))
>> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
>> index b249035..1169a8a 100644
>> --- a/arch/arm/include/asm/pgtable-3level.h
>> +++ b/arch/arm/include/asm/pgtable-3level.h
>> @@ -107,6 +107,10 @@
>>  #define pud_none(pud)                (!pud_val(pud))
>>  #define pud_bad(pud)         (!(pud_val(pud) & 2))
>>  #define pud_present(pud)     (pud_val(pud))
>> +#define pmd_table(pmd)               ((pmd_val(pmd) & PMD_TYPE_MASK) == \
>> +                                              PMD_TYPE_TABLE)
>> +#define pmd_sect(pmd)                ((pmd_val(pmd) & PMD_TYPE_MASK) == \
>> +                                              PMD_TYPE_SECT)
>>
>>  #define pud_clear(pudp)                      \
>>       do {                            \
>> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
>> index c7bd809..4b72287 100644
>> --- a/arch/arm/include/asm/pgtable.h
>> +++ b/arch/arm/include/asm/pgtable.h
>> @@ -82,6 +82,7 @@ extern pgprot_t             pgprot_kernel;
>>  #define PAGE_READONLY_EXEC   _MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY)
>>  #define PAGE_KERNEL          _MOD_PROT(pgprot_kernel, L_PTE_XN)
>>  #define PAGE_KERNEL_EXEC     pgprot_kernel
>> +#define PAGE_HYP             _MOD_PROT(pgprot_kernel, L_PTE_USER)
>>
>>  #define __PAGE_NONE          __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN)
>>  #define __PAGE_SHARED                __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 5992d90..4c61d3c 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -31,6 +31,12 @@
>>  #include <asm/uaccess.h>
>>  #include <asm/ptrace.h>
>>  #include <asm/mman.h>
>> +#include <asm/tlbflush.h>
>> +#include <asm/kvm_arm.h>
>> +#include <asm/kvm_asm.h>
>> +#include <asm/kvm_mmu.h>
>> +
>> +static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
>>
>>  int kvm_arch_hardware_enable(void *garbage)
>>  {
>> @@ -255,13 +261,174 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>       return -EINVAL;
>>  }
>>
>> +static void cpu_set_vector(void *vector)
>> +{
>> +     unsigned long vector_ptr;
>> +     unsigned long smc_hyp_nr;
>> +
>> +     vector_ptr = (unsigned long)vector;
>> +     smc_hyp_nr = SMCHYP_HVBAR_W;
>> +
>> +     /*
>> +      * Set the HVBAR
>> +      */
>> +     asm volatile (
>> +             "mov    r0, %[vector_ptr]\n\t"
>> +             "mov    r7, %[smc_hyp_nr]\n\t"
>> +             "smc    #0\n\t" : :
>> +             [vector_ptr] "r" (vector_ptr),
>> +             [smc_hyp_nr] "r" (smc_hyp_nr) :
>> +             "r0", "r7");
>> +}
>> +
>> +static void cpu_init_hyp_mode(void *vector)
>> +{
>> +     unsigned long pgd_ptr;
>> +     unsigned long hyp_stack_ptr;
>> +     unsigned long stack_page;
>> +
>> +     cpu_set_vector(vector);
>> +
>> +     pgd_ptr = virt_to_phys(kvm_hyp_pgd_get());
>> +     stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
>> +     hyp_stack_ptr = stack_page + PAGE_SIZE;
>> +
>> +     /*
>> +      * Call initialization code
>> +      */
>> +     asm volatile (
>> +             "mov    r0, %[pgd_ptr]\n\t"
>> +             "mov    r1, %[hyp_stack_ptr]\n\t"
>> +             "hvc    #0\n\t" : :
>> +             [pgd_ptr] "r" (pgd_ptr),
>> +             [hyp_stack_ptr] "r" (hyp_stack_ptr) :
>> +             "r0", "r1");
>> +}
>> +
>> +/**
>> + * Inits Hyp-mode on all online CPUs
>> + */
>> +static int init_hyp_mode(void)
>> +{
>> +     phys_addr_t init_phys_addr, init_end_phys_addr;
>> +     int cpu;
>> +     int err = 0;
>> +
>> +     /*
>> +      * Allocate stack pages for Hypervisor-mode
>> +      */
>> +     for_each_possible_cpu(cpu) {
>> +             unsigned long stack_page;
>> +
>> +             stack_page = __get_free_page(GFP_KERNEL);
>> +             if (!stack_page) {
>> +                     err = -ENOMEM;
>> +                     goto out_free_stack_pages;
>> +             }
>> +
>> +             per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
>> +     }
>> +
>> +     /*
>> +      * Allocate Hyp level-1 page table
>> +      */
>> +     err = kvm_hyp_pgd_alloc();
>> +     if (err)
>> +             goto out_free_stack_pages;
>> +
>> +     init_phys_addr = virt_to_phys(__kvm_hyp_init);
>> +     init_end_phys_addr = virt_to_phys(__kvm_hyp_init_end);
>> +     BUG_ON(init_phys_addr & 0x1f);
>> +
>> +     /*
>> +      * Create identity mapping for the init code.
>> +      */
>> +     hyp_idmap_add(kvm_hyp_pgd_get(),
>> +                   (unsigned long)init_phys_addr,
>> +                   (unsigned long)init_end_phys_addr);
>> +
>> +     /*
>> +      * Execute the init code on each CPU.
>> +      *
>> +      * Note: The stack is not mapped yet, so do nothing other than
>> +      * initialize the hypervisor mode on each CPU, using a local stack
>> +      * space for temporary storage.
>> +      */
>> +     for_each_online_cpu(cpu) {
>> +             smp_call_function_single(cpu, cpu_init_hyp_mode,
>> +                                      (void *)(long)init_phys_addr, 1);
>> +     }
>> +
>> +     /*
>> +      * Unmap the identity mapping
>> +      */
>> +     hyp_idmap_del(kvm_hyp_pgd_get(),
>> +                   (unsigned long)init_phys_addr,
>> +                   (unsigned long)init_end_phys_addr);
>> +
>> +     /*
>> +      * Map the Hyp-code called directly from the host
>> +      */
>> +     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     if (err) {
>> +             kvm_err("Cannot map world-switch code\n");
>> +             goto out_free_mappings;
>> +     }
>> +
>> +     /*
>> +      * Map the Hyp stack pages
>> +      */
>> +     for_each_possible_cpu(cpu) {
>> +             char *stack_page = (char *)per_cpu(kvm_arm_hyp_stack_page, cpu);
>> +             err = create_hyp_mappings(stack_page, stack_page + PAGE_SIZE);
>> +
>> +             if (err) {
>> +                     kvm_err("Cannot map hyp stack\n");
>> +                     goto out_free_mappings;
>> +             }
>> +     }
>> +
>> +     /*
>> +      * Set the HVBAR to the virtual kernel address
>> +      */
>> +     for_each_online_cpu(cpu)
>> +             smp_call_function_single(cpu, cpu_set_vector,
>> +                                      __kvm_hyp_vector, 1);
>> +
>> +     return 0;
>> +out_free_mappings:
>> +     free_hyp_pmds();
>> +     kvm_hyp_pgd_free();
>> +out_free_stack_pages:
>> +     for_each_possible_cpu(cpu)
>> +             free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
>
> Should set per_cpu(kvm_arm_hyp_stack_page, cpu) to NULL after freeing it.
>

Why? This runs as part of the init code, so the only way it could ever
run again is for the module to be unloaded and reloaded, in which case
the variable would be re-initialized to zero as per the static
declaration, no?
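
For reference, here is a minimal userspace sketch of what the suggested
change to the error path would look like. It is a model only, not the
patch: NR_CPUS, hyp_stack_page[], model_get_page(), model_free_page() and
the fail_at failure injection are hypothetical stand-ins for the per-CPU
variable, __get_free_page() and free_page(). Zeroing the slot only matters
if the slots can be revisited without being re-initialized; if the static
per-CPU storage is always fresh on module load, it is a no-op.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS 4	/* hypothetical CPU count for this userspace model */

/* Stand-in for the per-CPU kvm_arm_hyp_stack_page variable. */
static uintptr_t hyp_stack_page[NR_CPUS];

/* Stand-in for __get_free_page(): returns 0 on failure. */
static uintptr_t model_get_page(void)
{
	return (uintptr_t)malloc(4096);
}

/* Stand-in for free_page(): like the kernel helper, address 0 is ignored. */
static void model_free_page(uintptr_t addr)
{
	if (addr)
		free((void *)addr);
}

/*
 * Allocate one stack page per CPU. fail_at injects an allocation failure
 * at that CPU index; pass a negative value for no injected failure.
 */
static int alloc_stack_pages(int fail_at)
{
	int cpu;

	assert(fail_at < NR_CPUS);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		uintptr_t page = (cpu == fail_at) ? 0 : model_get_page();

		if (!page)
			goto out_free_stack_pages;
		hyp_stack_page[cpu] = page;
	}
	return 0;

out_free_stack_pages:
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		model_free_page(hyp_stack_page[cpu]);
		/* The suggested addition: leave no stale pointer behind. */
		hyp_stack_page[cpu] = 0;
	}
	return -1;
}
```

With the slots zeroed, calling alloc_stack_pages() again after a failure
is safe in this model; without the zeroing, a second failure would free
the stale addresses a second time.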

> Is there CPU hotplug support on ARM?
>

I don't think (read: hope) so. ARM people?
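
The question is relevant because the patch allocates stack pages with
for_each_possible_cpu() but runs the Hyp init code only with
for_each_online_cpu(). A small userspace model of the consequence,
assuming a CPU can come online after init_hyp_mode() has run (all names
below are hypothetical, and no kernel API is used):

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this userspace model */

static bool cpu_online[NR_CPUS];	/* models the online CPU mask */
static bool hyp_ready[NR_CPUS];		/* set once Hyp init ran on a CPU */

/* Stand-in for the per-CPU init call (cpu_init_hyp_mode() in the patch). */
static void model_init_hyp(int cpu)
{
	hyp_ready[cpu] = true;
}

/* One-shot init over the CPUs that are online right now. */
static void model_init_hyp_mode(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			model_init_hyp(cpu);
}

/*
 * Bring a CPU online. with_hook models a hotplug callback that re-runs
 * the per-CPU init; without it the new CPU never enters the init code.
 */
static void model_cpu_up(int cpu, bool with_hook)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online[cpu] = true;
	if (with_hook)
		model_init_hyp(cpu);
}

/* Count online CPUs on which Hyp init never ran. */
static int count_uninitialized_online(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu] && !hyp_ready[cpu])
			n++;
	return n;
}
```

In this model, a CPU brought up without the hook stays online but
uninitialized, which is the case a hotplug callback would have to cover.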

-Christoffer