[Android-virt] [PATCH] ARM: KVM: Monitor-Hypervisor API

cdall at cs.columbia.edu (Christoffer Dall) · Wed, 11 May 2011 14:54:20 +0200

On Tue, May 10, 2011 at 3:29 PM, Catalin Marinas
<catalin.marinas at arm.com> wrote:
> Hi Christoffer,
>
> Just some thoughts below, I haven't done a full review. But I like the
> simple SMC API for just setting up the HVBAR and than handling it in the
> KVM start-up code.

Thanks, it's much appreciated. See my responses below.

>
> On Sun, 2011-05-08 at 15:00 +0100, Christoffer Dall wrote:
>> --- /dev/null
>> +++ b/arch/arm/include/asm/kvm.h
> ...
>> +/*
>> + * Change the current processor mode to Hyp mode.
>> + * You should never return to user space or enable interrupts before calling
>> + * kvm_arm_hyp_return.
>> + */
>> +static inline void kvm_arm_hyp_mode(void)
>> +{
>> + ? ? ? __asm__ (
>> + ? ? ? ? ? ? ? "push ? {lr}\n\t"
>> + ? ? ? ? ? ? ? "hvc ? ?#0\n\t"
>> + ? ? ? ? ? ? ? "pop ? ?{lr}\n\t");
>> +}
>> +
>> +/*
>> + * Return from Hyp mode to Svc mode.
>> + */
>> +static inline void kvm_arm_hyp_return(void)
>> +{
>> + ? ? ? __asm__ (
>> + ? ? ? ? ? ? ? "push ? {lr}\n\t"
>> + ? ? ? ? ? ? ? "hvc ? ?#0\n\t"
>> + ? ? ? ? ? ? ? "pop ? ?{lr}\n\t");
>> +}
>
> I don't fully understand what these functions do. Do you switch the
> current kernel code to the Hyp mode?

Yes, that was the thought. But they were experimental and should
probably not have been a part of this patch where the focus was the
init code. The idea, in short, was to be able to do things like this
in KVM code:

kvm_arm_hyp_mode();
asm("mrc     ...... " : ...); // Some Hyp state
kvm_arm_hyp_return();

But, I think it will be just as fast to capture/set the necessary
state on world-switches.

Anyway, I removed them for now, and I will send an updated patch soon.

>
> I also don't know why you push/pop LR. In theory, the Hyp code handling
> the HVC should preserve it. The Hyp mode using the user LR rather than
> the SVC LR anyway.
>
well, since it returned in Hyp mode, I would have had to copy the svc
lr to the hyp (usr) lr in the hvc handler, which I left up for the
caller. Anyway, it's gone for now.

>> --- /dev/null
>> +++ b/arch/arm/kvm/arm.c
> ...
>> +int kvm_arch_hardware_enable(void *garbage)
>> +{
>> + ? ? ? unsigned long vector_ptr, hyp_stack_ptr;
>> + ? ? ? unsigned long init_ptr, init_end_ptr, phys_addr;
>> + ? ? ? phys_addr_t init_phys_addr;
>> + ? ? ? u64 pfn;
>> + ? ? ? pgprot_t prot;
>> +
>> + ? ? ? pgd_t *pgd;
>> + ? ? ? pud_t *pud;
>> + ? ? ? pmd_t *pmd;
>> + ? ? ? pte_t *pte;
>> +
>> + ? ? ? if (kvm_arm_hardware_enabled)
>> + ? ? ? ? ? ? ? return 0;
>> +
>> + ? ? ? /*
>> + ? ? ? ?* Allocate stack page for Hypervisor-mode
>> + ? ? ? ?*/
>> + ? ? ? kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL);
>> + ? ? ? if (!kvm_arm_hyp_stack_page)
>> + ? ? ? ? ? ? ? return -ENOMEM;
>> +
>> + ? ? ? init_ptr = (unsigned long)&__kvm_hyp_init;
>> + ? ? ? init_end_ptr = (unsigned long)&__kvm_hyp_init_end;
>> + ? ? ? init_phys_addr = virt_to_phys((void *)&__kvm_hyp_init);
>> + ? ? ? if (init_phys_addr > (unsigned long long)~0) {
>
> virt_to_phys always returns addresses within the 32-bit range (lowmem).
>
OK.

That leads me to another question. For the stage-2 translation, should
we start at level 2 and only support 32-bit addressing for the stage-2
translation or do you think it's worth supporint 40-bit physical
addresses in all three levels from the beginning? I would prefer to
avoid supporting both options for now, unless there's a really strong
performance and use case argument?

>> + ? ? ? ? ? ? ? kvm_err(-ENOTSUPP, "Hyp init physical address must be 32-bit\n");
>> + ? ? ? ? ? ? ? return -ENOTSUPP;
>> + ? ? ? }
>> + ? ? ? phys_addr = (unsigned long)init_phys_addr;
>> +
>> + ? ? ? if (init_end_ptr - init_ptr > PAGE_SIZE) {
>> + ? ? ? ? ? ? ? kvm_err(-ENOTSUPP, "KVM init code may not exceed 1 page\n");
>> + ? ? ? ? ? ? ? return -ENOTSUPP;
>> + ? ? ? }
>
> If you use sections (PMD) you can go to 2MB.
>
>> + ? ? ? pgd = pgd_offset_k(phys_addr);
>
> Are you trying to create an identity mapping? I would suggest you use
> identity_mapping_add() directly.

Yes, I am trying that. And that sounds like a good idea.
>
> Are the page mappings here intended to be used by the Hyp code? Do you
> need CONFIG_ARM_LPAE enabled?
>
Yes, intended to be used by Hyp code. CONFIG_KVM relies on CONFIG_ARM_LPAE...

>> + ? ? ? pud = pud_alloc(&init_mm, pgd, phys_addr);
>> + ? ? ? if (!pud)
>> + ? ? ? ? ? ? ? return -ENOMEM;
>> + ? ? ? pmd = pmd_alloc(&init_mm, pud, phys_addr);
>> + ? ? ? if (!pmd)
>> + ? ? ? ? ? ? ? return -ENOMEM;
>> + ? ? ? pte = pte_alloc_kernel(pmd, phys_addr);
>> + ? ? ? if (!pte)
>> + ? ? ? ? ? ? ? return -ENOMEM;
>> + ? ? ? BUG_ON(!pte_none(*pte));
>
> The error paths here should free the previous allocations in reverse
> order. That's why I think using identity_mapping_(add|del) is easier.
>

That's true, stupid me.
But since we're going to use the identity_mapping, it will go away.

>> + ? ? ? asm volatile (
>> + ? ? ? ? ? ? ? "mov ? ?r0, %[vector_ptr]\n\t"
>> + ? ? ? ? ? ? ? "ldr ? ?r7, =SMCHYP_HVBAR_W\n\t"
>> + ? ? ? ? ? ? ? "smc ? ?#0\n\t" :
>> + ? ? ? ? ? ? ? : [vector_ptr] "r" ((unsigned long)init_phys_addr)
>> + ? ? ? ? ? ? ? : "r0", "r7", "r12");
>
> Why do you mark r12 as clobbered? It doesn't seem to be used inside the
> asm statement (and you can't guarantee that vector_ptr will be allocated
> in r12 by the compiler).
>

I was avoiding any use of stack space in the monitor code (and the
monitor call clobbered r12) for whatever reason. I changed that now to
callee-save using a small stack space in the monitor.S code.

>> + ? ? ? /*
>> + ? ? ? ?* Unmap the identity mapping
>> + ? ? ? ?*/
>> + ? ? ? pmd_clear(pmd_off_k(phys_addr));
>
> Memory leak. The above allocations should be freed.
>

Ack. Will be done with identity_mapping_del.

>> --- /dev/null
>> +++ b/arch/arm/kvm/arm_interrupts.S
>> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> +@ ?Hypervisor initialization
>> +@ ? ?- should be called with:
>> +@ ? ? ? ?r0 = top of Hyp stack (VA)
>> +@ ? ? ? ?r1 = virtual HVBAR address
>> +@ ? ?- caller must preserve r0, r1, r12
>> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> + ? ? ? .text
>> + ? ? ? .align 5
>> +__kvm_hyp_init:
>> + ? ? ? .globl __kvm_hyp_init
>> +
>> + ? ? ? @ Hyp-mode exception vector
>> + ? ? ? nop
>> + ? ? ? nop
>> + ? ? ? nop
>> + ? ? ? nop
>> + ? ? ? nop
>> + ? ? ? b ? ? ? __do_hyp_init
>> + ? ? ? nop
>> + ? ? ? nop
>> +
>> +__do_hyp_init:
>> + ? ? ? @ Copy the Hyp stack pointer
>> + ? ? ? mov ? ? sp, r0
>> + ? ? ? mov ? ? lr, r1
>> +
>> + ? ? ? @ Set the HTTBR to be the same as the TTBR1 holding the kernel
>> + ? ? ? @ level-1 page table
>> + ? ? ? mrrc ? ?p15, 1, r0, r1, c2
>> + ? ? ? mcrr ? ?p15, 4, r0, r1, c2
>> +
>> + ? ? ? @ Set the HTCR to the same shareability and cacheability settings as the
>> + ? ? ? @ non-secure TTBCR and with T0SZ == 0.
>> + ? ? ? mrc ? ? p15, 4, r0, c2, c0, 2
>> + ? ? ? ldr ? ? r12, =HTCR_MASK
>> + ? ? ? bic ? ? r0, r0, r12
>> + ? ? ? mrc ? ? p15, 0, r1, c2, c0, 2
>> + ? ? ? and ? ? r1, r1, #(HTCR_MASK & ~TTBCR_T0SZ)
>> + ? ? ? orr ? ? r0, r0, r1
>> + ? ? ? mcr ? ? p15, 4, r0, c2, c0, 2
>> +
>> + ? ? ? @ Use the same memory attributes for hyp. accesses as the kernel
>> + ? ? ? @ (copy MAIRx ro HMAIRx).
>> + ? ? ? mrc ? ? p15, 0, r0, c10, c2, 0
>> + ? ? ? mcr ? ? p15, 4, r0, c10, c2, 0
>> + ? ? ? mrc ? ? p15, 0, r0, c10, c2, 1
>> + ? ? ? mcr ? ? p15, 4, r0, c10, c2, 1
>> +
>> + ? ? ? @ Set the HSCTLR to:
>> + ? ? ? @ ?- ARM/THUMB exceptions: Kernel config
>> + ? ? ? @ ?- Endianness: Kernel config
>> + ? ? ? @ ?- Fast Interrupt Features: Kernel config
>> + ? ? ? @ ?- Write permission implies XN: disabled
>> + ? ? ? @ ?- Instruction cache: enabled
>> + ? ? ? @ ?- Data/Unified cache: enabled
>> + ? ? ? @ ?- Memory alignment checks: enabled
>> + ? ? ? @ ?- MMU: enabled (this code must be run from an identity mapping)
>> + ? ? ? mrc ? ? p15, 4, r0, c1, c0, 0
>> + ? ? ? ldr ? ? r12, =HSCTLR_MASK
>> + ? ? ? bic ? ? r0, r0, r12
>> + ? ? ? mrc ? ? p15, 0, r1, c1, c0, 0
>> + ? ? ? ldr ? ? r12, =(HSCTLR_TE | HSCTLR_EE | HSCTLR_FI)
>> + ? ? ? and ? ? r1, r1, r12
>> + ? ? ? ldr ? ? r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I)
>> + ? ? ? orr ? ? r1, r1, r12
>> + ? ? ? orr ? ? r0, r0, r1
>> + ? ? ? mcr ? ? p15, 4, r0, c1, c0, 0
>
> You need an ISB before the above MCR to make sure that the above
> register setting took place before enabling the Hyp MMU. You also need
> an ISB after so that you know that the MMU has been enabled (for such
> barriers, I think the model has an option -
> cluster.delayed_CP15_operations - that can be set to 1).
>

Thanks!
(Although I couldn't find such an option on the model I have. The
purpose would be to discover potential bugs right?)

>> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> +@ ?Hypervisor exception vector and handlers
>> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> +.macro hyp_entry
>> + ? ? ? push ? ?{r0, r1, r2}
>> + ? ? ? mrc ? ? p15, 4, r0, c5, c2, 0 ? @HSR
>> + ? ? ? lsr ? ? r1, r0, #26
>> + ? ? ? cmp ? ? r1, #0x12 ? ? ? ? ? ? ? @HVC
>> + ? ? ? bne ? ? 1f
>> +
>> + ? ? ? @ Let's check if the HVC came from VMID 0 and allow simple
>> + ? ? ? @ switch to Hyp mode
>> + ? ? ? mrrc ? ?p15, 6, r1, r2, c2
>> + ? ? ? lsr ? ? r2, r2, #16
>> + ? ? ? and ? ? r2, r2, #0xff
>> + ? ? ? cmp ? ? r2, #0
>> + ? ? ? bne ? ? 1f ? ? ? ? ? ? ? ? ? ? ?@ VMID != 0
>> +.endm
>> +
>> +.macro hyp_return
>> + ? ? ? eret
>> +.endm
>
> Do you not need to save/restore more registers in these macros?

I restored them when the macro was used, but the macros are gone now
anyway - this code was also experimental and I basically just
implemented it to test that the KVM init stuff was working.

For the updated changes, I made several commits to address your
concerns and restructured much of the code. Would you like to see a
new full patch for KVM or specific changes?

Thanks,
Christoffer