On Tue, May 10, 2011 at 3:29 PM, Catalin Marinas <catalin.marinas at arm.com> wrote: > Hi Christoffer, > > Just some thoughts below, I haven't done a full review. But I like the > simple SMC API for just setting up the HVBAR and than handling it in the > KVM start-up code. Thanks, it's much appreciated. See my responses below. > > On Sun, 2011-05-08 at 15:00 +0100, Christoffer Dall wrote: >> --- /dev/null >> +++ b/arch/arm/include/asm/kvm.h > ... >> +/* >> + * Change the current processor mode to Hyp mode. >> + * You should never return to user space or enable interrupts before calling >> + * kvm_arm_hyp_return. >> + */ >> +static inline void kvm_arm_hyp_mode(void) >> +{ >> + ? ? ? __asm__ ( >> + ? ? ? ? ? ? ? "push ? {lr}\n\t" >> + ? ? ? ? ? ? ? "hvc ? ?#0\n\t" >> + ? ? ? ? ? ? ? "pop ? ?{lr}\n\t"); >> +} >> + >> +/* >> + * Return from Hyp mode to Svc mode. >> + */ >> +static inline void kvm_arm_hyp_return(void) >> +{ >> + ? ? ? __asm__ ( >> + ? ? ? ? ? ? ? "push ? {lr}\n\t" >> + ? ? ? ? ? ? ? "hvc ? ?#0\n\t" >> + ? ? ? ? ? ? ? "pop ? ?{lr}\n\t"); >> +} > > I don't fully understand what these functions do. Do you switch the > current kernel code to the Hyp mode? Yes, that was the thought. But they were experimental and should probably not have been a part of this patch where the focus was the init code. The idea, in short, was to be able to do things like this in KVM code: kvm_arm_hyp_mode(); asm("mrc ...... " : ...); // Some Hyp state kvm_arm_hyp_return(); But, I think it will be just as fast to capture/set the necessary state on world-switches. Anyway, I removed them for now, and I will send an updated patch soon. > > I also don't know why you push/pop LR. In theory, the Hyp code handling > the HVC should preserve it. The Hyp mode using the user LR rather than > the SVC LR anyway. > well, since it returned in Hyp mode, I would have had to copy the svc lr to the hyp (usr) lr in the hvc handler, which I left up for the caller. Anyway, it's gone for now. >> --- /dev/null >> +++ b/arch/arm/kvm/arm.c > ... >> +int kvm_arch_hardware_enable(void *garbage) >> +{ >> + ? ? ? unsigned long vector_ptr, hyp_stack_ptr; >> + ? ? ? unsigned long init_ptr, init_end_ptr, phys_addr; >> + ? ? ? phys_addr_t init_phys_addr; >> + ? ? ? u64 pfn; >> + ? ? ? pgprot_t prot; >> + >> + ? ? ? pgd_t *pgd; >> + ? ? ? pud_t *pud; >> + ? ? ? pmd_t *pmd; >> + ? ? ? pte_t *pte; >> + >> + ? ? ? if (kvm_arm_hardware_enabled) >> + ? ? ? ? ? ? ? return 0; >> + >> + ? ? ? /* >> + ? ? ? ?* Allocate stack page for Hypervisor-mode >> + ? ? ? ?*/ >> + ? ? ? kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL); >> + ? ? ? if (!kvm_arm_hyp_stack_page) >> + ? ? ? ? ? ? ? return -ENOMEM; >> + >> + ? ? ? init_ptr = (unsigned long)&__kvm_hyp_init; >> + ? ? ? init_end_ptr = (unsigned long)&__kvm_hyp_init_end; >> + ? ? ? init_phys_addr = virt_to_phys((void *)&__kvm_hyp_init); >> + ? ? ? if (init_phys_addr > (unsigned long long)~0) { > > virt_to_phys always returns addresses within the 32-bit range (lowmem). > OK. That leads me to another question. For the stage-2 translation, should we start at level 2 and only support 32-bit addressing for the stage-2 translation or do you think it's worth supporint 40-bit physical addresses in all three levels from the beginning? I would prefer to avoid supporting both options for now, unless there's a really strong performance and use case argument? >> + ? ? ? ? ? ? ? kvm_err(-ENOTSUPP, "Hyp init physical address must be 32-bit\n"); >> + ? ? ? ? ? ? ? return -ENOTSUPP; >> + ? ? ? } >> + ? ? ? phys_addr = (unsigned long)init_phys_addr; >> + >> + ? ? ? if (init_end_ptr - init_ptr > PAGE_SIZE) { >> + ? ? ? ? ? ? ? kvm_err(-ENOTSUPP, "KVM init code may not exceed 1 page\n"); >> + ? ? ? ? ? ? ? return -ENOTSUPP; >> + ? ? ? } > > If you use sections (PMD) you can go to 2MB. > >> + ? ? ? pgd = pgd_offset_k(phys_addr); > > Are you trying to create an identity mapping? I would suggest you use > identity_mapping_add() directly. Yes, I am trying that. And that sounds like a good idea. > > Are the page mappings here intended to be used by the Hyp code? Do you > need CONFIG_ARM_LPAE enabled? > Yes, intended to be used by Hyp code. CONFIG_KVM relies on CONFIG_ARM_LPAE... >> + ? ? ? pud = pud_alloc(&init_mm, pgd, phys_addr); >> + ? ? ? if (!pud) >> + ? ? ? ? ? ? ? return -ENOMEM; >> + ? ? ? pmd = pmd_alloc(&init_mm, pud, phys_addr); >> + ? ? ? if (!pmd) >> + ? ? ? ? ? ? ? return -ENOMEM; >> + ? ? ? pte = pte_alloc_kernel(pmd, phys_addr); >> + ? ? ? if (!pte) >> + ? ? ? ? ? ? ? return -ENOMEM; >> + ? ? ? BUG_ON(!pte_none(*pte)); > > The error paths here should free the previous allocations in reverse > order. That's why I think using identity_mapping_(add|del) is easier. > That's true, stupid me. But since we're going to use the identity_mapping, it will go away. >> + ? ? ? asm volatile ( >> + ? ? ? ? ? ? ? "mov ? ?r0, %[vector_ptr]\n\t" >> + ? ? ? ? ? ? ? "ldr ? ?r7, =SMCHYP_HVBAR_W\n\t" >> + ? ? ? ? ? ? ? "smc ? ?#0\n\t" : >> + ? ? ? ? ? ? ? : [vector_ptr] "r" ((unsigned long)init_phys_addr) >> + ? ? ? ? ? ? ? : "r0", "r7", "r12"); > > Why do you mark r12 as clobbered? It doesn't seem to be used inside the > asm statement (and you can't guarantee that vector_ptr will be allocated > in r12 by the compiler). > I was avoiding any use of stack space in the monitor code (and the monitor call clobbered r12) for whatever reason. I changed that now to callee-save using a small stack space in the monitor.S code. >> + ? ? ? /* >> + ? ? ? ?* Unmap the identity mapping >> + ? ? ? ?*/ >> + ? ? ? pmd_clear(pmd_off_k(phys_addr)); > > Memory leak. The above allocations should be freed. > Ack. Will be done with identity_mapping_del. >> --- /dev/null >> +++ b/arch/arm/kvm/arm_interrupts.S >> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ >> +@ ?Hypervisor initialization >> +@ ? ?- should be called with: >> +@ ? ? ? ?r0 = top of Hyp stack (VA) >> +@ ? ? ? ?r1 = virtual HVBAR address >> +@ ? ?- caller must preserve r0, r1, r12 >> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ >> + ? ? ? .text >> + ? ? ? .align 5 >> +__kvm_hyp_init: >> + ? ? ? .globl __kvm_hyp_init >> + >> + ? ? ? @ Hyp-mode exception vector >> + ? ? ? nop >> + ? ? ? nop >> + ? ? ? nop >> + ? ? ? nop >> + ? ? ? nop >> + ? ? ? b ? ? ? __do_hyp_init >> + ? ? ? nop >> + ? ? ? nop >> + >> +__do_hyp_init: >> + ? ? ? @ Copy the Hyp stack pointer >> + ? ? ? mov ? ? sp, r0 >> + ? ? ? mov ? ? lr, r1 >> + >> + ? ? ? @ Set the HTTBR to be the same as the TTBR1 holding the kernel >> + ? ? ? @ level-1 page table >> + ? ? ? mrrc ? ?p15, 1, r0, r1, c2 >> + ? ? ? mcrr ? ?p15, 4, r0, r1, c2 >> + >> + ? ? ? @ Set the HTCR to the same shareability and cacheability settings as the >> + ? ? ? @ non-secure TTBCR and with T0SZ == 0. >> + ? ? ? mrc ? ? p15, 4, r0, c2, c0, 2 >> + ? ? ? ldr ? ? r12, =HTCR_MASK >> + ? ? ? bic ? ? r0, r0, r12 >> + ? ? ? mrc ? ? p15, 0, r1, c2, c0, 2 >> + ? ? ? and ? ? r1, r1, #(HTCR_MASK & ~TTBCR_T0SZ) >> + ? ? ? orr ? ? r0, r0, r1 >> + ? ? ? mcr ? ? p15, 4, r0, c2, c0, 2 >> + >> + ? ? ? @ Use the same memory attributes for hyp. accesses as the kernel >> + ? ? ? @ (copy MAIRx ro HMAIRx). >> + ? ? ? mrc ? ? p15, 0, r0, c10, c2, 0 >> + ? ? ? mcr ? ? p15, 4, r0, c10, c2, 0 >> + ? ? ? mrc ? ? p15, 0, r0, c10, c2, 1 >> + ? ? ? mcr ? ? p15, 4, r0, c10, c2, 1 >> + >> + ? ? ? @ Set the HSCTLR to: >> + ? ? ? @ ?- ARM/THUMB exceptions: Kernel config >> + ? ? ? @ ?- Endianness: Kernel config >> + ? ? ? @ ?- Fast Interrupt Features: Kernel config >> + ? ? ? @ ?- Write permission implies XN: disabled >> + ? ? ? @ ?- Instruction cache: enabled >> + ? ? ? @ ?- Data/Unified cache: enabled >> + ? ? ? @ ?- Memory alignment checks: enabled >> + ? ? ? @ ?- MMU: enabled (this code must be run from an identity mapping) >> + ? ? ? mrc ? ? p15, 4, r0, c1, c0, 0 >> + ? ? ? ldr ? ? r12, =HSCTLR_MASK >> + ? ? ? bic ? ? r0, r0, r12 >> + ? ? ? mrc ? ? p15, 0, r1, c1, c0, 0 >> + ? ? ? ldr ? ? r12, =(HSCTLR_TE | HSCTLR_EE | HSCTLR_FI) >> + ? ? ? and ? ? r1, r1, r12 >> + ? ? ? ldr ? ? r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I) >> + ? ? ? orr ? ? r1, r1, r12 >> + ? ? ? orr ? ? r0, r0, r1 >> + ? ? ? mcr ? ? p15, 4, r0, c1, c0, 0 > > You need an ISB before the above MCR to make sure that the above > register setting took place before enabling the Hyp MMU. You also need > an ISB after so that you know that the MMU has been enabled (for such > barriers, I think the model has an option - > cluster.delayed_CP15_operations - that can be set to 1). > Thanks! (Although I couldn't find such an option on the model I have. The purpose would be to discover potential bugs right?) >> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ >> +@ ?Hypervisor exception vector and handlers >> +@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ >> +.macro hyp_entry >> + ? ? ? push ? ?{r0, r1, r2} >> + ? ? ? mrc ? ? p15, 4, r0, c5, c2, 0 ? @HSR >> + ? ? ? lsr ? ? r1, r0, #26 >> + ? ? ? cmp ? ? r1, #0x12 ? ? ? ? ? ? ? @HVC >> + ? ? ? bne ? ? 1f >> + >> + ? ? ? @ Let's check if the HVC came from VMID 0 and allow simple >> + ? ? ? @ switch to Hyp mode >> + ? ? ? mrrc ? ?p15, 6, r1, r2, c2 >> + ? ? ? lsr ? ? r2, r2, #16 >> + ? ? ? and ? ? r2, r2, #0xff >> + ? ? ? cmp ? ? r2, #0 >> + ? ? ? bne ? ? 1f ? ? ? ? ? ? ? ? ? ? ?@ VMID != 0 >> +.endm >> + >> +.macro hyp_return >> + ? ? ? eret >> +.endm > > Do you not need to save/restore more registers in these macros? I restored them when the macro was used, but the macros are gone now anyway - this code was also experimental and I basically just implemented it to test that the KVM init stuff was working. For the updated changes, I made several commits to address your concerns and restructured much of the code. Would you like to see a new full patch for KVM or specific changes? Thanks, Christoffer