On 12/11/12 22:55, Christoffer Dall wrote: > On Mon, Nov 12, 2012 at 12:02 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: >> This patch fixes a very long standing bug in the KVM fault handling and >> is basically a backport of the equivalent arm64 code. >> >> The ARM ARM is quite explicit in saying the HPFAR cannot be trusted >> for a permission fault, unless it is a Stage-1 page table walk. >> >> The fact that it *seems* to work well enough on A15 is even more scary. >> We may well be silently corrupting memory without even noticing (hint, >> hint...). It is also quite possible that this bug would prevent KVM >> from running on a CPU which caches TLBs covering VA to PA with a single >> entry. >> >> The fix is to detect aborts that are permission faults, and not stage-1 >> page table walk. We can then resolve the IPA, and store it just like the >> HPFAR register. If the IPA cannot be resolved, it means another CPU is >> playing with the page tables, and we simply restart the guest. >> >> This also paves the way for lighter Icache invalidation, using the NX bit. > > My main concern here is the complicated return code and this assembly > code is not too easy to read and it duplicates the code in > __kvm_va_to_pa. This is supposed to be a rare case, because if not, > we're merging the wrong pages anyway, and the system performance will > suck. This is not exactly rare. There's a number of cases were we fault a page by reading from it, and then writing to it. A lot of static data in the kernel causes this, actually. > Wouldn't it be nicer to check the fault type in C-code and call the > __kvm_va_to_pa ? I don't find this nicer at all. Yes, it reuses a tiny bit of code, but the code path becomes very convoluted (fault, save guest, restore host, "oh wait, this was a permission fault", trap, save host, restore guest, resolve IPA, restore host, back to square one...) and a lot more expensive. Even if it wasn't performance critical, the above dance makes me think we'd be doing something wrong. >> >> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx> >> --- >> >From v1: >> - Use ATS1CPR instead of ATS1CPW >> >> arch/arm/kvm/interrupts.S | 37 ++++++++++++++++++++++++++++++++++++- >> 1 file changed, 36 insertions(+), 1 deletion(-) >> >> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S >> index 5a09e89..b2d52a9 100644 >> --- a/arch/arm/kvm/interrupts.S >> +++ b/arch/arm/kvm/interrupts.S >> @@ -401,12 +401,47 @@ guest_trap: >> mrc p15, 4, r2, c6, c0, 0 @ HDFAR >> >> 2: str r2, [r1, #VCPU_HxFAR] >> - mrc p15, 4, r2, c6, c0, 4 @ HPFAR >> + >> + /* >> + * B3.13.5 Reporting exceptions taken to the Non-secure PL2 mode: >> + * >> + * Abort on the stage 2 translation for a memory access from a >> + * Non-secure PL1 or PL0 mode: >> + * >> + * For any Access flag fault or Translation fault, and also for any >> + * Permission fault on the stage 2 translation of a memory access >> + * made as part of a translation table walk for a stage 1 translation, >> + * the HPFAR holds the IPA that caused the fault. Otherwise, the HPFAR >> + * is UNKNOWN. >> + */ >> + >> + /* Check for permission fault, and S1PTW */ >> + mrc p15, 4, r1, c5, c2, 0 @ HSR >> + and r0, r1, #HSR_FSC_TYPE >> + cmp r0, #FSC_PERM >> + tsteq r1, #7 @ S1PTW >> + mrcne p15, 4, r2, c6, c0, 4 @ HPFAR >> + bne 3f >> + >> + /* Resolve IPA using the xFAR */ >> + mcr p15, 0, r2, c7, c8, 0 @ ATS1CPR > > isn't there a race where guest user mode code writes some address > which it can write, then we take the stage-2 fault, then another CPU can't? > changes the VA mapping to be accessible only in PL1, but user space > code now gets a mapping to that memory which goes in the TLB and stays > even after we return from handling the exception? It really doesn't matter. We do not grant extra permission to the pages with Stage-2. When we resume the guest after handling the permission fault (granting write access), it will evaluate the translation again, and fault in the guest this time. M. -- Jazz is not dead. It just smells funny... _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm