On Mon, Nov 12, 2012 at 12:02 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> This patch fixes a very long standing bug in the KVM fault handling and
> is basically a backport of the equivalent arm64 code.
>
> The ARM ARM is quite explicit in saying the HPFAR cannot be trusted
> for a permission fault, unless it is a Stage-1 page table walk.
>
> The fact that it *seems* to work well enough on A15 is even more scary.
> We may well be silently corrupting memory without even noticing (hint,
> hint...). It is also quite possible that this bug would prevent KVM
> from running on a CPU which caches TLBs covering VA to PA with a single
> entry.
>
> The fix is to detect aborts that are permission faults, and not stage-1
> page table walk. We can then resolve the IPA, and store it just like the
> HPFAR register. If the IPA cannot be resolved, it means another CPU is
> playing with the page tables, and we simply restart the guest.
>
> This also paves the way for lighter Icache invalidation, using the NX bit.

My main concern here is the complicated return path: this assembly is not
easy to read, and it duplicates the code in __kvm_va_to_pa. This is
supposed to be a rare case anyway -- if it isn't, we're merging the wrong
pages and system performance will suck. Wouldn't it be nicer to check the
fault type in C code and call __kvm_va_to_pa from there?
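Roughly something like this in the C exit path (just a sketch; the helper
name is made up, and the FSC/S1PTW encodings are my reading of the ARM ARM,
so double-check them):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative encodings: the fault status type mask and the
 * permission-fault type value from the HSR, plus the S1PTW bit
 * (HSR bit 7). */
#define HSR_FSC_TYPE	0x3cU
#define FSC_PERM	0x0cU
#define HSR_S1PTW	(1U << 7)

/*
 * HPFAR is architecturally valid for Access flag and Translation
 * faults, and for a Permission fault taken on a stage-1 table walk.
 * For any other permission fault it is UNKNOWN, and we would have to
 * resolve the IPA ourselves (e.g. via __kvm_va_to_pa).
 */
static bool hpfar_is_valid(uint32_t hsr)
{
	uint32_t fsc_type = hsr & HSR_FSC_TYPE;

	if (fsc_type != FSC_PERM)
		return true;		/* Access flag or Translation fault */

	return (hsr & HSR_S1PTW) != 0;	/* perm fault on a S1 table walk */
}
```

That keeps the hot path (valid HPFAR) trivial, and the rare UNKNOWN case
drops into readable C instead of duplicated assembly.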
>
> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx>
> ---
> From v1:
> - Use ATS1CPR instead of ATS1CPW
>
>  arch/arm/kvm/interrupts.S | 37 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 5a09e89..b2d52a9 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -401,12 +401,47 @@ guest_trap:
>  	mrc	p15, 4, r2, c6, c0, 0	@ HDFAR
>
>  2:	str	r2, [r1, #VCPU_HxFAR]
> -	mrc	p15, 4, r2, c6, c0, 4	@ HPFAR
> +
> +	/*
> +	 * B3.13.5 Reporting exceptions taken to the Non-secure PL2 mode:
> +	 *
> +	 * Abort on the stage 2 translation for a memory access from a
> +	 * Non-secure PL1 or PL0 mode:
> +	 *
> +	 * For any Access flag fault or Translation fault, and also for any
> +	 * Permission fault on the stage 2 translation of a memory access
> +	 * made as part of a translation table walk for a stage 1 translation,
> +	 * the HPFAR holds the IPA that caused the fault. Otherwise, the HPFAR
> +	 * is UNKNOWN.
> +	 */
> +
> +	/* Check for permission fault, and S1PTW */
> +	mrc	p15, 4, r1, c5, c2, 0	@ HSR
> +	and	r0, r1, #HSR_FSC_TYPE
> +	cmp	r0, #FSC_PERM
> +	tsteq	r1, #7			@ S1PTW
> +	mrcne	p15, 4, r2, c6, c0, 4	@ HPFAR
> +	bne	3f
> +
> +	/* Resolve IPA using the xFAR */
> +	mcr	p15, 0, r2, c7, c8, 0	@ ATS1CPR

Isn't there a race here? Guest user mode writes to an address it is
permitted to write, we take the stage-2 fault, and then another CPU
changes the VA mapping so it is accessible only from PL1. Doesn't the
ATS1CPR here then give user space a mapping to that memory which goes
into the TLB and stays there even after we return from handling the
exception?
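As an aside, if this moves to C, the repacking of a successful 64-bit PAR
into the HPFAR layout that follows (the ubfx/lsl/orr sequence) is just bit
shuffling: HPFAR holds IPA[39:12] in its bits [31:4]. A sketch, with an
illustrative function name:

```c
#include <stdint.h>

/*
 * Repack a successful long-descriptor PAR into the HPFAR layout:
 * PA[31:12] (low word) goes to bits [23:4], PA[39:32] (low byte of
 * the high word) goes to bits [31:24]. Mirrors the patch's
 * ubfx/lsl/orr sequence; the function name is made up.
 */
static uint32_t par_to_hpfar(uint32_t par_lo, uint32_t par_hi)
{
	uint32_t hpfar;

	hpfar  = ((par_lo >> 12) & 0xfffffU) << 4;	/* PA[31:12] */
	hpfar |= (par_hi & 0xffU) << 24;		/* PA[39:32] */

	return hpfar;
}
```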
> +	isb
> +	mrrc	p15, 0, r0, r1, c7	@ PAR
> +	tst	r0, #1
> +	bne	4f			@ Failed translation
> +	ubfx	r2, r0, #12, #20
> +	lsl	r2, r2, #4
> +	orr	r2, r2, r1, lsl #24
> +
> +3:	load_vcpu	r1
>  	str	r2, [r1, #VCPU_HPFAR]
>
>  1:	mov	r0, #ARM_EXCEPTION_HVC
>  	b	__kvm_vcpu_return
>
> +4:	pop	{r0, r1, r2}		@ Failed translation, return to guest
> +	eret
> +
>  /*
>   * If VFPv3 support is not available, then we will not switch the VFP
>   * registers; however cp10 and cp11 accesses will still trap and fallback
> --
> 1.7.12

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm