On Wed, Sep 09, 2020 at 10:05:27PM +0100, Marc Zyngier wrote: > KVM currently assumes that an instruction abort can never be a write. > This is in general true, except when the abort is triggered by > a S1PTW on instruction fetch that tries to update the S1 page tables > (to set AF, for example). > > This can happen if the page tables have been paged out and brought > back in without seeing a direct write to them (they are thus marked > read only), and the fault handling code will make the PT executable(!) > instead of writable. The guest gets stuck forever. > > In these conditions, the permission fault must be considered as > a write so that the Stage-1 update can take place. This is essentially > the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM: > Take S1 walks into account when determining S2 write faults"). > > Update both kvm_is_write_fault() to return true on IABT+S1PTW, as well > as kvm_vcpu_trap_is_iabt() to return false in the same conditions. > > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > --- > This could do with some cleanup (kvm_vcpu_dabt_iss1tw has nothing to do > with data aborts), but I've chosen to keep the patch simple in order to > ease backporting. > > arch/arm64/include/asm/kvm_emulate.h | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > index d21676409a24..33d7e16edaa3 100644 > --- a/arch/arm64/include/asm/kvm_emulate.h > +++ b/arch/arm64/include/asm/kvm_emulate.h > @@ -480,7 +480,8 @@ static __always_inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu) > > static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu) > { > - return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW; > + return (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW && > + !kvm_vcpu_dabt_iss1tw(vcpu)); > } > > static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu) > @@ -520,6 +521,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) > > static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu) > { > + if (kvm_vcpu_dabt_iss1tw(vcpu)) > + return true; > + Hmm, I'm a bit uneasy about the interaction of this with kvm_handle_guest_abort() if we take an S1PTW fault on instruction fetch with our page-tables sitting in a read-only memslot. In this case, I think we'll end up injecting a data abort into the guest instead of an instruction abort. It hurts my brain thinking about it though. Overall, I'd be inclined to: 1. Rename kvm_vcpu_dabt_iss1tw() to kvm_vcpu_abt_iss1tw() 2. Introduce something like kvm_is_exec_fault() as: return kvm_vcpu_trap_is_iabt() && !kvm_vcpu_abt_iss1tw(); 3. Use that new function in user_mem_abort() to assign 'exec_fault' 4. Hack kvm_is_write_fault() as you have done above. Which I _think_ should work (famous last words)... The only nasty bit is that we then duplicate the kvm_vcpu_dabt_iss1tw() check in both kvm_is_write_fault() and kvm_vcpu_dabt_iswrite(). Perhaps we could remove the latter function in favour of the first? Anyway, obviously this sort of cleanup isn't for stable. Will