Re: [PATCH v4 3/3] arm64: Add workaround for Arm Cortex-A77 erratum 1508412

Rob Herring <robh@xxxxxxxxxx> · Wed, 9 Sep 2020 17:06:28 -0600

On Fri, Aug 21, 2020 at 11:51 AM Catalin Marinas
<catalin.marinas@xxxxxxx> wrote:
>
> On Fri, Aug 21, 2020 at 06:02:39PM +0100, Marc Zyngier wrote:
> > On 2020-08-21 15:05, Catalin Marinas wrote:
> > > On Fri, Aug 21, 2020 at 01:45:40PM +0100, Marc Zyngier wrote:
> > > > On 2020-08-21 13:26, Catalin Marinas wrote:
> > > > > On Fri, Aug 21, 2020 at 01:12:10PM +0100, Will Deacon wrote:
> > > > > > On Fri, Aug 21, 2020 at 01:07:00PM +0100, Catalin Marinas wrote:
> > > > > > > On Mon, Aug 03, 2020 at 01:31:27PM -0600, Rob Herring wrote:
> > > > > > > > @@ -979,6 +980,14 @@
> > > > > > > >           write_sysreg(__scs_new, sysreg);                        \
> > > > > > > >  } while (0)
> > > > > > > >
> > > > > > > > +#define read_sysreg_par() ({                                             \
> > > > > > > > + u64 par;                                                        \
> > > > > > > > + asm(ALTERNATIVE("nop", "dmb sy", ARM64_WORKAROUND_1508412));    \
> > > > > > > > + par = read_sysreg(par_el1);                                     \
> > > > > > > > + asm(ALTERNATIVE("nop", "dmb sy", ARM64_WORKAROUND_1508412));    \
> > > > > > > > + par;                                                            \
> > > > > > > > +})
> > > > > > >
> > > > > > > I was about to queue this up but one more point to clarify: can we get
> > > > > > > an interrupt at either side of the PAR_EL1 read and the handler do a
> > > > > > > device read, triggering the erratum? Do we need a DMB at exception
> > > > > > > entry/return?
> > > > > >
> > > > > > Disabling irqs around the PAR access would be simpler, I think
> > > > > > (assuming
> > > > > > this is needed).
> > > > >
> > > > > This wouldn't work if it interrupts a guest.
> > > >
> > > > If we take an interrupt either side of the PAR_EL1 read and that we
> > > > fully exit, the saving of PAR_EL1 on the way out solves the problem.
> > > >
> > > > If we don't fully exit, but instead reenter the guest immediately
> > > > (fixup_guest_exit() returns true), we'd need a DMB at that point,
> > > > at least because of the GICv2 proxying code which performs device
> > > > accesses on the guest's behalf.
> > >
> > > If you are ok with the diff below, I can fold it in:
> > >
> > > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > > b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > > index ca88ea416176..8770cf7ccd42 100644
> > > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > > @@ -420,7 +420,7 @@ static inline bool fixup_guest_exit(struct
> > > kvm_vcpu *vcpu, u64 *exit_code)
> > >     if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
> > >         kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
> > >         handle_tx2_tvm(vcpu))
> > > -           return true;
> > > +           goto guest;
> > >
> > >     /*
> > >      * We trap the first access to the FP/SIMD to save the host context
> > > @@ -430,13 +430,13 @@ static inline bool fixup_guest_exit(struct
> > > kvm_vcpu *vcpu, u64 *exit_code)
> > >      * Similarly for trapped SVE accesses.
> > >      */
> > >     if (__hyp_handle_fpsimd(vcpu))
> > > -           return true;
> > > +           goto guest;
> > >
> > >     if (__hyp_handle_ptrauth(vcpu))
> > > -           return true;
> > > +           goto guest;
> > >
> > >     if (!__populate_fault_info(vcpu))
> > > -           return true;
> > > +           goto guest;
> > >
> > >     if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
> > >             bool valid;
> > > @@ -451,7 +451,7 @@ static inline bool fixup_guest_exit(struct
> > > kvm_vcpu *vcpu, u64 *exit_code)
> > >                     int ret = __vgic_v2_perform_cpuif_access(vcpu);
> > >
> > >                     if (ret == 1)
> > > -                           return true;
> > > +                           goto guest;
> > >
> > >                     /* Promote an illegal access to an SError.*/
> > >                     if (ret == -1)
> > > @@ -467,12 +467,17 @@ static inline bool fixup_guest_exit(struct
> > > kvm_vcpu *vcpu, u64 *exit_code)
> > >             int ret = __vgic_v3_perform_cpuif_access(vcpu);
> > >
> > >             if (ret == 1)
> > > -                   return true;
> > > +                   goto guest;
> > >     }
> > >
> > >  exit:
> > >     /* Return to the host kernel and handle the exit */
> > >     return false;
> > > +
> > > +guest:
> > > +   /* Re-enter the guest */
> > > +   asm(ALTERNATIVE("nop", "dmb sy", ARM64_WORKAROUND_1508412));
> > > +   return true;
> > >  }
> > >
> > >  static inline bool __needs_ssbd_off(struct kvm_vcpu *vcpu)
> >
> > Looks good to me!
>
> Thanks Marc. Since it needs the local_irq_save() around the PAR_EL1
> access in read_sysreg_par(), I'll wait for Rob to update the patches.
> Rob also asked the hardware guys for clarification on this scenario, so
> let's see what they reply.

According to the h/w folks, an interrupt after the PAR read is not an
issue, but an interrupt doing a device read between the 1st DMB and
the PAR read would be an issue. So v5 coming your way.

Rob
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm