Re: [PULL 2/4] arm64: KVM: Only force FPEXC32_EL2.EN if trapping FPSIMD

Christoffer Dall <christoffer.dall@xxxxxxx> · Wed, 5 Sep 2018 17:03:21 +0200

On Wed, Sep 05, 2018 at 12:28:46PM +0100, Dave Martin wrote:
> On Tue, Sep 04, 2018 at 02:51:21PM +0200, Christoffer Dall wrote:
> > On Tue, Sep 04, 2018 at 12:39:54PM +0100, Dave Martin wrote:
> > > On Mon, Sep 03, 2018 at 03:45:14PM +0200, Christoffer Dall wrote:
> > > > From: Marc Zyngier <marc.zyngier@xxxxxxx>
> > > > 
> > > > If trapping FPSIMD in the context of an AArch32 guest, it is critical
> > > > to set FPEXC32_EL2.EN to 1 so that the trapping is taken to EL2 and
> > > > not EL1.
> > > > 
> > > > Conversely, it is just as critical *not* to set FPEXC32_EL2.EN to 1
> > > > if we're not going to trap FPSIMD, as we then corrupt the existing
> > > > VFP state.
> > > > 
> > > > Moving the call to __activate_traps_fpsimd32 to the point where we
> > > > know for sure that we are going to trap ensures that we don't set that
> > > > bit spuriously.
> > > 
> > > 
> > > Hmmm, this looks like a viable fix but actually the way the code is
> > > structured is a bit confusing here.
> > > 
> > > It looks like some unnecessary functionality has been inherited from
> > > arch/arm here: for a 32-bit host, FPEXC controls FPSIMD trapping even
> > > at Hyp, so we need to handle it specially in order for the Hyp FPSIMD
> > > context switching code to run without generating additional traps.
> > > 
> > > For a 64-bit host, this doesn't apply: FPEXC32_EL2 is just guest
> > > context and doesn't affect EL2 execution.
> > 
> > Hmm, is that really true?  Earlier versions of the Armv8 Arm ARM, for
> > the description of CPTR_EL2.TFP:
> > 
> >   This causes instructions that access the registers associated with
> >   Floating Point and Advanced SIMD execution to trap to EL2 when
> >   executed from EL0, EL1, or EL2, unless trapped to EL1.
> > 
> > In the current version I think you can arrive at the same conclusion by
> > looking at the FPEXC_EL2.EN description which tells you that if EN==0,
> > then the instruction is UNDEFINED, and the Synchronous exception
> > prioritization rules specify that undefined instructions are taken
> > before trapped intructions because of CPTR_EL2.
> > 
> > But there does seem to be some inconsistency with the description of
> > generally trapping floating point access to EL2.
> 
> You can try taking a look at the archietctural pseudocode -- see the
> CheckFPAdvSIMDEnabled64() (for AArch64) and CheckFPAdvSIMDEnabled() (for
> AArch32) function hierarchies on developer.arm.com.
> 
> ARM DDI 0487B.b says for FPEXC32_EL2:
> 
> "Purpose:  Allows access to the AArch32 register FPEXC from AArch64
> state only. Its value has no effect on execution in AArch64 state."
> 
> This could arguably be worded better, but the second sentence is
> pretty unequivocal.
> 

I think I was confused about your point.  I thought you were arguing
that CPTR_EL2.TFP==1 + FPEXC_EL2.EN==0, and running in AArch32 EL1+EL0,
and accessing VFP at EL0 would trap to EL2, when in fact it traps to
EL1.  But now I think you're arguing that EL2 is unaffected by
FPEXC32_EL2, and I agree with that.

However, given the above, I don't understand how we can let the guest be
in control of FPEXC, if we want to make sure we trap VFP accesses?

I'll comment on the other patch.

Thanks,
     Christoffer