On Sat, Dec 21, 2019 at 02:13:25PM +0000, Marc Zyngier wrote:
> On Fri, 20 Dec 2019 14:30:16 +0000
> Andrew Murray <andrew.murray@xxxxxxx> wrote:
>
> [somehow managed not to do a reply all, re-sending]
>
> > From: Sudeep Holla <sudeep.holla@xxxxxxx>
> >
> > Now that we can save/restore the full SPE controls, we can enable it
> > if SPE is setup and ready to use in KVM. It's supported in KVM only if
> > all the CPUs in the system supports SPE.
> >
> > However to support heterogenous systems, we need to move the check if
> > host supports SPE and do a partial save/restore.
>
> No. Let's just not go down that path. For now, KVM on heterogeneous
> systems do not get SPE. If SPE has been enabled on a guest and a CPU
> comes up without SPE, this CPU should fail to boot (same as exposing a
> feature to userspace).
>
> >
> > Signed-off-by: Sudeep Holla <sudeep.holla@xxxxxxx>
> > Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
> > ---
> > arch/arm64/kvm/hyp/debug-sr.c | 33 ++++++++++++++++-----------------
> > include/kvm/arm_spe.h | 6 ++++++
> > 2 files changed, 22 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
> > index 12429b212a3a..d8d857067e6d 100644
> > --- a/arch/arm64/kvm/hyp/debug-sr.c
> > +++ b/arch/arm64/kvm/hyp/debug-sr.c
> > @@ -86,18 +86,13 @@
> > }
> >
> > static void __hyp_text
> > -__debug_save_spe_nvhe(struct kvm_cpu_context *ctxt, bool full_ctxt)
> > +__debug_save_spe_context(struct kvm_cpu_context *ctxt, bool full_ctxt)
> > {
> > u64 reg;
> >
> > /* Clear pmscr in case of early return */
> > ctxt->sys_regs[PMSCR_EL1] = 0;
> >
> > - /* SPE present on this CPU? */
> > - if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
> > - ID_AA64DFR0_PMSVER_SHIFT))
> > - return;
> > -
> > /* Yes; is it owned by higher EL? */
> > reg = read_sysreg_s(SYS_PMBIDR_EL1);
> > if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT))
> > @@ -142,7 +137,7 @@ __debug_save_spe_nvhe(struct kvm_cpu_context *ctxt, bool full_ctxt)
> > }
> >
> > static void __hyp_text
> > -__debug_restore_spe_nvhe(struct kvm_cpu_context *ctxt, bool full_ctxt)
> > +__debug_restore_spe_context(struct kvm_cpu_context *ctxt, bool full_ctxt)
> > {
> > if (!ctxt->sys_regs[PMSCR_EL1])
> > return;
> > @@ -210,11 +205,14 @@ void __hyp_text __debug_restore_guest_context(struct kvm_vcpu *vcpu)
> > struct kvm_guest_debug_arch *host_dbg;
> > struct kvm_guest_debug_arch *guest_dbg;
> >
> > + host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
> > + guest_ctxt = &vcpu->arch.ctxt;
> > +
> > + __debug_restore_spe_context(guest_ctxt, kvm_arm_spe_v1_ready(vcpu));
> > +
> > if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
> > return;
> >
> > - host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
> > - guest_ctxt = &vcpu->arch.ctxt;
> > host_dbg = &vcpu->arch.host_debug_state.regs;
> > guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
> >
> > @@ -232,8 +230,7 @@ void __hyp_text __debug_restore_host_context(struct kvm_vcpu *vcpu)
> > host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
> > guest_ctxt = &vcpu->arch.ctxt;
> >
> > - if (!has_vhe())
> > - __debug_restore_spe_nvhe(host_ctxt, false);
> > + __debug_restore_spe_context(host_ctxt, kvm_arm_spe_v1_ready(vcpu));
>
> So you now do an unconditional save/restore on the exit path for VHE as
> well? Even if the host isn't using the SPE HW? That's not acceptable
> as, in most cases, only the host /or/ the guest will use SPE. Here, you
> put a measurable overhead on each exit.
>
> If the host is not using SPE, then the restore/save should happen in
> vcpu_load/vcpu_put. Only if the host is using SPE should you do
> something in the run loop. Of course, this only applies to VHE and
> non-VHE must switch eagerly.
>
On VHE where SPE is used in the guest only - we save/restore in
vcpu_load/put.

On VHE where SPE is used in the host only - we save/restore in the
run loop.

On VHE where SPE is used in guest and host - we save/restore in the
run loop.

As the guest can't trace EL2 it doesn't matter if we restore guest SPE
early in the vcpu_load/put functions. (I assume it doesn't matter that
we restore an EL0/EL1 profiling buffer address at this point and enable
tracing, given that there is nothing to trace until entering the guest).

However, when the host is using SPE we save/restore in the run loop
rather than in vcpu_load/put, in order to minimise the host EL2
black-out window.

On nVHE we always save/restore in the run loop. For the SPE
guest-use-only use-case we can't save/restore in vcpu_load/put, because
the guest runs at the same ELx level as the host, and thus doing so
would result in the guest tracing part of the host.

Though if we determine that (for nVHE systems) the guest SPE is
profiling only EL0, then we could also save/restore in vcpu_load/put
where SPE is only being used in the guest.
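
To make the cases above concrete, here is a rough sketch of the decision
as I understand it. This is illustration only: the enum, the function
name and the three boolean inputs are hypothetical and do not exist in
the patch, and the "nobody uses SPE" case on VHE is my own assumption.
It also leaves out the possible nVHE EL0-only optimisation.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of the patch. */
enum spe_switch_point {
	SPE_SWITCH_NONE,	/* assumption: VHE and nobody uses SPE */
	SPE_SWITCH_LOAD_PUT,	/* lazy switch in vcpu_load/vcpu_put */
	SPE_SWITCH_RUN_LOOP,	/* eager switch around guest entry/exit */
};

static enum spe_switch_point
spe_pick_switch_point(bool vhe, bool host_uses_spe, bool guest_uses_spe)
{
	/*
	 * nVHE: the guest runs at the same exception level as the host,
	 * so a lazy switch would let the guest trace part of the host.
	 * Always switch in the run loop.
	 */
	if (!vhe)
		return SPE_SWITCH_RUN_LOOP;

	/* VHE, host profiling: keep the host black-out window small. */
	if (host_uses_spe)
		return SPE_SWITCH_RUN_LOOP;

	/* VHE, guest only: the guest can't trace EL2, so switch lazily. */
	if (guest_uses_spe)
		return SPE_SWITCH_LOAD_PUT;

	return SPE_SWITCH_NONE;
}
```
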
Does that make sense, are my reasons correct?