On 06/21/2019 10:38 AM, Marc Zyngier wrote: > From: Andre Przywara <andre.przywara@xxxxxxx> > > Whenever we need to restore the guest's system registers to the CPU, we > now need to take care of the EL2 system registers as well. Most of them > are accessed via traps only, but some have an immediate effect and also > a guest running in VHE mode would expect them to be accessible via their > EL1 encoding, which we do not trap. > > Split the current __sysreg_{save,restore}_el1_state() functions into > handling common sysregs, then differentiate between the guest running in > vEL2 and vEL1. > > For vEL2 we write the virtual EL2 registers with an identical format directly > into their EL1 counterpart, and translate the few registers that have a > different format for the same effect on the execution when running a > non-VHE guest guest hypervisor. > > [ Commit message reworked and many bug fixes applied by Marc Zyngier > and Christoffer Dall. ] > > Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx> > Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx> > Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxx> > --- > arch/arm64/kvm/hyp/sysreg-sr.c | 160 +++++++++++++++++++++++++++++++-- > 1 file changed, 153 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c > index 62866a68e852..2abb9c3ff24f 100644 > --- a/arch/arm64/kvm/hyp/sysreg-sr.c > +++ b/arch/arm64/kvm/hyp/sysreg-sr.c [...] > @@ -124,10 +167,91 @@ static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context *ctxt) > write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0); > } > > -static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) > +static void __sysreg_restore_vel2_state(struct kvm_cpu_context *ctxt) > { > + u64 val; > + > + write_sysreg(read_cpuid_id(), vpidr_el2); > write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2); > - write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1); > + write_sysreg_el1(ctxt->sys_regs[MAIR_EL2], SYS_MAIR); > + write_sysreg_el1(ctxt->sys_regs[VBAR_EL2], SYS_VBAR); > + write_sysreg_el1(ctxt->sys_regs[CONTEXTIDR_EL2],SYS_CONTEXTIDR); > + write_sysreg_el1(ctxt->sys_regs[AMAIR_EL2], SYS_AMAIR); > + > + if (__vcpu_el2_e2h_is_set(ctxt)) { > + /* > + * In VHE mode those registers are compatible between > + * EL1 and EL2. > + */ > + write_sysreg_el1(ctxt->sys_regs[SCTLR_EL2], SYS_SCTLR); > + write_sysreg_el1(ctxt->sys_regs[CPTR_EL2], SYS_CPACR); > + write_sysreg_el1(ctxt->sys_regs[TTBR0_EL2], SYS_TTBR0); > + write_sysreg_el1(ctxt->sys_regs[TTBR1_EL2], SYS_TTBR1); > + write_sysreg_el1(ctxt->sys_regs[TCR_EL2], SYS_TCR); > + write_sysreg_el1(ctxt->sys_regs[CNTHCTL_EL2], SYS_CNTKCTL); > + } else { > + write_sysreg_el1(translate_sctlr(ctxt->sys_regs[SCTLR_EL2]), > + SYS_SCTLR); > + write_sysreg_el1(translate_cptr(ctxt->sys_regs[CPTR_EL2]), > + SYS_CPACR); > + write_sysreg_el1(translate_ttbr0(ctxt->sys_regs[TTBR0_EL2]), > + SYS_TTBR0); > + write_sysreg_el1(translate_tcr(ctxt->sys_regs[TCR_EL2]), > + SYS_TCR); > + write_sysreg_el1(translate_cnthctl(ctxt->sys_regs[CNTHCTL_EL2]), > + SYS_CNTKCTL); > + } > + > + /* > + * These registers can be modified behind our back by a fault > + * taken inside vEL2. Save them, always. > + */ > + write_sysreg_el1(ctxt->sys_regs[ESR_EL2], SYS_ESR); > + write_sysreg_el1(ctxt->sys_regs[AFSR0_EL2], SYS_AFSR0); > + write_sysreg_el1(ctxt->sys_regs[AFSR1_EL2], SYS_AFSR1); > + write_sysreg_el1(ctxt->sys_regs[FAR_EL2], SYS_FAR); > + write_sysreg(ctxt->sys_regs[SP_EL2], sp_el1); > + write_sysreg_el1(ctxt->sys_regs[ELR_EL2], SYS_ELR); > + > + val = __fixup_spsr_el2_write(ctxt, ctxt->sys_regs[SPSR_EL2]); > + write_sysreg_el1(val, SYS_SPSR); > +} > + > +static void __hyp_text __sysreg_restore_vel1_state(struct kvm_cpu_context *ctxt) > +{ > + u64 mpidr; > + > + if (has_vhe()) { > + struct kvm_vcpu *vcpu; > + > + /* > + * Warning: this hack only works on VHE, because we only > + * call this with the *guest* context, which is part of > + * struct kvm_vcpu. On a host context, you'd get pure junk. > + */ > + vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt); This seems very fragile, just to find out whether the guest has hyp capabilities. It would be at least nice to make sure this is indeed a guest context. The *clean* way to do it could be to have a pointer to kvm_vcpu in the kvm_cpu_context which would be NULL for host contexts. Otherwise, I'm under the impression that for a host context, ctxt->sys_reg[HCR_EL2] == 0 and that this would also be true for a guest without nested virt capability. Could we use something like that here? Cheers, Julien