On Thu, 08 Apr 2021 16:36:40 +0100,
Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote:
>
> Hi Marc,
>
> On 4/7/21 7:13 PM, Marc Zyngier wrote:
> > On vcpu reset, we expect all the registers to be brought back
> > to their initial state, which happens to be a bunch of zeroes.
> >
> > However, some recent commit broke this, and is now leaving a bunch
> > of registers (such as a FP state) with whatever was left by the
> > guest. My bad.
> >
> > Just zero the whole vcpu context on reset. It is more than we
> > strictly need, but at least we won't miss anything. This also
> > zeroes the __hyp_running_vcpu pointer, which is always NULL
> > for a vcpu anyway.
>
> Had a look at struct kvm_cpu_context and indeed the only field which doesn't
> represent a guest register is __hyp_running_vcpu. Did a grep for all the places
> where __hyp_running_vcpu is used, and indeed the assumption is that for a guest
> the pointer is NULL, as __sysreg_restore_el1_state() relies on it.
>
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Fixes: e47c2055c68e ("KVM: arm64: Make struct kvm_regs userspace-only")
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > ---
> >  arch/arm64/kvm/reset.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index bd354cd45d28..ef1c49a1a3ad 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -240,8 +240,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  		break;
> >  	}
> >
> > -	/* Reset core registers */
> > -	memset(vcpu_gp_regs(vcpu), 0, sizeof(*vcpu_gp_regs(vcpu)));
> > +	/* Zero all registers */
> > +	memset(&vcpu->arch.ctxt, 0, sizeof(vcpu->arch.ctxt));
>
> Checked that code earlier in the function does not touch the guest
> registers from vcpu->arch.ctxt, to make sure we're not overwriting
> other reset values by mistake.
>
> Looks good to me:
>
> Reviewed-by: Alexandru Elisei <alexandru.elisei@xxxxxxx>

Scratch that, this breaks the setting of CNTVOFF, which gets
populated when we create the vcpu. The gotcha is that creating a vcpu
resets CNTVOFF for *all* vcpus:

* If the VMM creates all vcpus, then resets them all, this works
  "fine": all the vcpus have CNTVOFF==0, which is an acceptable
  departure from the current behaviour (where vtime starts at 0).

* If the VMM alternates vcpu creation and reset, then the last vcpu
  ends up with a CNTVOFF set to 0, while all the others have a
  different offset.

QEMU does the former, and kvmtool the latter. Thanks to Will for the
heads up.

I'll drop the patch from -next and post a v2 shortly.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
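
For illustration only, the CNTVOFF ordering issue described in the two
bullets above can be modelled with a small user-space C sketch.
Everything in it (toy_ctxt, toy_vm, toy_create_vcpu(), toy_reset_vcpu()
and the fake counter values) is hypothetical and is not the kernel's
code. It only assumes what is stated above: creating a vcpu rewrites the
offset of every vcpu created so far, and the reset from the patch zeroes
the whole context, offset included.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical stand-in for the per-vcpu context, NOT the kernel struct. */
struct toy_ctxt {
	uint64_t gp_regs[4];
	uint64_t cntvoff;	/* virtual counter offset lives in the context */
};

struct toy_vm {
	struct toy_ctxt vcpu[TOY_MAX_VCPUS];
	int nr_vcpus;
};

/* Assumption: creating a vcpu sets the offset of *all* vcpus so far. */
static int toy_create_vcpu(struct toy_vm *vm, uint64_t now)
{
	int i;

	if (vm->nr_vcpus >= TOY_MAX_VCPUS)
		return -1;

	vm->nr_vcpus++;
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpu[i].cntvoff = now;

	return 0;
}

/* Reset as in the patch under discussion: zero the whole context. */
static void toy_reset_vcpu(struct toy_vm *vm, int idx)
{
	memset(&vm->vcpu[idx], 0, sizeof(vm->vcpu[idx]));
}

/* First ordering: create every vcpu, then reset every vcpu. */
static void toy_create_then_reset(struct toy_vm *vm, int n)
{
	int i;

	memset(vm, 0, sizeof(*vm));
	for (i = 0; i < n; i++)
		toy_create_vcpu(vm, 1000 + i);	/* fake counter reads */
	for (i = 0; i < n; i++)
		toy_reset_vcpu(vm, i);
}

/* Second ordering: create one vcpu, reset it, move on to the next. */
static void toy_alternate(struct toy_vm *vm, int n)
{
	int i;

	memset(vm, 0, sizeof(*vm));
	for (i = 0; i < n; i++) {
		toy_create_vcpu(vm, 1000 + i);
		toy_reset_vcpu(vm, i);
	}
}

/* Runs both orderings with three vcpus and checks the resulting offsets. */
static void toy_demo(void)
{
	struct toy_vm vm;

	toy_create_then_reset(&vm, 3);
	assert(vm.vcpu[0].cntvoff == vm.vcpu[2].cntvoff);	/* all zero */

	toy_alternate(&vm, 3);
	assert(vm.vcpu[0].cntvoff != vm.vcpu[2].cntvoff);	/* last one differs */
}
```

Calling toy_demo() from a main() exercises both orderings. In this toy
model the first ordering leaves every offset at 0, while the second
leaves only the last vcpu at 0 and the earlier ones at the value written
by the final creation, which is the mismatch described above.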