On Tue, Mar 21, 2023 at 11:03 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > +Peter > > On Mon, Mar 20, 2023, Alexander Graf wrote: > > With protected state (like SEV-ES and SEV-SNP), KVM does not have direct > > access to guest registers. However, we deflect modifications to CR0, > > Please avoid pronouns in changelogs and comments. > > > CR4 and EFER to the host. We also carry the apic_base register and learn > > about CR8 directly from a VMCB field. > > > > That means these bits of information do exist in the host's KVM data > > structures. If we ever want to resume consumption of an already > > initialized VMSA (guest state), we will need to also restore these > > additional bits of KVM state. > > For some definitions of "need". I've looked at this code multiple times in the > past, and even posted patches[1], but I'm still unconvinced that trapping > CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix > this stalled out. > > : If KVM chugs along happily without these patches, I'd love to pivot and yank out > : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop > : as well. > > [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@xxxxxxxxxx > [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@xxxxxxxxxx Yea we are using similar patches to do intra-host migration for SNP VMs. I have dropped the ball on my AI from that thread. Let me look/test this patch. > > > Prepare ourselves for such a world by allowing set_sregs to set the > > relevant fields. This way, it mirrors get_sregs properly that already > > exposes them to user space. > > > > Signed-off-by: Alexander Graf <graf@xxxxxxxxxx> > > --- > > arch/x86/kvm/x86.c | 16 ++++++++++++++-- > > 1 file changed, 14 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 7713420abab0..88fa8b657a2f 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > > int idx; > > struct desc_ptr dt; > > > > - if (!kvm_is_valid_sregs(vcpu, sregs)) > > + if (!vcpu->arch.guest_state_protected && > > + !kvm_is_valid_sregs(vcpu, sregs)) > > This is broken, userspace shouldn't be allowed to stuff complete garbage just > because guest state is protected. Agreed, we've been using something like this: kvm_valid_sregs() - if (sregs->fer & EFER_LME && !vcpu->arch.guest_state_protected) { + if (sregs->efer & EFER_LME) { > > > return -EINVAL; > > > > apic_base_msr.data = sregs->apic_base; > > @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > > if (kvm_set_apic_base(vcpu, &apic_base_msr)) > > return -EINVAL; > > > > - if (vcpu->arch.guest_state_protected) > > + if (vcpu->arch.guest_state_protected) { > > + /* > > + * While most actual guest state is hidden from us, > > + * CR{0,4,8}, efer and apic_base still hold KVM state > > + * with protection enabled, so let's allow restoring > > + */ > > + kvm_set_cr8(vcpu, sregs->cr8); > > + kvm_x86_ops.set_efer(vcpu, sregs->efer); > > + kvm_x86_ops.set_cr0(vcpu, sregs->cr0); > > Use static_call(). This code is also broken in that it doesn't set mmu_reset_needed. > This patch also misses the problematic behavior in svm_set_cr0(). > > And I don't like having a completely separate one-off flow for protected guests. > It's more code and arguably uglier, but I would prefer to explicitly skip each > chunk as needed so that we aren't effectively maintaining multiple flows. > > Easiest thing is probably to just resurrect my aforementioned series, pending the > result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER > writes in the first place.