+Peter On Mon, Mar 20, 2023, Alexander Graf wrote: > With protected state (like SEV-ES and SEV-SNP), KVM does not have direct > access to guest registers. However, we deflect modifications to CR0, Please avoid pronouns in changelogs and comments. > CR4 and EFER to the host. We also carry the apic_base register and learn > about CR8 directly from a VMCB field. > > That means these bits of information do exist in the host's KVM data > structures. If we ever want to resume consumption of an already > initialized VMSA (guest state), we will need to also restore these > additional bits of KVM state. For some definitions of "need". I've looked at this code multiple times in the past, and even posted patches[1], but I'm still unconvinced that trapping CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix this stalled out. : If KVM chugs along happily without these patches, I'd love to pivot and yank out : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop : as well. [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@xxxxxxxxxx [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@xxxxxxxxxx > Prepare ourselves for such a world by allowing set_sregs to set the > relevant fields. This way, it mirrors get_sregs properly that already > exposes them to user space. > > Signed-off-by: Alexander Graf <graf@xxxxxxxxxx> > --- > arch/x86/kvm/x86.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 7713420abab0..88fa8b657a2f 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > int idx; > struct desc_ptr dt; > > - if (!kvm_is_valid_sregs(vcpu, sregs)) > + if (!vcpu->arch.guest_state_protected && > + !kvm_is_valid_sregs(vcpu, sregs)) This is broken, userspace shouldn't be allowed to stuff complete garbage just because guest state is protected. > return -EINVAL; > > apic_base_msr.data = sregs->apic_base; > @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > if (kvm_set_apic_base(vcpu, &apic_base_msr)) > return -EINVAL; > > - if (vcpu->arch.guest_state_protected) > + if (vcpu->arch.guest_state_protected) { > + /* > + * While most actual guest state is hidden from us, > + * CR{0,4,8}, efer and apic_base still hold KVM state > + * with protection enabled, so let's allow restoring > + */ > + kvm_set_cr8(vcpu, sregs->cr8); > + kvm_x86_ops.set_efer(vcpu, sregs->efer); > + kvm_x86_ops.set_cr0(vcpu, sregs->cr0); Use static_call(). This code is also broken in that it doesn't set mmu_reset_needed. This patch also misses the problematic behavior in svm_set_cr0(). And I don't like having a completely separate one-off flow for protected guests. It's more code and arguably uglier, but I would prefer to explicitly skip each chunk as needed so that we aren't effectively maintaining multiple flows. Easiest thing is probably to just resurrect my aforementioned series, pending the result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER writes in the first place.