Il gio 6 mar 2025, 21:44 Sean Christopherson <seanjc@xxxxxxxxxx> ha scritto: > > Allowing the use of kvm_load_host_xsave_state() is really ugly, especially > > since the corresponding code is so simple: > > > > if (cpu_feature_enabled(X86_FEATURE_PKU) && vcpu->arch.pkru != 0) > > wrpkru(vcpu->arch.host_pkru); > > It's clearly not "so simple", because this code is buggy. > > The justification for using kvm_load_host_xsave_state() is that either KVM gets > the TDX state model correct and the existing flows Just Work, or we handle all > that state as one-offs and at best replicate concepts and flows, and at worst > have bugs that are unique to TDX, e.g. because we get the "simple" code wrong, > we miss flows that subtly consume state, etc. A typo doesn't change the fact that kvm_load_host_xsave_state is optimized with knowledge of the guest CR0 and CR4; faking the values so that the same field means both "exit value" and "guest value", just so that the common code does the right thing for pkru/xcr0/xss, is unmaintainable and conceptually just wrong. And while the change for XSS (and possibly other MSRs) is actually correct, it should be justified for both SEV-ES/SNP and TDX rather than sneaked into the TDX patches. While there could be other flows that consume guest state, they're just as bound to do the wrong thing if vcpu->arch is only guaranteed to be somehow plausible (think anything that for whatever reason uses cpu_role). There's no way the existing flows for !guest_state_protected should run _at all_ when the register state is not there. If they do, it's a bug and fixing them is the right thing to do (it may feel like whack-a-mole but isn't). See for example the KVM_CAP_SYNC_REGS calls that Adrian mentioned in the reply to 05/12, and for which I'll send a patch too: I haven't checked if it's possible, but I wonder if userspace could misuse sync_regs() to change guest xcr0 and xss, and trick the TDX *host* into running with some bits of xcr0 and xss unexpectedly cleared. TDX provides well known values for the host for all three registers that are being restored here, so there's no need to reuse code that is meant to do something completely different. I'm not singling out TDX, the same would be true for SEV-ES/SNP swap type C. Paolo Paolo