On Wed, Feb 05, 2025, David Woodhouse wrote: > On Wed, 2025-02-05 at 07:06 -0800, Sean Christopherson wrote: > > On Wed, Feb 05, 2025, David Woodhouse wrote: > > > Especially as there is a corresponding requirement that they never be set > > > from host context (which is where the potential locking issues come in). > > > Which train of thought leads me to ponder this as an alternative (or > > > additional) solution: > > > > > > --- a/arch/x86/kvm/x86.c > > > +++ b/arch/x86/kvm/x86.c > > > @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > u32 msr = msr_info->index; > > > u64 data = msr_info->data; > > > > > > - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) > > > + /* > > > + * Do not allow host-initiated writes to trigger the Xen hypercall > > > + * page setup; it could incur locking paths which are not expected > > > + * if userspace sets the MSR in an unusual location. > > > > That's just as likely to break userspace. Doing a save/restore on the MSR doesn't > > make a whole lot of sense since it's effectively a "command" MSR, but IMO it's not > > any less likely than userspace putting the MSR index outside of the synthetic range. > > Save/restore on the MSR makes no sense. It's a write-only MSR; writing > to it has no effect *other* than populating the target page. In KVM we > don't implement reading from it at all; I don't think Xen does either? Hah, that's another KVM bug, technically. KVM relies on the MSR not being handled in order to generate the write-only semantics, but if the MSR index collides with an MSR that KVM emulates, then the MSR would be readable. KVM supports Hyper-V's HV_X64_MSR_TSC_INVARIANT_CONTROL (0x40000118), so just a few hundred more MSRs until fireworks :-) If we want to close that hole, it'd be easy enough to add a check in kvm_get_msr_common(). > Those two happen in reverse chronological order, don't they? And in the > lower one the comment tells you that hyperv_enabled() doesn't work yet. > When the higher one is called later, it calls kvm_xen_init() *again* to > put the MSR in the right place. > > It could be prettier, but I don't think it's broken, is it? Gah, -ENOCOFFEE. > > Userspace breakage aside, disallowng host writes would fix the immediate issue, > > and I think would mitigate all concerns with putting the host at risk. But it's > > not enough to actually make an overlapping MSR index work. E.g. if the MSR is > > passed through to the guest, the write will go through to the hardware MSR, unless > > the WRMSR happens to be emulated. > > > > I really don't want to broadly support redirecting any MSR, because to truly go > > down that path we'd need to deal with x2APIC, EFER, and other MSRs that have > > special treatment and meaning. > > > > While KVM's stance is usually that a misconfigured vCPU model is userspace's > > problem, in this case I don't see any value in letting userspace be stupid. It > > can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's > > a crazy use case I'm overlooking, there's no sane reason for userspace to put the > > index in outside of the synthetic range (whereas defining seemingly nonsensical > > CPUID feature bits is useful for testing purposes, implementing support in > > userspace, etc). > > Right, I think we should do *both*. Blocking host writes solves the > issue of locking problems with the hypercall page setup. All it would > take for that issue to recur is for us (or Microsoft) to invent a new > MSR in the synthetic range which is also written on vCPU init/reset. > And then the sanity check on where the VMM puts the Xen MSR doesn't > save us. Ugh, indeed. MSRs are quite the conundrum. Userspace MSR filters have a similar problem, where it's impossible to know the semantics of future hardware MSRs, and so it's impossible to document which MSRs userspace is allowed to intercept :-/ Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. increment the index until an available index is found (with sanity checks and whatnot). > But yes, we should *also* do that sanity check. Ah, I'm a-ok with that.