On Tue, Mar 01, 2022, Maxim Levitsky wrote: > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > > > index 80a2020c4db40..8d35f56c64020 100644 > > > --- a/arch/x86/kvm/lapic.c > > > +++ b/arch/x86/kvm/lapic.c > > > @@ -2618,15 +2618,14 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, > > > u32 *ldr = (u32 *)(s->regs + APIC_LDR); > > > u64 icr; > > > > > > - if (vcpu->kvm->arch.x2apic_format) { > > > - if (*id != vcpu->vcpu_id) > > > - return -EINVAL; > > > - } else { > > > - if (set) > > > - *id >>= 24; > > > - else > > > - *id <<= 24; > > > - } > > > + if (!vcpu->kvm->arch.x2apic_format && set) > > > + *id >>= 24; > > > + > > > + if (*id != vcpu->vcpu_id) > > > + return -EINVAL; > > > > This breaks backwards compability, userspace will start failing where it previously > > succeeded. It doesn't even require a malicious userspace, e.g. if userspace creates > > a vCPU with a vcpu_id > 255 vCPUs, the shift will truncate and cause failure. It'll > > obviously do weird things, but this code is 5.5 years old, I don't think it's worth > > trying to close a loophole that doesn't harm KVM. > > > > If we provide a way for userspace to opt into disallowiong APICID != vcpu_id, we > > can address this further upstream, e.g. reject vcpu_id > 255 without x2apic_format. > > This check is only when CPU is in x2apic mode. In this mode the X86 specs demands that > apic_id == vcpu_id. AMD's APM explicitly allows changing the x2APIC ID by writing the xAPIC ID and then switching to x2APIC mode. > When old API is used, APIC IDs are encoded in xapic format, even when vCPU is in x2apic > mode, meaning that apic id can't be more that 255 even in x2apic mode. Even on Intel CPUs, if userspace creates a vCPU with vcpu_id = 256, then the xAPIC ID will be (256 << 24) == 0. If userspace does get+set, then KVM will see 0 != 256 and reject the set with return -EINVAL. And as above, userspace that hasn't opted into the x2apic_format could also legitimately change the x2APIC ID. I 100% agree this is a mess and probably can't work without major shenanigans, but on the other hand this isn't harming anything, so why risk breaking userspace, even if the risk is infinitesimally small? I'm all for locking down the APIC ID with a userspace opt-in control, I just don't think it's worth trying to retroactively plug the various holes in KVM.