On Fri, May 26, 2023, Michal Luczaj wrote: > On 5/26/23 02:07, Sean Christopherson wrote: > > On Thu, May 25, 2023, Michal Luczaj wrote: > >> @@ -265,10 +265,14 @@ static int kvm_recalculate_phys_map(struct kvm_apic_map *new, > >> * mapped, i.e. is aliased to multiple vCPUs. The optimized > >> * map requires a strict 1:1 mapping between IDs and vCPUs. > >> */ > >> - if (apic_x2apic_mode(apic)) > >> + if (apic_x2apic_mode(apic)) { > >> + if (x2apic_id > new->max_apic_id) > >> + return -EINVAL; > > > > Hmm, disabling the optimized map just because userspace created a new vCPU is > > unfortunate and unnecessary. Rather than return -EINVAL and only perform the > > check when x2APIC is enabled, what if we instead do the check immediately and > > return -E2BIG? Then the caller can retry with a bigger array size. Preemption > > is enabled and retries are bounded by the number of possible vCPUs, so I don't > > see any obvious issues with retrying. > > Right, that makes perfect sense. > > Just a note, it changes the logic a bit: > > - x2apic_format: an overflowing x2apic_id won't be silently ignored. Nit, I wouldn't describe the current behavior as silently ignored. KVM doesn't ignore the case, KVM instead disables the optimized map. > - !x2apic_format: -E2BIG even for !apic_x2apic_mode() leads to an realloc > instead of "new->phys_map[xapic_id] = apic" right away. > > > @@ -228,6 +228,12 @@ static int kvm_recalculate_phys_map(struct kvm_apic_map *new, > > u32 xapic_id = kvm_xapic_id(apic); > > u32 physical_id; > > > > + if (WARN_ON_ONCE(xapic_id >= new->max_apic_id)) > > + return -EINVAL; > > Shouldn't it be ">" instead of ">="? /facepalm Yes, I was reading it as the number of IDs, not the max. > That said, xapic_id > new->max_apic_id means something terrible has happened as > kvm_xapic_id() returns u8 and max_apic_id should never be less than 255. Does > this qualify for KVM_BUG_ON? I don't think so? The intent of the WARN is mostly to document that KVM always allocates enough space for xAPIC IDs, and to guard against that changing in the future. In the latter case, there's no need to kill the VM despite there being a KVM bug since running with the optimized map disabled is functionally ok. If the WARN fires because of host data corruption, then so be it. > > + if (x2apic_id >= new->max_apic_id) > > + return -E2BIG; > > Probably ">"? Ya. > > @@ -366,6 +371,7 @@ void kvm_recalculate_apic_map(struct kvm *kvm) > > unsigned long i; > > u32 max_id = 255; /* enough space for any xAPIC ID */ > > bool xapic_id_mismatch = false; > > + int r; > > > > /* Read kvm->arch.apic_map_dirty before kvm->arch.apic_map. */ > > if (atomic_read_acquire(&kvm->arch.apic_map_dirty) == CLEAN) > > @@ -386,6 +392,7 @@ void kvm_recalculate_apic_map(struct kvm *kvm) > > return; > > } > > > > +retry: > > kvm_for_each_vcpu(i, vcpu, kvm) > > if (kvm_apic_present(vcpu)) > > max_id = max(max_id, kvm_x2apic_id(vcpu->arch.apic)); > > @@ -404,9 +411,13 @@ void kvm_recalculate_apic_map(struct kvm *kvm) > > if (!kvm_apic_present(vcpu)) > > continue; > > > > - if (kvm_recalculate_phys_map(new, vcpu, &xapic_id_mismatch)) { > > + r = kvm_recalculate_phys_map(new, vcpu, &xapic_id_mismatch); > > + if (r) { > > kvfree(new); > > new = NULL; > > + if (r == -E2BIG) > > + goto retry; > > + > > goto out; > > } > > Maybe it's not important, but what about moving xapic_id_mismatch > (re)initialization after "retry:"? Oof, good catch. I think it makes sense to move max_id (re)initialization too, even though I can't imagine it would matter in practice.