Re: [PATCH 1/3] KVM: x86: Fix out-of-bounds access in kvm_recalculate_phys_map()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 26, 2023, Michal Luczaj wrote:
> On 5/26/23 18:17, Sean Christopherson wrote:
> > On Fri, May 26, 2023, Michal Luczaj wrote:
> >> On 5/26/23 02:07, Sean Christopherson wrote:
> >>> On Thu, May 25, 2023, Michal Luczaj wrote:
> >>>> @@ -265,10 +265,14 @@ static int kvm_recalculate_phys_map(struct kvm_apic_map *new,
> >>>>  		 * mapped, i.e. is aliased to multiple vCPUs.  The optimized
> >>>>  		 * map requires a strict 1:1 mapping between IDs and vCPUs.
> >>>>  		 */
> >>>> -		if (apic_x2apic_mode(apic))
> >>>> +		if (apic_x2apic_mode(apic)) {
> >>>> +			if (x2apic_id > new->max_apic_id)
> >>>> +				return -EINVAL;
> >>>
> >>> Hmm, disabling the optimized map just because userspace created a new vCPU is
> >>> unfortunate and unnecessary.  Rather than return -EINVAL and only perform the
> >>> check when x2APIC is enabled, what if we instead do the check immediately and
> >>> return -E2BIG?  Then the caller can retry with a bigger array size.  Preemption
> >>> is enabled and retries are bounded by the number of possible vCPUs, so I don't
> >>> see any obvious issues with retrying.
> >>
> >> Right, that makes perfect sense.
> >>
> >> Just a note, it changes the logic a bit:
> >>
> >> - x2apic_format: an overflowing x2apic_id won't be silently ignored.
> > 
> > Nit, I wouldn't describe the current behavior as silently ignored.  KVM doesn't
> > ignore the case, KVM instead disables the optimized map.
> 
> I may be misusing "silently ignored",

You're not.

> but currently if (x2apic_format && apic_x2apic_mode && x2apic_id >
> new->max_apic_id) new->phys_map[x2apic_id] remains unchanged, then
> kvm_recalculate_phys_map() returns 0 (not -EINVAL).  I.e. this does not
> result in rcu_assign_pointer(kvm->arch.apic_map, NULL).

My bad.  I was thinking of the -EINVAL from your patch.

Hmm, thinking more about the "silently ignored" case, it's only temporarily
ignored.  If a x2APIC ID is out-of-bounds, then there had to have been a write
to a vCPU's x2APIC ID between the two instances of kvm_for_each_vcpu(), and that
means that whatever changed the ID must also mark apic_map_dirty DIRTY and call
kvm_recalculate_apic_map().  I.e. another recalc will soon occur and cleanup the
mess.

There's still value in retrying, but it's not as much as an optimization as it
first appears.  That means the bug fix can be a separate patch from the retry.
Oh, and the retry can also redo the DIRTY=>IN_PROGRESS so that whatever modified
a vCPU's x2APIC ID doesn't perform an unnecessary recalculation.

E.g. this (plus comments) for stable@

---
 arch/x86/kvm/lapic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e542cf285b51..7a83df7f45c5 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -228,6 +228,12 @@ static int kvm_recalculate_phys_map(struct kvm_apic_map *new,
 	u32 xapic_id = kvm_xapic_id(apic);
 	u32 physical_id;
 
+	if (WARN_ON_ONCE(xapic_id >= new->max_apic_id))
+		return -EINVAL;
+
+	if (x2apic_id >= new->max_apic_id)
+		return -E2BIG;
+
 	/*
 	 * Deliberately truncate the vCPU ID when detecting a mismatched APIC
 	 * ID to avoid false positives if the vCPU ID, i.e. x2APIC ID, is a
@@ -253,8 +259,7 @@ static int kvm_recalculate_phys_map(struct kvm_apic_map *new,
 	 */
 	if (vcpu->kvm->arch.x2apic_format) {
 		/* See also kvm_apic_match_physical_addr(). */
-		if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
-			x2apic_id <= new->max_apic_id)
+		if (apic_x2apic_mode(apic) || x2apic_id > 0xff)
 			new->phys_map[x2apic_id] = apic;
 
 		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])

base-commit: 39428f6ea9eace95011681628717062ff7f5eb5f
-- 


and then as a pure optimization

---
 arch/x86/kvm/lapic.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7a83df7f45c5..09e13ded34ef 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -369,8 +369,9 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 	struct kvm_apic_map *new, *old = NULL;
 	struct kvm_vcpu *vcpu;
 	unsigned long i;
-	u32 max_id = 255; /* enough space for any xAPIC ID */
-	bool xapic_id_mismatch = false;
+	u32 max_id;
+	bool xapic_id_mismatch;
+	int r;
 
 	/* Read kvm->arch.apic_map_dirty before kvm->arch.apic_map.  */
 	if (atomic_read_acquire(&kvm->arch.apic_map_dirty) == CLEAN)
@@ -380,9 +381,14 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		  "Dirty APIC map without an in-kernel local APIC");
 
 	mutex_lock(&kvm->arch.apic_map_lock);
+
+retry:
 	/*
-	 * Read kvm->arch.apic_map_dirty before kvm->arch.apic_map
-	 * (if clean) or the APIC registers (if dirty).
+	 * Read kvm->arch.apic_map_dirty before kvm->arch.apic_map (if clean)
+	 * or the APIC registers (if dirty).  Note, on retry the map may have
+	 * not yet been marked dirty by whatever task changed a vCPU's x2APIC
+	 * ID, i.e. the map may still show up as in-progress.  In that case
+	 * this task still needs to retry and copmlete its calculation.
 	 */
 	if (atomic_cmpxchg_acquire(&kvm->arch.apic_map_dirty,
 				   DIRTY, UPDATE_IN_PROGRESS) == CLEAN) {
@@ -391,6 +397,9 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		return;
 	}
 
+	max_id = 255; /* enough space for any xAPIC ID */
+	xapic_id_mismatch = false;
+
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		if (kvm_apic_present(vcpu))
 			max_id = max(max_id, kvm_x2apic_id(vcpu->arch.apic));
@@ -409,9 +418,13 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		if (!kvm_apic_present(vcpu))
 			continue;
 
-		if (kvm_recalculate_phys_map(new, vcpu, &xapic_id_mismatch)) {
+		r = kvm_recalculate_phys_map(new, vcpu, &xapic_id_mismatch);
+		if (r) {
 			kvfree(new);
 			new = NULL;
+			if (r == -E2BIG)
+				goto retry;
+
 			goto out;
 		}
 

base-commit: 15cc2fa07b5867ae5a6f17656a6f272cfb393810
-- 



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux