Re: [PATCH v8 003/103] KVM: Refactor CPU compatibility check on module initialization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2022-08-15 at 22:35 +0000, Sean Christopherson wrote:
> On Fri, Aug 12, 2022, Huang, Kai wrote:
> > On Thu, 2022-08-11 at 17:39 +0000, Sean Christopherson wrote:
> > > I've been poking at the "hardware enable" code this week for other reasons, and
> > > have come to the conclusion that the current implementation is a mess.
> > 
> > Thanks for the lengthy reply :)
> > 
> > First of all, to clarify, I guess by "current implementation" you mean the
> > current upstream KVM code, but not this particular patch? :)
> 
> Yeah, upstream code.
> 
> > > Of course, that path is broken for other reasons too, e.g. needs to prevent CPUs
> > > from going on/off-line when KVM is enabling hardware.
> > > https://lore.kernel.org/all/20220216031528.92558-7-chao.gao@xxxxxxxxx
> > 
> > If I read correctly, the problem described in above link seems only to be true
> > after we move CPUHP_AP_KVM_STARTING from STARTING section to ONLINE section, but
> > this hasn't been done yet in the current upstream KVM.  Currently,
> > CPUHP_AP_KVM_STARTING is still in STARTING section so it is guaranteed it has
> > been executed before start_secondary sets itself to online cpu mask. 
> 
> The lurking issue is that for_each_online_cpu() can against hotplug, i.e. every
> instance of for_each_online_cpu() in KVM is buggy (at least on the x86 side, I
> can't tell at a glance whether or not arm pKVM's usage is safe).
> 
> https://lore.kernel.org/all/87bl20aa72.ffs@tglx

Yes agreed. for_each_online_cpu() can race with CPU hotplug.

But the fact is looks there are many places using for_each_online_cpus() w/o
holding cpus_read_lock().  :)

> 
> > Btw I saw v4 of Chao's patchset was sent Feb this year.  It seems that series
> > indeed improved CPU compatibility check and hotplug handling.  Any reason that
> > series wasn't merged?
> 
> AFAIK it was just a lack of reviews/acks for the non-KVM patches.
> 
> > Also agreed that kvm_lock should be used.  But I am not sure whether
> > cpus_read_lock() is needed (whether CPU hotplug should be prevented).  In
> > current KVM, we don't do CPU compatibility check for hotplug CPU anyway, so when
> > KVM does CPU compatibility check using for_each_online_cpu(), if CPU hotplug
> > (hot-removal) happens, the worst case is we lose compatibility check on that
> > CPU.
> > 
> > Or perhaps I am missing something?
> 
> On a hot-add of an incompatible CPU, KVM would potentially skip the compatibility
> check and try to enable hardware on an incompatible/broken CPU.

To resolve this, we need to do compatibility check before actually enabling
hardware on each cpu, as Chao's series did.  I don't see using cpus_read_lock()
alone can actually fix anything.

> 
> Another possible bug is the checking of hv_get_vp_assist_page(); hot-adding a
> CPU that failed to allocate the VP assist page while vmx_init() is checking online
> CPUs could result in a NULL pointer deref due to KVM not rejecting the CPU as it
> should.
> 

So we need Chao's series to fix those problems: 1) Do compatibility check before
actually enable the hardware for each cpu; 2) allow CPU hotplug to fail; 3) Hold
cpus_read_lock() when needed.


-- 
Thanks,
-Kai






[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux