Re: [PATCH v3] KVM: x86: Fix recording of guest steal time / preempted status

On 11/12/21 10:54, David Woodhouse wrote:
I'm also slightly less comfortable with having the MMU notifier work
through an arbitrary *list* of gfn_to_pfn caches that it potentially
needs to invalidate, but that is very much a minor concern compared
with the first.

I started looking through the nested code which is the big user of this
facility.

Yes, that's also where I got stuck in my first attempt a few months ago.
I agree that it can be changed to use gfn-to-hva caches, except for
the vmcs12->posted_intr_desc_addr and vmcs12->virtual_apic_page_addr.

... that anything accessing these will *still* need to do so in atomic
context. There's an atomic access which might fail, and then you fall
back to a context in which you can sleep to refresh the mapping, and
you *still* need to perform the actual access with the spinlock held to
protect against concurrent invalidation.
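The access pattern described above (try under the lock, fall back to a sleepable refresh, retry) can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: every demo_* name is hypothetical, a pthread mutex stands in for the kernel spinlock, and none of this is the actual KVM gfn_to_pfn cache API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a cached guest mapping. */
struct demo_cache {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	bool valid;		/* cleared by the invalidation side */
	uint64_t *mapping;	/* only meaningful while valid is true */
	uint64_t backing;	/* pretend guest memory */
};

static void demo_cache_init(struct demo_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->valid = false;
	c->mapping = NULL;
	c->backing = 0;
}

/* Invalidation side: what an MMU-notifier-like callback would do. */
static void demo_cache_invalidate(struct demo_cache *c)
{
	pthread_mutex_lock(&c->lock);
	c->valid = false;
	c->mapping = NULL;
	pthread_mutex_unlock(&c->lock);
}

/* Slow path: assumed to run where sleeping is allowed. */
static void demo_cache_refresh(struct demo_cache *c)
{
	/* A real lookup could sleep here, so it is done outside the lock. */
	uint64_t *m = &c->backing;

	pthread_mutex_lock(&c->lock);
	c->mapping = m;
	c->valid = true;
	pthread_mutex_unlock(&c->lock);
}

/* Fast path: the access itself happens with the lock held. */
static bool demo_cache_try_write(struct demo_cache *c, uint64_t val)
{
	bool ok = false;

	pthread_mutex_lock(&c->lock);
	if (c->valid) {
		assert(c->mapping != NULL);
		*c->mapping = val;
		ok = true;
	}
	pthread_mutex_unlock(&c->lock);
	return ok;
}

/* Caller in sleepable context: retry until the locked access succeeds. */
static void demo_cache_write(struct demo_cache *c, uint64_t val)
{
	while (!demo_cache_try_write(c, val))
		demo_cache_refresh(c);
}
```

A caller that cannot sleep would use only demo_cache_try_write() and defer the refresh when it fails. The retry loop matters because an invalidation can land between the refresh and the next locked attempt.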

So let's take a look... for posted_intr_desc_addr, that host physical
address is actually written to the VMCS02, isn't it?

Thinking about the case where the target page is being invalidated
while the vCPU is running... surely in that case the only 'correct'
solution is that the vCPU needs to be kicked out of non-root mode
before the invalidate_range() notifier completes?

Yes.

That would have worked nicely if the MMU notifier could call
synchronize_srcu() on invalidation. Can it kick the vCPU and wait for
it to exit though?

Yes, there's kvm_make_all_cpus_request() (see
kvm_arch_mmu_notifier_invalidate_range()). It can sleep, which is
theoretically wrong, but in practice non-blockable invalidations only
occur from the OOM reaper, so no CPU can be running. If we care, we can
return early from kvm_arch_mmu_notifier_invalidate_range() for
non-blockable invalidations.
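To make the "kick and wait" ordering concrete, here is a rough user-space sketch using C11 atomics. It is not the KVM implementation: the demo_* names and the two flags are hypothetical, there is no IPI (a thread playing the vCPU is assumed to poll the request flag and call demo_vcpu_exit()), and the may_block parameter only models the "return early for non-blockable invalidations" idea.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state; not the real struct kvm_vcpu. */
struct demo_vcpu {
	atomic_bool in_guest;		/* true while "in non-root mode" */
	atomic_bool reload_requested;	/* set by the invalidation side */
};

static void demo_vcpu_init(struct demo_vcpu *v)
{
	atomic_init(&v->in_guest, false);
	atomic_init(&v->reload_requested, false);
}

/*
 * vCPU side: publish in_guest first, then check for a pending request.
 * Returns false if entry must be aborted so the request can be handled.
 */
static bool demo_vcpu_enter(struct demo_vcpu *v)
{
	atomic_store(&v->in_guest, true);
	if (atomic_load(&v->reload_requested)) {
		atomic_store(&v->in_guest, false);
		return false;
	}
	return true;
}

static void demo_vcpu_exit(struct demo_vcpu *v)
{
	atomic_store(&v->in_guest, false);
}

/* vCPU side: called after the stale mapping has been refreshed. */
static void demo_vcpu_handle_request(struct demo_vcpu *v)
{
	atomic_store(&v->reload_requested, false);
}

/*
 * Invalidation side: raise the request, then wait until the vCPU is out
 * of guest mode. Returns false without doing anything when the caller
 * is not allowed to block.
 */
static bool demo_invalidate(struct demo_vcpu *v, bool may_block)
{
	if (!may_block)
		return false;

	atomic_store(&v->reload_requested, true);
	while (atomic_load(&v->in_guest))
		sched_yield();	/* stand-in for waiting on the kick */
	return true;
}
```

The point of the ordering is that the vCPU sets in_guest before checking the request, while the invalidation side sets the request before checking in_guest. With sequentially consistent atomics, at least one side observes the other, so the invalidation cannot return while the vCPU is still running on the stale address.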

Don't get me wrong, a big part of me *loves* the idea that the hairiest
part of my Xen event channel delivery is actually a bug fix that we
need in the kernel anyway, and then the rest of it is simple and
uncontentious.

(ISTR the virtual apic page is a bit different because it's only an
*address* and it doesn't even have to be backed by real memory at the
corresponding HPA? Otherwise it's basically the same issue?)

We do back it by real memory anyway, so it's the same.

Paolo



