Re: [PATCH v3] KVM: x86: Fix recording of guest steal time / preempted status

David Woodhouse <dwmw2@xxxxxxxxxxxxx> · Fri, 12 Nov 2021 11:29:07 +0000

On Fri, 2021-11-12 at 11:49 +0100, Paolo Bonzini wrote:
> > That would have worked nicely if the MMU notifier could call
> > synchronize_srcu() on invalidation. Can it kick the vCPU and wait for
> > it to exit though?
> 
> Yes, there's kvm_make_all_cpus_request (see 
> kvm_arch_mmu_notifier_invalidate_range).  It can sleep, which is 
> theoretically wrong---but in practice non-blockable invalidations only 
> occur from the OOM reaper, so no CPU can be running.  If we care, we can 
> return early from kvm_arch_mmu_notifier_invalidate_range for 
> non-blockable invalidations.

OK, so these don't actually want any of that stuff with the rwlock and
the invalidation setting the pointer to KVM_UNMAPPED_PAGE that I did in
https://lore.kernel.org/kvm/20211101190314.17954-6-dwmw2@xxxxxxxxxxxxx/
for Xen event channels.

It looks like they want their own way of handling it; if the GPA being
invalidated matches posted_intr_desc_addr or virtual_apic_page_addr
then the MMU notifier just needs to call kvm_make_all_cpus_request()
with some suitable checking/WARN magic around the "we will never need
to sleep when we shouldn't" assertion that you made above.
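
As a rough illustration of that check, here is a userspace sketch, not
kernel code. Only posted_intr_desc_addr and virtual_apic_page_addr come
from the discussion above; struct vcpu_sketch, page_overlaps(),
invalidate_range_sketch() and the reload_requested flag are hypothetical
stand-ins, and returning -1 for a non-blockable invalidation is just one
assumed way of modelling the "WARN and bail out" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical per-vCPU state; only the two *_addr names come from the thread. */
struct vcpu_sketch {
    uint64_t posted_intr_desc_addr;   /* GPA of the posted-interrupt descriptor */
    uint64_t virtual_apic_page_addr;  /* GPA of the virtual APIC page */
    bool reload_requested;            /* stands in for a pending vCPU request */
};

/* True if the page containing gpa overlaps the half-open range [start, end). */
static bool page_overlaps(uint64_t gpa, uint64_t start, uint64_t end)
{
    uint64_t page = gpa & ~(SKETCH_PAGE_SIZE - 1);

    return page < end && start < page + SKETCH_PAGE_SIZE;
}

/*
 * Models the notifier side: flag every vCPU whose cached pages fall in the
 * invalidated range. Returns the number of vCPUs flagged, or -1 if a match
 * was found but the caller said it must not block.
 */
static int invalidate_range_sketch(struct vcpu_sketch *vcpus, size_t n,
                                   uint64_t start, uint64_t end, bool blockable)
{
    int kicked = 0;

    assert(start < end);

    for (size_t i = 0; i < n; i++) {
        struct vcpu_sketch *v = &vcpus[i];

        if (!page_overlaps(v->posted_intr_desc_addr, start, end) &&
            !page_overlaps(v->virtual_apic_page_addr, start, end))
            continue;

        if (!blockable)
            return -1;  /* where the real thing would WARN and bail out */

        v->reload_requested = true;
        kicked++;
    }
    return kicked;
}
```

In the real thing the flagging step would be the
kvm_make_all_cpus_request() call, and the vCPU would re-resolve the two
pages before its next entry; the sketch only shows the range match and
the non-blockable bail-out.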

(And a little bit more thinking about ordering for the case of
concurrent invalidation occurring while we are entering the L2 guest,
but I think that works out OK.)

We *could* use the rwlock thing for steal time reporting, but I still
don't really think it's worth doing so. Again, if it was truly going to
be a generic mechanism that would solve lots of other problems, I'd be
up for it. But if steal time would be the *only* other user of a
generic version of the rwlock thing, that just seems like
overengineering. I'm still mostly inclined to stand by my original
observation that it has a perfectly serviceable HVA that it can use
instead.
