Re: [PATCH] KVM: x86: async_pf: check earlier if can deliver async pf

Nikita Kalyazin <kalyazin@xxxxxxxxxx> · Wed, 27 Nov 2024 10:35:17 +0000

On 26/11/2024 22:10, Sean Christopherson wrote:
On Tue, Nov 26, 2024, Nikita Kalyazin wrote:
On 26/11/2024 00:06, Sean Christopherson wrote:
On Mon, Nov 25, 2024, Nikita Kalyazin wrote:
In both cases the fault handling code is blocked and the pCPU is free for
other tasks.  I can't see the vCPU spinning on the IO to get completed if
the async task isn't created.  I tried that with and without async PF
enabled by the guest (MSR_KVM_ASYNC_PF_EN).

What am I missing?

Ah, I was wrong about the vCPU spinning.

The goal is specifically to schedule() from KVM context, i.e. from kvm_vcpu_block(),
so that if a virtual interrupt arrives for the guest, KVM can wake the vCPU and
deliver the IRQ, e.g. to reduce latency for interrupt delivery, and possible even
to let the guest schedule in a different task if the IRQ is the guest's tick.

Letting mm/ or fs/ do schedule() means the only wake event even for the vCPU task
is the completion of the I/O (or whatever the fault is waiting on).

Ok, great, then that's how I understood it last time.  The only thing that
is not entirely clear to me is like Vitaly says, KVM_ASYNC_PF_SEND_ALWAYS is
no longer set, because we don't want to inject IRQs into the guest when it's
in kernel mode, but the "host async PF" case would still allow IRQs (eg
ticks like you said).  Why is it safe to deliver them?

IRQs are fine, the problem with PV async #PF is that it directly injects a #PF,
which the kernel may not be prepared to handle.

You're right indeed, I was overfocused on IRQs for some reason.

I have no objection to disabling host async page faults,
e.g. it's probably a net>>>>> negative for 1:1 vCPU:pCPU pinned setups, but such disabling
needs an opt-in from>>>>> userspace.
Back to this, I couldn't see a significant effect of this optimisation with
the original async PF so happy to give it up, but it does make a difference
when applied to async PF user [2] in my setup.  Would a new cap be a good
way for users to express their opt-in for it?

This probably needs to be handled in the context of the async #PF user series.
If that series never lands, adding a new cap is likely a waste.  And I suspect
that even then, a capability may not be warranted (truly don't know, haven't
looked at your other series).

Yes, I meant that to be included in the async #PF user series (if 
required), not this one.  Just wanted to bring it up here, because the 
thread already had the relevant context.  Thanks.