Re: Async page fault delivered while irq are disabled?

Frederic Weisbecker <frederic@xxxxxxxxxx> · Thu, 26 Dec 2019 18:28:27 +0100

On Mon, Dec 23, 2019 at 09:38:18AM +0100, Paolo Bonzini wrote:
> On 23/12/19 03:17, Frederic Weisbecker wrote:
> > On Fri, Dec 20, 2019 at 10:34:20AM +0100, Paolo Bonzini wrote:
> >> On 19/12/19 20:00, Sean Christopherson wrote:
> >>>> And one last silly question, what about that line in
> >>>> kvm_arch_can_inject_async_page_present:
> >>>>
> >>>> 	if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
> >>>> 		return true;
> >>>>
> >>>> That looks weird, also it shortcuts the irqs_allowed() check.
> >>>
> >>> I wondered about that code as well :-).  Definitely odd, but it would
> >>> require the guest to disable async #PF after an async #PF is queued.  Best
> >>> guess is the idea is that it's the guest's problem if it disables async #PF
> >>> on the fly.
> >>>
> >>
> >> When the guest disables async #PF all outstanding page faults are
> >> cancelled by kvm_clear_async_pf_completion_queue.  However, in case they
> >> complete while in cancel_work_sync, you need to inject them even if
> >> interrupts are disabled.
> > 
> > Hmm, shouldn't the guest wait for the whole pending waitqueue in kvm_async_pf_task_wait()
> > to be serviced and woken up before actually allowing to disable async #PF ?
> > Because you can't really afford to inject those #PF while IRQs are disabled,
> > that's a big rq deadlock risk.
> 
> That's just how Linux works, and Linux doesn't ever disable async page
> faults with disabled IRQ (reboot_notifier_list is a blocking notifier).

So when I talk about the IRQs-enabled requirement, the point is to prevent the page
fault from interrupting code that may hold a lock.

Now in those cases I think we are good, as kvm_pv_guest_cpu_reboot() is called from
a generic IPI (the rq lock and others shouldn't be held at that time) and
kvm_guest_cpu_offline() is called from a thread with interrupts disabled.

Anyway, those semantics and expectations are very obscure. Those async page faults
should probably be treated as IRQs from lockdep's point of view.
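
For reference, here is a rough user-space model of the shortcut quoted at the top of
this thread, only to make the two paths explicit. It is a sketch, not the kernel code:
struct toy_vcpu, TOY_ASYNC_PF_ENABLED, guest_irqs_enabled and
toy_can_inject_page_present() are all made-up names, and the single boolean stands in
for whatever "IRQs allowed" check the real function performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the enable bit in the async #PF MSR value. */
#define TOY_ASYNC_PF_ENABLED (1ULL << 0)

/* Toy model of the per-vCPU state the check looks at (assumed layout). */
struct toy_vcpu {
	uint64_t apf_msr_val;    /* models vcpu->arch.apf.msr_val */
	bool guest_irqs_enabled; /* models the guest's IRQ state */
};

/*
 * Model of the decision discussed in this thread:
 * - async #PF disabled by the guest: always allow "page present"
 *   injection, so completions that raced with the cancellation are
 *   still delivered, even with guest IRQs off;
 * - async #PF enabled: only inject when the guest can take IRQs.
 */
bool toy_can_inject_page_present(const struct toy_vcpu *vcpu)
{
	if (!(vcpu->apf_msr_val & TOY_ASYNC_PF_ENABLED))
		return true;

	return vcpu->guest_irqs_enabled;
}

/* Small self-check covering the three interesting combinations. */
void toy_selfcheck(void)
{
	struct toy_vcpu off = { .apf_msr_val = 0, .guest_irqs_enabled = false };
	struct toy_vcpu on_irqs_off = { TOY_ASYNC_PF_ENABLED, false };
	struct toy_vcpu on_irqs_on = { TOY_ASYNC_PF_ENABLED, true };

	assert(toy_can_inject_page_present(&off));
	assert(!toy_can_inject_page_present(&on_irqs_off));
	assert(toy_can_inject_page_present(&on_irqs_on));
}
```

The first case is the one that bypasses the IRQ check, which is why it matters that the
guest only disables async #PF from contexts where no lock can be held.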