On 29/09/2017 12:34, Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 12:01:24PM +0200, Paolo Bonzini wrote:
>>> Does this mean whenever we get a page fault in a RCU read-side critical
>>> section, we may hit this?
>>>
>>> Could we simply avoid to schedule() in kvm_async_pf_task_wait() if the
>>> fault process is in a RCU read-side critical section as follow?
>>>
>>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>>> index aa60a08b65b1..291ea13b23d2 100644
>>> --- a/arch/x86/kernel/kvm.c
>>> +++ b/arch/x86/kernel/kvm.c
>>> @@ -140,7 +140,7 @@ void kvm_async_pf_task_wait(u32 token)
>>>
>>>  	n.token = token;
>>>  	n.cpu = smp_processor_id();
>>> -	n.halted = is_idle_task(current) || preempt_count() > 1;
>>> +	n.halted = is_idle_task(current) || preempt_count() > 1 || rcu_preempt_depth();
>>>  	init_swait_queue_head(&n.wq);
>>>  	hlist_add_head(&n.link, &b->list);
>>>  	raw_spin_unlock(&b->lock);
>>>
>>> (Add KVM folks and list Cced)
>>
>> Yes, that would work.  Mind to send it as a proper patch?
>
> I'm confused, why would we do an ASYNC PF at all here? Thing is, a
> printk() shouldn't trigger a major fault _ever_. At worst it triggers
> something like a vmalloc minor fault. And I'm thinking we should not do
> the whole ASYNC machinery for minor faults.

Async page faults are page faults _on the host_ side, and you cannot
control what the host pages out.  Of course the hypervisor filters out
some cases itself (e.g. IF=0) but in general you could get one at any time.

Paolo
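
For reference, the condition in the quoted diff can be modelled outside the kernel as a plain predicate. This is only a userspace sketch, not the kernel code: struct task_ctx and apf_must_halt() are hypothetical names, and the three fields merely stand in for is_idle_task(current), preempt_count() and rcu_preempt_depth(). It assumes, as the discussion above suggests, that a true result means the task must not go through schedule() while it waits for the page.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the quoted patch inspects. */
struct task_ctx {
	bool is_idle;        /* stands in for is_idle_task(current) */
	int  preempt_count;  /* stands in for preempt_count() */
	int  rcu_depth;      /* stands in for rcu_preempt_depth() */
};

/*
 * Mirrors the patched expression:
 *   is_idle_task(current) || preempt_count() > 1 || rcu_preempt_depth()
 * Returns true when the waiter must not sleep via schedule().
 */
static bool apf_must_halt(const struct task_ctx *t)
{
	return t->is_idle || t->preempt_count > 1 || t->rcu_depth != 0;
}
```

With preempt_count at 1 and no RCU read-side section open the predicate is false, so the task may sleep. Once the RCU depth is non-zero the predicate is true, which is the case the proposed patch adds.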