Re: [patch 2/2] x86/kvm: Sanitize kvm_async_pf_task_wait()

Andy Lutomirski <luto@xxxxxxxxxx> · Sat, 7 Mar 2020 11:30:33 -0800

On Sat, Mar 7, 2020 at 11:18 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Andy Lutomirski <luto@xxxxxxxxxx> writes:
> > On Sat, Mar 7, 2020 at 7:10 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >> On Sat, Mar 7, 2020 at 2:01 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> >
> >> > Andy Lutomirski <luto@xxxxxxxxxx> writes:
> >
> >> Now I'm confused again.  Your patch is very careful not to schedule if
> >> we're in an RCU read-side critical section, but the regular preemption
> >> code (preempt_schedule_irq, etc) seems to be willing to schedule
> >> inside an RCU read-side critical section.  Why is the latter okay but
> >> not the async pf case?
> >
> > I read more docs.  I guess the relevant situation is
> > CONFIG_PREEMPT_CPU, in which case it is legal to preempt an RCU
> > read-side critical section and obviously legal to put the whole CPU to
> > sleep, but it's illegal to explicitly block in an RCU read-side
> > critical section.  So I have a question for Paul: is it, in fact,
> > entirely illegal to block or merely illegal to block for an
> > excessively long time, e.g. waiting for user space or network traffic?
>
> Two issues here:
>
>     - excessive blocking time

We can't do anything about this.  We are blocked until the host says
otherwise, and the critical section cannot end until the host lets it
end.

>
>     - entering idle with an RCU read side critical section blocking

We could surely make this work.  I'm not at all convinced it's
worthwhile, though.

>
> >  In this situation, we cannot make progress until the host says we
> > can, so we are, in effect, blocking until the host tells us to stop
> > blocking.  Regardless, I agree that turning IRQs on is reasonable, and
> > allowing those IRQs to preempt us is reasonable.
> >
> > As it stands in your patch, the situation is rather odd: we'll run
> > another task if that task *preempts* us (e.g. we block long enough to
> > run out of our time slice), but we won't run another task if we aren't
> > preempted.  This seems bizarre.
>
> Yes, it looks odd. We could do:
>
>         preempt_disable();
>         while (!page_arrived()) {
>                 if (preempt_count() == 1 && this_cpu_runnable_tasks() > 1) {
>                         set_need_resched();
>                         schedule_preempt_disabled();

The downside here is that the scheduler may immediately reschedule us,
thus accomplishing nothing whatsoever.

>                 } else {
>                         native_safe_halt();
>                         local_irq_disable();
>                 }
>         }
>         preempt_enable();
>
> Don't know if it's worth the trouble. But that's not the problem :)

I suspect that we should either declare it entirely not worth the
trouble and do it like in your patch or we should teach preempt-rcu to
handle the special case of going idle while in a read-side critical
section.  For all I know, the latter is trivial, but it could easily
be a total disaster.  Paul?