Andy Lutomirski <luto@xxxxxxxxxx> writes:
> On Sat, Mar 7, 2020 at 7:10 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>> On Sat, Mar 7, 2020 at 2:01 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> >
>> > Andy Lutomirski <luto@xxxxxxxxxx> writes:
>
>> Now I'm confused again. Your patch is very careful not to schedule if
>> we're in an RCU read-side critical section, but the regular preemption
>> code (preempt_schedule_irq, etc) seems to be willing to schedule
>> inside an RCU read-side critical section. Why is the latter okay but
>> not the async pf case?
>
> I read more docs. I guess the relevant situation is
> CONFIG_PREEMPT_CPU, in which case it is legal to preempt an RCU
> read-side critical section and obviously legal to put the whole CPU to
> sleep, but it's illegal to explicitly block in an RCU read-side
> critical section. So I have a question for Paul: is it, in fact,
> entirely illegal to block or merely illegal to block for an
> excessively long time, e.g. waiting for user space or network traffic?

Two issues here:

    - excessive blocking time

    - entering idle with an RCU read side critical section blocking

> In this situation, we cannot make progress until the host says we
> can, so we are, in effect, blocking until the host tells us to stop
> blocking. Regardless, I agree that turning IRQs on is reasonable, and
> allowing those IRQs to preempt us is reasonable.
>
> As it stands in your patch, the situation is rather odd: we'll run
> another task if that task *preempts* us (e.g. we block long enough to
> run out of our time slice), but we won't run another task if we aren't
> preempted. This seems bizarre.

Yes, it looks odd. We could do:

	preempt_disable();
	while (!page_arrived()) {
		if (preempt_count() == 1 && this_cpu_runnable_tasks() > 1) {
			set_need_resched();
			schedule_preempt_disabled();
		} else {
			native_safe_halt();
			local_irq_disable();
		}
	}
	preempt_enable();

Don't know if it's worth the trouble.
But that's not the problem :)

> I think this issue still stands and is actually a fairly easy race to hit.
>
> STI
> IRQ happens and we get preempted
> another task runs and gets the #PF "async pf wakeup" event
> reschedule, back to original task
> HLT

See the other mail about STI :)

Thanks,

        tglx