On Tue, 21 Dec 2021 17:22:34 +0000
Valentin Schneider <valentin.schneider@xxxxxxx> wrote:

> On 21/12/21 16:45, John Keeping wrote:
> > On Tue, 21 Dec 2021 16:11:34 +0000
> > Valentin Schneider <valentin.schneider@xxxxxxx> wrote:
> >
> >> On 20/12/21 18:35, Dietmar Eggemann wrote:
> >> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> >> > index ef8228d19382..798887f1eeff 100644
> >> > --- a/kernel/sched/rt.c
> >> > +++ b/kernel/sched/rt.c
> >> > @@ -1895,9 +1895,17 @@ static int push_rt_task(struct rq *rq, bool pull)
> >> >  	struct task_struct *push_task = NULL;
> >> >  	int cpu;
> >> >
> >> > +	if (WARN_ON_ONCE(!rt_task(rq->curr))) {
> >> > +		printk("next_task=[%s %d] rq->curr=[%s %d]\n",
> >> > +			next_task->comm, next_task->pid, rq->curr->comm, rq->curr->pid);
> >> > +	}
> >> > +
> >> >  	if (!pull || rq->push_busy)
> >> >  		return 0;
> >> >
> >> > +	if (!rt_task(rq->curr))
> >> > +		return 0;
> >> > +
> >>
> >> If current is a DL/stopper task, why not; if that's CFS (which IIUC is your
> >> case), that's buggered: we shouldn't be trying to pull RT tasks when we
> >> have queued RT tasks and a less-than-RT current, we should be rescheduling
> >> right now.
> >>
> >> I'm thinking this can happen via rt_mutex_setprio() when we demote an RT-boosted
> >> CFS task (or straight up sched_setscheduler()):
> >> check_class_changed()->switched_from_rt() doesn't trigger a resched_curr(),
> >> so I suspect we get to the push/pull callback before getting a
> >> resched (I actually don't see where we'd get a resched in that case other
> >> than at the next tick).
> >>
> >> IOW, feels like we want the below. Unfortunately I can't reproduce the
> >> issue locally (yet), so that's untested.
> >
> > This patch doesn't make any difference for me - I hit the BUG on the
> > first boot with this applied.
> >
>
> Thanks for the swift testing!
>
> Did you give Dietmar's patch a try? ITSM it lacks a resched_curr(), but if
> we can somehow get to the push IRQ work before rescheduling (which I think
> might happen if we try to resched_curr(this_rq)), then we need his
> bailout.

With Dietmar's patch I hit the added WARN_ON_ONCE with:

	next_task=[rcu_preempt 11] rq->curr=[ksoftirqd/1 21]
	next_task=[rcu_preempt 11] rq->curr=[ksoftirqd/1 21]

	# ps -eTo comm,pid,lwp,pri,rtprio,nice,class
	...
	rcu_preempt        11    11  41      1   - FF
	...
	ksoftirqd/1        21    21  19      -   0 TS

Out of three reproductions, rcu_preempt as next_task is consistent, but
I've seen three different tasks in rq->curr (although all with
SCHED_OTHER).

And as expected, the added early return does stop the BUG.
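
For reference, a userspace sketch of why the new check catches this
case. It is not kernel code: struct task, prio_from_rtprio(),
prio_from_nice() and push_should_bail() are made-up names for
illustration. Only the rt_task() comparison and the priority arithmetic
(MAX_RT_PRIO = 100, DEFAULT_PRIO = 120) are meant to mirror the kernel,
and the model ignores everything else push_rt_task() does.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RT_PRIO	100	/* same value as the kernel constant */
#define DEFAULT_PRIO	120	/* kernel prio of a nice-0 SCHED_OTHER task */

/* Toy stand-in for task_struct; only the fields needed here. */
struct task {
	const char *comm;
	int pid;
	int prio;	/* kernel-internal priority, lower value wins */
};

/* Hypothetical helper: kernel prio for a SCHED_FIFO/RR task. */
static int prio_from_rtprio(int rtprio)
{
	assert(rtprio >= 1 && rtprio <= 99);
	return MAX_RT_PRIO - 1 - rtprio;
}

/* Hypothetical helper: kernel prio for a SCHED_OTHER task. */
static int prio_from_nice(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return DEFAULT_PRIO + nice;
}

/* Same comparison as the kernel's rt_task(): prio below MAX_RT_PRIO. */
static bool rt_task(const struct task *p)
{
	return p->prio < MAX_RT_PRIO;
}

/* Models only the added bailout: give up if rq->curr is not RT. */
static bool push_should_bail(const struct task *curr)
{
	return !rt_task(curr);
}
```

With the values from the ps output above, rcu_preempt (rtprio 1) maps
to prio 98 and passes rt_task(), while ksoftirqd/1 (nice 0) maps to
prio 120 and does not, so the early return is taken.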