On 27/01/2022 16:40, Valentin Schneider wrote: > John reported that push_rt_task() can end up invoking > find_lowest_rq(rq->curr) when curr is not an RT task (in this case a CFS > one), which causes mayhem down convert_prio(). > > This can happen when current gets demoted to e.g. CFS when releasing an > rt_mutex, and the local CPU gets hit with an rto_push_work irqwork before > getting the chance to reschedule. Exactly who triggers this work isn't > entirely clear to me - switched_from_rt() only invokes rt_queue_pull_task() > if there are no RT tasks on the local RQ, which means the local CPU can't > be in the rto_mask. > > My current suspected sequence is something along the lines of the below, > with the demoted task being current. > > mark_wakeup_next_waiter() > rt_mutex_adjust_prio() > rt_mutex_setprio() // deboost originally-CFS task > check_class_changed() > switched_from_rt() // Only rt_queue_pull_task() if !rq->rt.rt_nr_running > switched_to_fair() // Sets need_resched > __balance_callbacks() // if pull_rt_task(), tell_cpu_to_push() can't select local CPU per the above > raw_spin_rq_unlock(rq) > > // need_resched is set, so task_woken_rt() can't > // invoke push_rt_tasks(). Best I can come up with is > // local CPU has rt_nr_migratory >= 2 after the demotion, so stays > // in the rto_mask, and then: > > <some other CPU running rto_push_irq_work_func() queues rto_push_work on this CPU> > push_rt_task() > // breakage follows here as rq->curr is CFS > > Move an existing check to check rq->curr vs the next pushable task's > priority before getting anywhere near find_lowest_rq(). While at it, add an > explicit sched_class of rq->curr check prior to invoking > find_lowest_rq(rq->curr). Align the DL logic to also reschedule regardless > of next_task's migratability. > > Link: http://lore.kernel.org/r/Yb3vXx3DcqVOi+EA@donbot > Fixes: a7c81556ec4d ("sched: Fix migrate_disable() vs rt/dl balancing") > Reported-by: John Keeping <john@xxxxxxxxxxxx> > Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx> > Tested-by: John Keeping <john@xxxxxxxxxxxx> > --- > v1 -> v2: Reworded comments, added DL part (Dietmar) > --- LGTM. Rescheduling in case rq->curr has lower prio (including CFS tasks) than next_task and bailing out in case rq->curr is DL or stop-task prevents the bug from happening. The only small issue is the fact that, unlike in push_rt_task(), the DL logic only compares DL tasks (if (dl_task(rq->curr) ...), so you miss rescheduling when rq->curr is a lower priority non-DL task. Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx> [...]