Hi Jeroen,

On 2020-07-10 6:50 p.m., Jeroen Roovers wrote:
> On Thu, 9 Jul 2020 09:39:33 -0400
> John David Anglin <dave.anglin@xxxxxxxx> wrote:
>
>> On 2020-07-09 9:26 a.m., Rolf Eike Beer wrote:
>>> On Friday, 3 July 2020, 22:32:35 CEST, John David Anglin wrote:
>>>> Stalls are quite frequent with recent kernels. When the stall is
>>>> detected by rcu_sched, we get a backtrace similar to the
>>>> following:
>>> With this patch on top of 5.7.7 I still get:
>> Suggest enabling CONFIG_LOCKUP_DETECTOR=y and
>> CONFIG_SOFTLOCKUP_DETECTOR=y so we can see where the stall occurs.
>>
>> Dave
>>
> Attached is kernel output while running the futex_requeue_pi test from
> the kernel selftests. It failed this way on the second try while it
> passed on the first try. The output it gave is with the kernel
> configuration options as set out above.

Unfortunately, the soft lockup detector didn't trigger in the output you
attached, so it's not clear where the futex_requeue_pi test is stuck.

There are no spinlocks in check_preempt_curr() that I can see:

void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
{
        const struct sched_class *class;

        if (p->sched_class == rq->curr->sched_class) {
                rq->curr->sched_class->check_preempt_curr(rq, p, flags);
        } else {
                for_each_class(class) {
                        if (class == rq->curr->sched_class)
                                break;
                        if (class == p->sched_class) {
                                resched_curr(rq);
                                break;
                        }
                }
        }

        /*
         * A queue event has occurred, and we're going to schedule. In
         * this case, we can save a useless back to back clock update.
         */
        if (task_on_rq_queued(rq->curr) && test_tsk_need_resched(rq->curr))
                rq_clock_skip_update(rq);
}

There's one loop in the above code, the for_each_class() walk.

I have CONFIG_PREEMPT_NONE=y in my kernel builds.
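For what it's worth, that loop should be strictly bounded. A minimal sketch
of the relevant definitions, quoted from memory from kernel/sched/sched.h as
of 5.7 (please double-check against your tree):

/* As I read it in v5.7; the exact definitions may differ in other trees. */
#ifdef CONFIG_SMP
#define sched_class_highest (&stop_sched_class)
#else
#define sched_class_highest (&dl_sched_class)
#endif

/* Walk the scheduler class list from highest to lowest priority. */
#define for_each_class(class) \
        for (class = sched_class_highest; class; class = class->next)

Unless that list were somehow corrupted, the loop terminates after at most
five classes (stop, dl, rt, fair, idle), so I don't see how
check_preempt_curr() itself could spin.

Regards,
Dave

-- 
John David Anglin  dave.anglin@xxxxxxxx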