On 2021-11-10 11:45:57 [+0100], Uwe Kleine-König wrote:
> Hello,

Hi Uwe,

> recently I debugged a problem on an -rt enabled kernel. The relevant
> part of the analysed trace looks as follows:
>
> napi/can0-10-360 [001] d...312 3565.642595: sched_pi_setprio: comm=candump pid=2182 oldprio=120 newprio=14
> napi/can0-10-360 [001] d...212 3565.642619: sched_switch: prev_comm=napi/can0-10 prev_pid=360 prev_prio=14 prev_state=R ==> next_comm=cantest next_pid=915 next_prio=39
> ....
> rcuc/0-15 [000] d...212 3565.642633: sched_switch: prev_comm=rcuc/0 prev_pid=15 prev_prio=98 prev_state=R+ ==> next_comm=candump next_pid=2182 next_prio=14
> candump-2182 [000] d...3.. 3565.642646: sched_pi_setprio: comm=candump pid=2182 oldprio=14 newprio=120
>
> So the napi/can0-10 wants to grab a mutex that candump is holding. So
> candump's priority is bumped from 120 to 14.
>
> However the napi/can0-10 process (and a few others) are pinned to cpu #1
> and cantest isn't allowed to run on that one. And so cpu #1 schedules a
> lower prio task while candump still has to wait a moment before being
> scheduled on cpu #0.
>
> I wonder if it would be sensible in such a case not only to increase the
> importance of candump, but also to allow it to run on the cpu-set the
> boosting process is allowed to run on until it releases the mutex.
>
> Would that make sense?

Sounds like your problem could be solved by allowing candump to run on
any CPU. Why not lift that restriction yourself?

From the trace, you have migration disabled for napi/can0, so that one
can't be moved. If the lock that is owned by candump is a spinlock_t,
then candump also has migration disabled and can't be moved either.

> Best regards
> Uwe

Sebastian
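
For illustration only, lifting the affinity restriction from userspace could look roughly like the sketch below. It is not taken from the thread: the helper name allow_all_cpus is hypothetical, it assumes a Linux system, and it assumes CPU ids are numbered contiguously from 0. It uses Python's os.sched_setaffinity() wrapper around the sched_setaffinity(2) syscall.

```python
import os

def allow_all_cpus(pid=0):
    """Widen the affinity mask of `pid` (0 means the calling thread).

    Hypothetical helper, Linux only. Assumes CPU ids run from 0 to
    os.cpu_count() - 1; CPUs that a cpuset forbids are dropped by the
    kernel when the mask is applied.
    """
    requested = range(os.cpu_count() or 1)
    os.sched_setaffinity(pid, requested)
    # Read the mask back: this is what the kernel actually granted.
    return os.sched_getaffinity(pid)

if __name__ == "__main__":
    # For another task, pass its pid instead (e.g. the candump pid from
    # the trace); that needs permission to change that task's affinity.
    print(sorted(allow_all_cpus(0)))
```

This only changes where the scheduler may place the task. It does not help while the task has migration disabled, as noted above.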