On 2021-11-12 19:35:04 [+0100], Uwe Kleine-König wrote:
> Hello Sebastian,

Hi Uwe,

> > Sounds like your problem could be solved by allowing candump to run on
> > any CPU. Why not lift that restriction yourself?
>
> I want to give two answers here:
>
>  - There are some realtime requirements on this machine. To get
>    the latency of the relevant userspace application down, cpu #1 is
>    isolated and only runs the application (here in this test "cantest"),
>    the can napi thread and the can irq thread.
>    Do you want to suggest that this isn't a good idea?

I don't know the whole setup but isolating CPUs doesn't sound bad. Also
the part where CPU0 had a task with "lower" priority was brief.

>  - Consider three processes A, B and C with increasing priorities (so C
>    is the most important). If A holds a lock that C wants to grab, the
>    kernel today already ensures (in the absence of cpu restrictions)
>    that A is scheduled before B.
>    In the presence of cpu restrictions this fails as this case shows:
>    B and C are pinned to cpu #1, A must not run on cpu #1. Then it can
>    happen that C waits for A but cpu #1 schedules B even though C being
>    blocked should be more important than to run B and so A should be
>    run.
>    So I think while you are right that I could just allow candump to run
>    on cpu #1 here, this is a corner case where the priority inversion
>    handling isn't doing the right thing.

Correct. Today we can disable migration after a lock has been acquired.
"Earlier" we needed to disable migration first and then acquire the lock.
So it was possible that we pinned the task to a CPU which was idle at the
time, then acquired the lock, and then couldn't run because a task with
higher priority was blocking us. So things improved here ;)

Even if we ignore the kernel-related constraints, it is not obvious
whether the user wants to violate the CPU restrictions. In terms of
priority inheritance this is well defined. In terms of hyperthreading or
big/little you may not want a migration.
Also, in terms of a kernel thread this might not be wanted; sometimes
these are pinned for a reason.

> > From the trace, you have migration disabled for napi/can0. So that one
> > can't be moved. If the lock, that is owned by candump, is a spinlock_t
> > than candump has also migration disabled and can't be moved either.
>
> Ah, I didn't know that holding a spinlock implies disabled migration. In
> the !RT case this is obvious, with RT not so?! Then boosting not only
> the priority but also the set of cpus the process can run on isn't as
> effective as I expected.

This is unfortunate but we disable CPU migration while obtaining
spinlock_t / rwlock_t for the same reason as upstream does. Dropping this
part would break existing kernel code. You could try to drop that and
check how long it takes until the kernel falls apart ;)
Nobody wanted to lift that restriction and handle things differently.
This led to the part where migrate_disable() is also implemented for !RT
in today's kernel and used by the highmem code, for example.

> Best regards and thanks for your response,
> Uwe

Sebastian
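
For illustration only, the A/B/C scenario from the quoted text can be
written down as a toy model. This is a sketch, not kernel code: the names
Task, effective_prio(), pick() and scenario() are made up here, the
priorities 1/2/3 are arbitrary, and pick() only answers "which task would
this CPU choose", it does not simulate a real scheduler. Priority
inheritance is modelled by letting a lock owner take the highest priority
among the tasks waiting for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    prio: int                            # higher number = more important
    cpus: frozenset                      # CPUs the task is allowed to run on
    waits_for: Optional["Task"] = None   # lock owner this task is blocked on


def effective_prio(task, tasks):
    """Own priority, boosted to the highest priority of any (transitive) waiter."""
    prio = task.prio
    for other in tasks:
        if other.waits_for is task:
            prio = max(prio, effective_prio(other, tasks))
    return prio


def pick(cpu, tasks):
    """Return the task this CPU would choose, or None if nothing may run here."""
    runnable = [t for t in tasks if t.waits_for is None and cpu in t.cpus]
    return max(runnable, key=lambda t: effective_prio(t, tasks), default=None)


def scenario(a_cpus):
    """B and C are pinned to CPU 1, C waits for a lock held by A."""
    a = Task("A", 1, frozenset(a_cpus))
    b = Task("B", 2, frozenset({1}))
    c = Task("C", 3, frozenset({1}), waits_for=a)
    tasks = [a, b, c]
    result = {}
    for cpu in (0, 1):
        chosen = pick(cpu, tasks)
        result[cpu] = chosen.name if chosen else None
    return result


if __name__ == "__main__":
    print("A may run anywhere:", scenario({0, 1}))
    print("A excluded from CPU 1:", scenario({0}))
```

With A allowed on both CPUs, the boosted A wins over B on CPU 1. With A
excluded from CPU 1, the boost still applies but CPU 1 can only choose B,
so C's progress depends entirely on what else runs on CPU 0.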