Hi, On Fri, Oct 6, 2023 at 3:12 AM Christian König <christian.koenig@xxxxxxx> wrote: > When the operation busy waits then that *should* get accounted to the > CPU time of the current process. When the operation sleeps and waits for > some interrupt for example it should not get accounted. > What you suggest is to put the parts of the operation which busy wait > into a background task and then sleep for that task to complete. This is > not a solution to the problem, but just hides it. Actually, I think we both probably agree that there shouldn't be long busy waits in the context of the current process. After all, we both agree what the AMD DC driver code is doing is wrong. To be clear, my take is, if driver code is running in process context and needs to wait for periods of time on the order of or in excess of a typical process time slice it should be sleeping during the waiting. If the operation is at a point where it can be cancelled without side effects, the sleeping should be INTERRUPTIBLE. If it's past the point of no return, sleeping should be UNINTERRUPTIBLE. At no point, in my opinion, should kernel code busy block a typical process for dozens of milliseconds while keeping the process RUNNING. I don't think this is a controversial take. Actually, I think (maybe?) you might even agree with that, but you're also saying: user space processes aren't special here. While it's not okay to busy block them, it's also not okay to busy block on the system unbound workqueue either. If that's your sentiment, I don't disagree with it. So I think we both agree the busy waiting is a problem, but maybe we disagree on the best place for the problem to manifest when it happens. One thought re the DC code is regardless of where the code is running, the scheduler is going to forcefully preempt it at some point right? Any writereg/udelay(1)/readreg loop is going to get disrupted by a much bigger than 1us delay by the kernel if the loop goes on long enough. I'm not wrong about that? if that's true, the code might as well switch out the udelay(1) for a usleep(1) and call it a day (well modulo the fact I think it can be called from an interrupt handler; at least "git grep" says there's a link_set_dpms_off in link_dp_irq_handler.c) > Stuff like that is not a valid justification for the change. Ville > changes on the other hand tried to prevent lock contention which is a > valid goal here. Okay so let's land his patchset! (assuming it's ready to go in). Ville, is that something you'd want to resend for review? Note, a point that I don't think has been brought up yet, too, is the system unbound workqueue doesn't run with real time priority. Given the lion's share of mutter's drmModeAtomicCommit calls are nonblock, and so are using the system unbound workqueue for handling the commits, even without this patch, that somewhat diminishes the utility of using a realtime thread anyway. I believe the original point of the realtime thread was to make sure mouse cursor motion updates are as responsive as possible, because any latency there is something the user really feels. Maybe there needs to be a different mechanism in place to make sure user perceived interactivity is given priority. --Ray