On Mon, Oct 09, 2023 at 08:42:24AM +0200, Christian König wrote: > Am 06.10.23 um 20:48 schrieb Ray Strode: > > Hi, > > > > On Fri, Oct 6, 2023 at 3:12 AM Christian König <christian.koenig@xxxxxxx> wrote: > >> When the operation busy waits then that *should* get accounted to the > >> CPU time of the current process. When the operation sleeps and waits for > >> some interrupt for example it should not get accounted. > >> What you suggest is to put the parts of the operation which busy wait > >> into a background task and then sleep for that task to complete. This is > >> not a solution to the problem, but just hides it. > > Actually, I think we both probably agree that there shouldn't be long > > busy waits in the context of the current process. After all, we both > > agree what the AMD DC driver code is doing is wrong. > > > > To be clear, my take is, if driver code is running in process context > > and needs to wait for periods of time on the order of or in excess of > > a typical process time slice it should be sleeping during the waiting. > > If the operation is at a point where it can be cancelled without side > > effects, the sleeping should be INTERRUPTIBLE. If it's past the point > > of no return, sleeping should be UNINTERRUPTIBLE. At no point, in my > > opinion, should kernel code busy block a typical process for dozens of > > milliseconds while keeping the process RUNNING. I don't think this is > > a controversial take. > > Exactly that's what I completely disagree on. > > When the driver is burning CPU cycles on behalves of a process then > those CPU cycles should be accounted to the process causing this. > > That the driver should probably improve it's behavior is a different issue. > > > Actually, I think (maybe?) you might even agree with that, but you're > > also saying: user space processes aren't special here. While it's not > > okay to busy block them, it's also not okay to busy block on the > > system unbound workqueue either. If that's your sentiment, I don't > > disagree with it. > > No, it's absolutely ok to busy block them it's just not nice to do so. > > As Daniel pointed out this behavior is not incorrect at all. The DRM > subsystem doesn't make any guarantee that drmModeAtomicCommit() will not > burn CPU cycles. > > > > > So I think we both agree the busy waiting is a problem, but maybe we > > disagree on the best place for the problem to manifest when it > > happens. > > > > One thought re the DC code is regardless of where the code is running, > > the scheduler is going to forcefully preempt it at some point right? > > Any writereg/udelay(1)/readreg loop is going to get disrupted by a > > much bigger than 1us delay by the kernel if the loop goes on long > > enough. I'm not wrong about that? if that's true, the code might as > > well switch out the udelay(1) for a usleep(1) and call it a day (well > > modulo the fact I think it can be called from an interrupt handler; at > > least "git grep" says there's a link_set_dpms_off in > > link_dp_irq_handler.c) > > > >> Stuff like that is not a valid justification for the change. Ville > >> changes on the other hand tried to prevent lock contention which is a > >> valid goal here. > > Okay so let's land his patchset! (assuming it's ready to go in). > > Ville, is that something you'd want to resend for review? > > Well, while Ville patch has at least some justification I would still > strongly object to move the work into a background thread to prevent > userspace from being accounted for the work it causes. Aren't most wayland compositors using nonblocking commits anyway? If so they would already be bypassing proper CPU time accounting. Not saying we shouldn't try to fix that, but just pointing out that it already is an issue with nonblocking commits. -- Ville Syrjälä Intel