Hi, On Mon, Oct 09, 2023 at 02:36:17PM +0200, Christian König wrote: > > > > To be clear, my take is, if driver code is running in process context > > > > and needs to wait for periods of time on the order of or in excess of > > > > a typical process time slice it should be sleeping during the waiting. > > > > If the operation is at a point where it can be cancelled without side > > > > effects, the sleeping should be INTERRUPTIBLE. If it's past the point > > > > of no return, sleeping should be UNINTERRUPTIBLE. At no point, in my > > > > opinion, should kernel code busy block a typical process for dozens of > > > > milliseconds while keeping the process RUNNING. I don't think this is > > > > a controversial take. > > > Exactly that's what I completely disagree on. Okay if we can't agree that it's not okay for user space (or the kernel running in the context of user space) to busy loop a cpu core at 100% utilization throughout and beyond the process's entire scheduled time slice then we really are at an impasse. I gotta say i'm astonished that this seemingly indefensible behavior is somehow a point of contention, but I'm not going to keep arguing about it beyond this email. I mean we're not talking about scientific computing, or code compilation, or seti@home. We're talking about nearly the equivalent of `while (1) __asm__ ("nop");` > > The key point here is that the patch puts the work into the background just > > to avoid that it is accounted to the thread issuing it, and that in turn is > > not valid as far as I can see. > > Yeah it's that aspect I'm really worried about, because we essentially > start to support some gurantees that a) most drivers can't uphold without > a huge amount of work, some of the DC state recomputations are _really_ > expensive b) without actually making the semantics clear, it's just > duct-tape. If DC plane state computation (or whatever) is really taking 50ms or 200ms, then it probably should be done in chunks so it doesn't get preempted at an inopportune point? Look, this is not my wheelhouse, this is your wheelhouse, and I don't want to keep debating forever. It seems there is a discrepancy between our understandings of implied acceptable behavior. > Yes compositors want to run kms in real-time, and yes that results in fun > if you try to strictly account for cpu time spent. Especially if your > policy is to just nuke the real time thread instead of demoting it to > SCHED_NORMAL for a time. So I ended up going with this suggestion for blocking modesets: https://gitlab.gnome.org/GNOME/mutter/-/commit/5d3e31a49968fc0da04e98c0f9d624ea5095c9e0 But *this* feels like duct tape: You've already said there's no guarantee the problem won't also happen during preliminary computation during non-blocking commits or via other drm entry points. So it really does seem like a fix that won't age well. I won't be surprised if in ~3 years (or whatever) in some RHEL release there's a customer bug leading to the real-time thread getting blocklisted for obscure server display hardware because it's causing the session to tank on a production machine. > I think if we want more than hacks here we need to answer two questions: > - which parts of the kms api are real time > - what exactly do we guarantee with that imo, this isn't just about real-time versus non-real-time. It's no more acceptable for non-real-time mutter to be using 100% CPU doing busywaits than it is for real-time mutter to be using 100% cpu doing busywaits. Also, both you and Christian have suggested using the non-blocking modeset api with a fence fd to poll on is equivalent to the blocking api flushing the commit_tail work before returning from the ioctl, but that's not actually true. I think we all now agree the EBUSY problem you mentioned as an issue with my proposed patch wasn't actually a problem for blocking commits, but that very same issue is a problem with the non-blocking commits that then block on a fence fd, right? I guess we'd need to block on a fence fd from the prior non-blocking commit first before starting the blocking commit (or something) --Ray