Hi,

On Thu, Sep 28, 2023 at 2:56 AM Christian König
<christian.koenig@xxxxxxx> wrote:
> > To say the "whole point" is about CPU overhead accounting sounds
> > rather absurd to me. Is that really what you meant?
>
> Yes, absolutely. See the functionality you try to implement already exists.

You say lower in this same message that you don't believe the
functionality actually works for the dpms off case I mentioned.

> After making a non blocking commit userspace can still wait on the
> commit to finish by looking at the out fence.

fences, not fence, fences. drmModeAtomicCommit works on multiple
objects at the same time. To follow the spirit of such an api, we
would need a separate fd allocated for each crtc and would have to
wait on all of them. Surely you can see how that is much less
straightforward than using a blocking api. But mutter already uses
non-blocking apis for the lion's share of cases. It doesn't need
fences for those cases, though, because it can just use page flip
events. The main cases where it uses blocking apis are modesets and
dpms off. The latter case you said you don't think can use fences, and
it certainly can't use page flip events. So if you're right that
fences can't be used for the dpms off case, it's not a workable
answer. If you're wrong, and fences can be used for the dpms off case,
then it's a messy answer.

> A blocking system call in the sense of RLIMIT_RTTIME means something
> which results in the process listening for external events, e.g. calling
> select(), epoll() or read() on (for example) a network socket etc...
>
> As far as I can see drmAtomicCommit() is *not* meant with that what
> similar to for example yield() also doesn't reset the RLIMIT_RTTIME counter.

No no no, drmModeAtomicCommit() is not like sched_yield(). That's a
really strange thing to say (you do mean sched_yield() right?).
sched_yield() is an oddball because it's specifically for giving other
threads a turn if they need it without causing the current thread to
sleep if they don't. It's a niche api that's meant for high
performance use cases. It's a way to reduce scheduling latency and
increase running time predictability. drmModeAtomicCommit() using up
rt time, busy looping while waiting on the hardware to respond, eating
into userspace RLIMIT_RTTIME is nothing like that.

I'm getting the idea that you think there is some big bucket of kernel
syscalls that block for a large fraction of a second by design and are
not meant to reset RLIMIT_RTTIME. I could be wrong, but I don't think
that's true. Off the top of my head, the only ones I can think of that
might reasonably do that are futex() (which obviously can't sleep),
sched_yield() (whose whole point is to not sleep), and maybe some
obscure ioctls (some probably legitimately, some probably
illegitimately and no one has noticed). I'm willing to be proven wrong
here, and I might be, but right now from thinking about it, my guess
is the above list is pretty close to complete.

> Well you are breaking the RLIMIT_RTTIME functionality.
>
> The purpose of that functionality is to allow debugging and monitoring
> applications to make sure that they keep alive and can react to external
> signals.

I don't think you really thought through what you're saying here. It
just flatly doesn't apply for drmModeAtomicCommit. What is an
application supposed to do? It can't block the SIGKILL that's coming.
Respond to the preceding SIGXCPUs? What response could the application
possibly make? I'm guessing drmModeAtomicCommit isn't going to EINTR
because it's busy looping waiting on hardware in the process context.
And the kernel doesn't even guarantee SIGXCPU is going to go to the
thread with the stuck syscall, so even if there were a legitimate
response (or even "pthread_cancel" or some reckless nonsense like
that), getting the notification to the right part of the program is an
exercise in gymnastics.

> From the RLIMIT_RTTIME documentation: "The intended use of this limit
> is to stop a runaway real-time process from locking up the system."
>
> And when drmAtomicCommit() is triggering this then we either have a
> problem with the application doing something it is not supposed to do
> (like blocking for vblank while it should listen to network traffic) or
> the driver is somehow buggy.

drmModeAtomicCommit() is used by display servers. If
drmModeAtomicCommit runs away in e.g. a set of 100ms busy loops
responding to a confused or slow-responding GPU, the system will
seemingly lock up to the user. That is an intractable problem that we
cannot get away from. It doesn't matter if the kernel is stuck in
process context or on a workqueue. And, regardless, it's not
reasonable to expect userspace to craft elaborate workarounds for
driver bugs. We just have to fix the bugs.

> No when you disable everything there is of course no fence allocated.

Okay, so it's not actually an answer.

> But then you also should never see anything waiting for to long to
> actually be able to trigger the RLIMIT_RTTIME.

But we do. That's the problem. That's like the whole problem. The
amdgpu driver thinks it's okay to do something like:

    for_each_command_in_queue(&queue, command) {
            execute_command(command);
            while (1) {
                    wait_for_response();
                    if (delay++ > 100000)
                            break;
                    udelay(1);
            }
    }

or something like that, all while keeping the process in the RUNNABLE
state. It just seems wrong to me. At least let the userspace process
get scheduled out.

> > Regardless, this seems like a roundabout way to address a problem that
> > we could just ... fix.
>
> Well to make it clear: This is not a problem but intended behavior!

I'm going to be frank: I don't think this was intended behavior. We
can wait for sima to get off PTO and chime in, to know one way or the
other (or maybe airlied can chime in with his take?), but I doubt he
was thinking about realtime scheduling minutiae when he put together
the drmModeAtomicCommit implementation. There's no practical reason
for doing things the way they're being done as far as I can tell.

--Ray
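
For illustration, the difference between a wait that keeps the thread
runnable and a wait that lets it be scheduled out can be measured from
userspace. The sketch below is not kernel or amdgpu code, and the
function names busy_wait_cpu_ns() and sleep_wait_cpu_ns() are
hypothetical. It assumes Linux with glibc, and it only compares how
much CPU time the calling thread is charged for the same wall-clock
wait. It does not itself exercise RLIMIT_RTTIME, which needs a
realtime scheduling policy to take effect.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Current value of the given clock, in nanoseconds. */
static long long now_ns(clockid_t clk)
{
    struct timespec ts;

    clock_gettime(clk, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical stand-in for "poll until the hardware answers": spin
 * until wall_ns of wall-clock time has passed. The thread never
 * sleeps, so the whole wait is charged to it as CPU time.
 * Returns the CPU time consumed by the calling thread.
 */
long long busy_wait_cpu_ns(long long wall_ns)
{
    assert(wall_ns >= 0);

    long long cpu_start = now_ns(CLOCK_THREAD_CPUTIME_ID);
    long long wall_start = now_ns(CLOCK_MONOTONIC);

    while (now_ns(CLOCK_MONOTONIC) - wall_start < wall_ns)
        ; /* spin */

    return now_ns(CLOCK_THREAD_CPUTIME_ID) - cpu_start;
}

/*
 * Same wall-clock wait, but the thread sleeps and is scheduled out,
 * so almost no CPU time is charged to it.
 * Returns the CPU time consumed by the calling thread.
 */
long long sleep_wait_cpu_ns(long long wall_ns)
{
    assert(wall_ns >= 0);

    long long cpu_start = now_ns(CLOCK_THREAD_CPUTIME_ID);
    struct timespec req = {
        .tv_sec = wall_ns / 1000000000LL,
        .tv_nsec = wall_ns % 1000000000LL,
    };

    nanosleep(&req, NULL); /* may return early if a signal arrives */

    return now_ns(CLOCK_THREAD_CPUTIME_ID) - cpu_start;
}
```

Calling both with the same argument, e.g. 50000000LL for 50ms, should
show the spinning variant charged roughly the full wait and the
sleeping variant charged close to nothing. That charged time is what
counts against a realtime thread's budget.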