Am 28.09.23 um 14:46 schrieb Ray
Strode:
Hi, On Thu, Sep 28, 2023 at 2:56 AM Christian König <christian.koenig@xxxxxxx> wrote:To say the "whole point" is about CPU overhead accounting sounds rather absurd to me. Is that really what you meant?Yes, absolutely. See the functionality you try to implement already exists.You say lower in this same message that you don't believe the functionality actually works for the dpms off case I mentioned.
well in the dpms off case there shouldn't be any blocking as far as I understand it.
If you see a large delay in the dpms off case then we probably have a driver bug somewhere.
[SNIP]A blocking system call in the sense of RLIMIT_RTTIME means something which results in the process listening for external events, e.g. calling select(), epoll() or read() on (for example) a network socket etc... As far as I can see drmAtomicCommit() is *not* meant with that what similar to for example yield() also doesn't reset the RLIMIT_RTTIME counter.No no no, drmModeAtomicCommit() is not like sched_yield(). That's a really strange thing to say (you do mean sched_yield() right?). sched_yield() is an oddball because it's specifically for giving other threads a turn if they need it without causing the current thread to sleep if they don't. It's a niche api that's meant for high performance use cases. It's a way to reduce scheduling latency and increase running time predictability. drmModeAtomicCommit() using up rt time, busy looping while waiting on the hardware to respond, eating into userspace RLIMIT_RTTIME is nothing like that. I'm getting the idea that you think there is some big bucket of kernel syscalls that block for a large fraction of a second by design and are not meant to reset RLIMIT_RTTIME.
Yes, exactly that's the case as far as I know.
The purpose of RLIMIT_RTTIME is to have a watchdog if an application is still responsive. Only things which make the application look for outside events are considered a reset for this whatdog.
So for example calling select() and poll() will reset the watchdog, but not calling any DRM modeset functions or an power management IOCTL for example.
There are only two possibilities I can see how a DRM modeset is triggering this:
1. We create enough CPU overhead to actually exceed the limit. In this case triggering it is certainly intentional.
2. We busy wait for the hardware to reach a certain state. If that is true then this is a bug in the driver.
I could be wrong, but I don't think that's true. Off the top of my head, the only ones I can think of that might reasonably do that are futex() (which obviously can't sleep), sched_yield() (who's whole point is to not sleep), and maybe a some obscure ioctls (some probably legitimately, some probably illegitimately and noone has noticed.). I'm willing to be proven wrong here, and I might be, but right now from thinking about it, my guess is the above list is pretty close to complete.Well you are breaking the RLIMIT_RTTIME functionality. The purpose of that functionality is to allow debugging and monitoring applications to make sure that they keep alive and can react to external signals.I don't think you really thought through what you're saying here. It just flatly doesn't apply for drmModeAtomicCommit. What is an application supposed to do?
Get terminated and restart. That's what the whole functionality is good for.
If you don't desire that than don't use the RLIMIT_RTTIME functionality.
It can't block the SIGKILL that's coming. Respond to the preceding SIGXCPUs? What response could the application possibly make? I'm guessing drmModeAtomicCommit isn't going to EINTR because it's busy looping waiting on hardware in the process context. And the kernel doesn't even guarantee SIGXCPU is going to go to the thread with the stuck syscall, so even if there was a legitimate response (or even "pthread_cancel" or some wreckless nonsense like that), getting the notification to the right part of the program is an exercise in gymnastics.From the RLIMIT_RTTIME documentation: "The intended use of this limit is to stop a runaway real-time process from locking up the system." And when drmAtomicCommit() is triggering this then we either have a problem with the application doing something it is not supposed to do (like blocking for vblank while it should listen to network traffic) or the driver is somehow buggy.drmModeAtomicCommit() is used by display servers. If drmModeAtomicCommit runs away in e.g. a set of 100ms busy loops responding to a confused or slow responding GPU, the system will seemingly lock up to the user. That is an intractable problem that we can not get away from. It doesn't matter if the kernel is stuck in process context or on a workqueue. And, regardless, it's not reasonable to expect userspace to craft elaborate workarounds for driver bugs. We just have to fix the bugs.
Exactly that's the reason why I'm pointing out that this isn't a good idea.
No when you disable everything there is of course no fence allocated.Okay, so it's not actually an answerBut then you also should never see anything waiting for to long to actually be able to trigger the RLIMIT_RTTIME.But we do. That's the problem. That's like the whole problem. The amdgpu driver thinks it's okay to do something like: for_each_command_in_queue(&queue, command) { execute_command(command); while (1) { wait_for_response(); if (delay++ > 100000) break; udelay(1); } } or something like that. all while keeping the process in the RUNNABLE state. It just seems wrong to me. At least let the userspace process get scheduled out.
That just papers over the problem. The process doesn't become more responsive by hiding the problem.
What you need to do here is to report those problems to the driver teams and not try to hide them this way.
Regards,
Christian.
Regardless, this seems like a roundabout way to address a problem that we could just ... fix.Well to make it clear: This is not a problem but intended behavior!I'm going to be frank, I don't think this was intended behavior. We can wait for sima to get off PTO and chime in, to know one way or the other (or maybe airlied can chime in with his take?), but I doubt he was thinking about realtime scheduling minutiae when he put together the drmModeAtomicCommit implementation. There's no practical reason for doing things the way they're being done as far as I can tell. --Ray