On Thu, Oct 05, 2023 at 01:16:27PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 05, 2023 at 11:57:41AM +0200, Daniel Vetter wrote:
> > On Tue, Sep 26, 2023 at 01:05:49PM -0400, Ray Strode wrote:
> > > From: Ray Strode <rstrode@xxxxxxxxxx>
> > >
> > > A drm atomic commit can be quite slow on some hardware. It can lead
> > > to a lengthy queue of commands that need to get processed and waited
> > > on before control can go back to user space.
> > >
> > > If user space is a real-time thread, that delay can have severe
> > > consequences, leading to the process getting killed for exceeding
> > > rlimits.
> > >
> > > This commit addresses the problem by always running the slow part of
> > > a commit on a workqueue, separated from the task initiating the
> > > commit.
> > >
> > > This change makes the nonblocking and blocking paths work in the same way,
> > > and as a result allows the task to sleep and not use up its
> > > RLIMIT_RTTIME allocation.
> > >
> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2861
> > > Signed-off-by: Ray Strode <rstrode@xxxxxxxxxx>
> >
> > So imo the trouble with this is that we suddenly start to make
> > realtime/cpu usage guarantees in the atomic ioctl. That's a _huge_ uapi
> > change, because even limited to the case of !ALLOW_MODESET we do best
> > effort guarantees at best. And some drivers (again amd's dc) spend a ton
> > of cpu time recomputing state even for pure plane changes without any crtc
> > changes like dpms on/off (at least I remember some bug reports about
> > that). And that state recomputation has to happen synchronously, because
> > it always influences the ioctl errno return value.
> >
> > My take is that you're papering over a performance problem here of the
> > "the driver is too slow/wastes too much cpu time". We should fix the
> > driver, if that's possible.
> >
> > Another option would be if userspace drops realtime priorities for these
> > known-slow operations. And right now _all_ kms operations are potentially
> > cpu and real-time wasters, the entire uapi is best effort.
> >
> > We can also try to change the atomic uapi to give some hard real-time
> > guarantees so that running compositors as SCHED_RT is possible, but that
> > - means a very serious stream of bugs to fix all over
> > - therefore needs some very wide buy-in from drivers that they're willing
> >   to make this guarantee
> > - probably needs some really carefully carved out limitations, because
> >   there's imo flat-out no way we'll make all atomic ioctl hard time limit
> >   bound
> >
> > Also, as König has pointed out, you can roll this duct-tape out in
> > userspace by making the commit non-blocking and immediately waiting for
> > the fences.
> >
> > One thing I didn't see mention is that there's a very subtle uapi
> > difference between non-blocking and blocking:
> > - non-blocking is not allowed to get ahead of the previous commit, and
> >   will return EBUSY in that case. See the comment in
> >   drm_atomic_helper_commit()
> > - blocking otoh will just block until any previous pending commit has
> >   finished
> >
> > Not taking that into account in your patch here breaks uapi because
> > userspace will suddenly get EBUSY when they don't expect that.
>
> The -EBUSY logic already checks whether the current commit is
> non-blocking vs. blocking commit, so I don't see how there would
> be any change in behaviour from simply stuffing the commit_tail
> onto a workqueue, especially as the locks will be still held across
> the flush.

Hm right, I forgot the patch context when I was chasing the EBUSY logic, I
thought it just pushed a nonblocking commit in somehow.

> In my earlier series [1] where I move the flush to happen after dropping
> the locks there is a far more subtle issue because currently even
> non-blocking commits can actually block due to the mutex. Changing
> that might break something, so I preserved that behaviour explicitly.
> Full explanation in the first patch there.
>
> [1] https://patchwork.freedesktop.org/series/108668/

Yeah there's a can of tricky details here for sure ...
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch