On Thu, 6 Oct 2022 12:03:56 +0000
"Hoosier, Matt" <Matt.Hoosier@xxxxxxxxxx> wrote:

> I have a DRM master implementing a purpose-built compositor for a
> dedicated use-case. It drives several different connectors, each on
> its own vsync cadence (there's no clone mode happening here).
>
> The goal is to have commits to each connector occur completely
> without respect to whatever is happening on the other connectors.
> There's a different thread issuing the DRI ioctl's for each connector.
>
> In the compositor, each connector is treated like its own little
> universe; a disjoint set of CRTCs and planes is earmarked for use by
> each of the connectors. One intention for this is to avoid sharing
> resources in a way that would introduce implicit synchronization
> points between the two connector's event loops. So, atomic commits
> made to one connector never attempt to use a resource that's ever
> been used in a commit to a different connector. This may be relevant
> to a question I'll ask a bit later below about resource locking
> contention.
>
> For some time, I've been noticing that even test-only atomic commits
> done on connector A will sometimes block for many frame-times.
> Analysis with the DRI driver implementor has shown that the atomic
> commits to A--whether DRM_MODE_ATOMIC_TEST_ONLY or
> DRM_MODE_ATOMIC_NONBLOCK--are getting stuck in the ioctl entry code
> waiting for a DRI mutex.
>
> It turns out that during these unexpected delays, the DRI driver's
> commit thread holds that mutex while servicing a commit to connector
> B. It does this while it waits for the fences to fire for all
> framebuffer IDs referred to by the pending connector B scene. So the
> commit to connector A can't be tested or enqueued until the commit to
> B is completely finished. The driver author reckons that this is
> unavoidable because every DRM_IOCTL_MODE_ATOMIC ioctl needs to
> acquire the same global singleton DRM connection_mutex in order to
> query or manipulate the connector.
>
> The result is that it's quite difficult to guarantee a framerate on
> connector A, because unrelated activity performed on connector B can
> hold global locks for an unpredictable amount of time.
>
> The first question would be: does this story sound consistent? If so,
> then a couple more questions follow.
>
> Is this kind of implicit interlocking expected? Is there any way to
> avoid the pending commits getting serialized like that on the kernel
> side?

Hi Matt,

Ville actually mentioned something very much like that recently, see
the thread at:
https://lore.kernel.org/dri-devel/20220916163331.6849-1-ville.syrjala@xxxxxxxxxxxxxxx/

If even non-blocking commits can stall test-only commits, that could be
a problem for Weston too. Weston being single-threaded wouldn't help.


Thanks,
pq
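
To make the reported stall concrete, here is a toy model of it in plain Python threads. It is only a sketch under assumptions: the single `global_lock` stands in for the global mutex described in the quoted report, the `threading.Event` stands in for a framebuffer fence, and every name in it (`commit_b`, `test_only_commit_a`, `measure_stall`) is hypothetical. It is not kernel or libdrm code and says nothing about how any particular driver is actually structured.

```python
import threading
import time

# Toy model only: one process-wide lock stands in for the global mutex
# described in the report. This is not the kernel's locking code.
global_lock = threading.Lock()


def commit_b(fence: threading.Event, started: threading.Event) -> None:
    """Model a commit on connector B that waits for its fence while holding the lock."""
    with global_lock:
        started.set()   # tell the caller that B now owns the lock
        fence.wait()    # the fence wait happens inside the critical section


def test_only_commit_a() -> float:
    """Model a test-only commit on connector A; return seconds spent waiting for the lock."""
    t0 = time.monotonic()
    with global_lock:
        waited = time.monotonic() - t0
    return waited


def measure_stall(fence_delay: float) -> float:
    """Return how long A's test-only commit waited while B's fence was pending."""
    fence = threading.Event()
    started = threading.Event()
    worker = threading.Thread(target=commit_b, args=(fence, started))
    worker.start()
    started.wait()                                   # B holds the lock from here on
    timer = threading.Timer(fence_delay, fence.set)  # the fence signals later
    timer.start()
    waited = test_only_commit_a()                    # A blocks until B releases
    worker.join()
    timer.join()
    return waited


if __name__ == "__main__":
    stall_ms = measure_stall(0.05) * 1000
    print(f"A's test-only commit waited {stall_ms:.1f} ms on B's fence")
```

In this model the time A spends blocked tracks `fence_delay`, even though A touches none of B's resources, which is the shape of the behaviour described in the report. If the fence wait were moved outside the `with global_lock:` block, A's wait would drop to roughly zero.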