On Wed, Jul 4, 2018 at 5:44 PM, Daniel Stone <daniel@xxxxxxxxxxxxx> wrote: > Hi, > The atomic API being super-explicit about how userspace sequences its > calls is great and all, but having shared global state implicitly > dragged in is kind of ruining my day. > > Currently on Intel, Weston sometimes fails on hotplug, because a > commit which only enables CRTC B (not touching CRTC A or any other > CRTC!), causes all commits to CRTC A to fail until CRTC B's modeset > commit has fully retired: > https://gitlab.freedesktop.org/wayland/weston/issues/24 > > The reason is that committing CRTC B resizes the DDB allocation for > CRTC A as well, pulling CRTC A's CRTC state into the commit. This > makes sense, but on the other hand it's totally opaque to userspace, > and impossible for us to reason about when making commits. > > I suggested some options in that GitLab commit, none of which I like: > * if any other CRTCs are pulled into a commit state, always execute > a blocking commit in the kernel > * if we're passing ALLOW_MODESET in userspace, only ever do blocking commits > * whenever we get -EBUSY in userspace, assume we've been screwed by > the kernel and defer until other outputs have completed > * whenever we want to reconfigure any output in userspace, wait > until all outputs are completely quiescent and do a single atomic > commit covering all outputs > > The first one seems completely non-obvious from the kernel, but on the > other hand the current -EBUSY failing behaviour is also non-obvious. > > The second is maybe the most reasonable, but on the other hand just > working around a painful leaky abstraction: we also can't know upfront > from userspace if this is actually going to be required, or if we're > just killing responsiveness blocking for no reason. > > The third is the thing I least want to do, because it might well paper > over legitimate bugs in userspace, and complicates our state tracking > for no reason. > > The fourth is probably the most legitimate, but, well ... someone has > to type up all the code to make our output-configuration API > completely asynchronous. > > I suspect we're the first ones to be hitting this, because Weston has > a truly independent per-CRTC repaint loop, we're one of the few atomic > users, and also because Pekka did some seriously brutal hotplug > testing whilst reworking Weston's output configuration API. Also > because our approach to failed output repaints is to just freeze the > output until it next cycles off and on, which is much more apparent > than just silently dropping a frame here and there. ;) > > Any bright ideas on what could practically be done here? > > Cheers, > Daniel We had an entirely inconclusive chat on irc about this topic. Random notes: - This can fail both on the commit doing a modeset (when it adds some random other CRTC which still has a pending flip), and for normal flips (if they run into a CRTC which has been affected by a modeset). - Just waiting for all nonblocking modesets to complete is not enough, because you don't get an event for these affected CRTC. And they might take longer. So you can still get an EBUSY even when it looks like everything is done. - If you don't combine nonblocking with ALLOW_MODESET there's no issue, because everyone just blocks (and maybe misses a few frames) until the modeset is done. - We want to tell apart an EBUSY due to the redraw loop having lost sync from an EBUSY due to these modesets that affect more than just the passed-in CRTCs. It's a really nice debug feature for compositor developers. -> One way or another we need new/clarified uapi, atomic as implemented doesn't really cut it. A few of the options we've discussed: - Eating all the EBUSY for ALLOW_MODESET. A bit a bigger hammer. Slightly smaller hammer is just eating all the EBUSY for crtc not in the original request. It would still be nice to tell userspace about the additional CRTCs its atomic comit affects. - Some new flags to send out events (what about fence_ptr?) for these affected crtcs, plus reporting them out in a new affected_crtcs_mask in the atomic ioctl struct. - One tricky issue is that drivers might pull in CRTCs which are disabled and stay disabled. Events aren't well-defined on these at all, but they might still provoke an EBUSY! - Compositors would like to know how many frames they've lost. One idea was to split up the retry error codes: EINTR would mean retrying immedetialy due to a signal. EAGAIN would mean the kernel had to insert something to make a modeset possible, retry only after you waited for one vblank. With current userspace this would only result in cpu wasted, but updated userspace could be more intelligent about things. We could also do a special flag or whatever. We didn't zero in on anything specific since it's not really clear what compositors want here at all. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx