On Fri, 3 Sep 2021 21:08:21 +0200
Dennis Filder <d.filder@xxxxxx> wrote:

> Hans de Goede asked me to take a topic from a private discussion here.
> I must also preface that I'm not a graphics person and my knowledge of
> DRI/DRM is cursory at best.
>
> I initiated the conversation with de Goede after learning that the X
> server now supports being started with an open DRM file descriptor
> (this was added for Keith Packard's xlease project). I wondered if
> that could be used to smoothen the Plymouth->X transition somehow and
> asked de Goede if there were any such plans. He denied, but mentioned
> that a new ioctl is in the works to prevent the kernel from wiping the
> contents of a frame buffer after a device is closed, and that this
> would help to keep transitions smooth.

Hi,

I believe the kernel is not wiping anything on device close. If
something in the KMS state is wiped, it originates in userspace:

- Plymouth doing something (e.g. RmFB on an in-use FB will turn the
  output off, you need to be careful to "leak" your FB if you want a
  smooth hand-over)

- Xorg doing something (e.g. resetting instead of inheriting KMS state)

- Something missed in the hand-off sequence which allows fbcon to
  momentarily take over between Plymouth and Xorg. This would need to
  be fixed between Plymouth and Xorg.

- Maybe systemd-logind does something odd to the KMS device? It has
  pretty wild code there. Or maybe it causes fbcon to take over.

What is the new ioctl you referred to?

Being able to be started with an open DRM file descriptor is not
necessary for a smooth hand-over as far as I know. There are tons of
other details that are, though.

>
> I am a bit disappointed with this being considered a desirable way of
> handling that transfer of control over a shared DRM device as it shows
> a lack of ambition. Sure, it's probably easy to implement, but it

Or more likely bigger fires and lack of time, like with everything.

> will also greatly limit how such transitions can be presented to the
> user. In practice it would mean plymouthd closing the DRM device and
> exiting so that systemd can start the display manager which then
> starts an X server to present the login screen. If for that several
> shared libraries have to first be loaded and relocated while the
> system is under heavy load then there will be a noticeable delay
> manifesting as a frozen screen. After that the best you can hope for
> is blending the still-frame over into the login screen (or whatever
> comes then). The VT-API-based switching mechanism currently en vogue
> suffers from similar limitations.

All that is already solvable purely in userspace in my opinion, today.
It's just a big project over several independent userspace software
projects.

> If the approach to transferring control were to be changed to a scheme
> that involves both donor and recipient process connecting to each
> other on a unix socket and actively coordinating the transfer
> (i.e. the calls to drmSetMaster and drmDropMaster) then this would
> open the door to a host of possibilities. Not only could the
> transition be kept infinitesimally short since both processes are
> already up, but it could also involve e.g. the recipient continuing an
> animation the donor had going reusing state that is transferred as a
> memfd. This way there wouldn't be any noticeable freezes on the
> display making for a far more polished, and thus impressive
> experience. It would be a feat a program alone cannot achieve on its
> own. Another option made possible would be implementing a watchdog.
> If the recipient transfers e.g. file descriptors for a pipe and a
> pidfd of itself, then the donor could monitor those for a
> heartbeat/process termination and take back control over the device if
> something goes awry (deadlock/crash) and initiate a recovery
> mechanism. With the other approach implementing such features is
> simply not possible.

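The coordinated transfer described in the quoted proposal can be sketched
in a few lines. This is only an illustration of the socket and file
descriptor passing part, under stated assumptions: the message strings
HANDOVER and ACK, the function names and the JSON state format are all
made up for this sketch and are not taken from Plymouth, Xorg or any
existing protocol. No DRM calls are made; the comments mark where a real
donor and recipient would presumably call drmDropMaster and drmSetMaster.

```python
import json
import os
import socket
import tempfile


def make_state_fd(state: dict) -> int:
    """Write the animation state into an anonymous file and return its fd."""
    if hasattr(os, "memfd_create"):      # Linux only
        fd = os.memfd_create("handover-state")
    else:                                # portable stand-in for this sketch
        tmp = tempfile.TemporaryFile()
        fd = os.dup(tmp.fileno())
        tmp.close()
    os.write(fd, json.dumps(state).encode())
    os.lseek(fd, 0, os.SEEK_SET)
    return fd


def donor_offer(sock: socket.socket, state: dict) -> None:
    """Donor: send the state fd to the recipient (hypothetical message)."""
    fd = make_state_fd(state)
    try:
        socket.send_fds(sock, [b"HANDOVER"], [fd])
    finally:
        os.close(fd)  # the recipient receives its own duplicate of the fd


def recipient_take_over(sock: socket.socket) -> dict:
    """Recipient: receive the fd, read the state, acknowledge."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    if msg != b"HANDOVER" or len(fds) != 1:
        raise RuntimeError("unexpected hand-over message")
    with os.fdopen(fds[0], "rb") as f:
        state = json.loads(f.read())
    # A real recipient would acquire DRM master here, before acknowledging.
    sock.sendall(b"ACK")
    return state


def donor_confirm(sock: socket.socket) -> None:
    """Donor: wait for the acknowledgement before letting go."""
    if sock.recv(16) != b"ACK":
        raise RuntimeError("recipient did not acknowledge the hand-over")
    # A real donor would drop DRM master at or before this point.


def demo_hand_over() -> dict:
    """Run both sides in one process over a connected socket pair."""
    donor, recipient = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with donor, recipient:
        donor_offer(donor, {"frame": 42, "animation": "spinner"})
        state = recipient_take_over(recipient)
        donor_confirm(donor)
    return state
```

The exact ordering of the drop and set calls relative to the
acknowledgement is left open here, since it depends on how session
management (e.g. systemd-logind) is involved. The watchdog idea would
add a second descriptor, such as a pipe or pidfd, to the same message.
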
Nothing in the kernel stops userspace developers from doing exactly
that. Seems like you would be working on Plymouth and a display server
of your choice. Don't forget to count in systemd-logind as well, since
that is a popular component for managing sessions and is involved with
e.g. drmSetMaster.

It's a good goal. It's also more or less necessary for a smooth
hand-over of a KMS device between any two processes, also for other
reasons I've discussed in the past with DRM developers. This is the
topic about any KMS client (a program using KMS) needing to reach a
guaranteed "clean" KMS state to display correctly.

The kernel DRM subsystem never resets KMS state in any way, apart from
driver initialisation. This means that when a new KMS client takes
over, the KMS state could be anything, whatever the previous KMS client
left it in. This is a problem, because the KMS client may not know how
to reset all the KMS properties to clean, sane defaults. Currently
there is no reset ioctl in the kernel either, and no userspace solution
for storing a sane default state.

The problem arises from KMS properties: each KMS client may not know
how to program all the KMS properties the kernel supports on the
device. For example, if one KMS client leaves the output in HDR mode,
and the next KMS client does not understand the HDR property, then
quite likely the latter KMS client will display an awful image without
knowing it.

There is also the convention of a KMS client not restoring the
inherited KMS state on exit or switch-out, because that could cause
unnecessary flicker on screen and delays. This amplifies the above
problem.

The only time the KMS state is at "sane" defaults is right after driver
initialisation. Presumably. So only the first KMS client after a reboot
can expect a sane KMS state. Mind, fbcon is also a KMS client, it's
just a kernel-internal one.

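To make the stale-property problem concrete, here is a toy model of it.
It assumes the driver-initial state has been saved somewhere, which is
exactly the piece that does not exist today. Everything in it is
hypothetical: the KMS state is modelled as a plain dictionary, and the
property names and values are made up for illustration rather than
being real KMS property names. A real client would go through libdrm
instead.

```python
from typing import Dict, Set

KmsState = Dict[str, int]


def snapshot_defaults(device_state: KmsState) -> KmsState:
    """Copy the state as found right after driver initialisation."""
    return dict(device_state)


def plan_take_over(inherited: KmsState, defaults: KmsState,
                   understood: Set[str], wanted: KmsState) -> KmsState:
    """Compute the state a new client would program on take-over.

    Properties the client understands get the client's own values.
    Every other property is reset to the saved default, so nothing
    stale from the previous client survives.
    """
    new_state = dict(inherited)
    for name in inherited:
        if name in understood:
            new_state[name] = wanted.get(name, inherited[name])
        elif name in defaults:
            new_state[name] = defaults[name]
    return new_state


def demo_take_over() -> KmsState:
    # Made-up property names and values, for illustration only.
    defaults = snapshot_defaults(
        {"hdr_mode": 0, "gamma_lut": 0, "rotation": 0})
    # The previous client enabled HDR and never restored it.
    inherited = {"hdr_mode": 1, "gamma_lut": 0, "rotation": 0}
    # The new client only knows how to program rotation.
    return plan_take_over(inherited, defaults, {"rotation"}, {"rotation": 90})
```

Without the saved defaults, the new client in this model would inherit
hdr_mode set to 1 and have no way of knowing that it should be reset.
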
Letting fbcon take over momentarily can reset some KMS properties to
nice defaults, but not all, because when new KMS properties are added,
no-one usually remembers to patch fbcon/fbdev/whatever to reset them
when fbcon takes over. Switching temporarily to fbcon also causes
flicker and delays.

The above problems could perhaps be solved if there was a generic KMS
hand-over protocol in userspace. The two KMS clients could agree on
which KMS properties should be reset and by whom, and on who
understands and programs which KMS properties. And perhaps, some system
component in early'ish boot could save the driver-initial KMS state,
which is presumably the good default for any use case, for the
properties any particular KMS client does not know or bother to program
by itself.

So yes, a userspace protocol for KMS hand-off would be very welcome.
But who would have the time to develop it, when so far we can just limp
forward with the current undocumented conventions?

Such a protocol, if widely used, might also make it unnecessary for KMS
clients to save and restore KMS properties they do not understand when
they switch out and later back in. When a KMS client has released DRM
master on the device, some other KMS client could have "messed up" the
KMS state, so restoring to what you used before is necessary.

I don't think anyone actually implements this save/restore yet for
unknown KMS properties, and it would be much easier to implement than
the hand-off protocol. Maybe switching is either not done, or it is
always done to/from a display manager process which sanitises the KMS
state or enforces that KMS clients do not leave random state behind. Or
maybe most KMS clients are just really good at agreeing on which KMS
properties everyone will use.

If some stale KMS property causes a problem on some KMS client (display
server), it is pretty easy to just add support for programming that
property in the KMS client. Problem solved and no hand-off protocol
needed.
Then the next KMS client hits the same problem and does the same.

> Making processes talk to each other and work together like this would
> also be a far more accurate software representation of what is
> actually going on: different subsystems passing control over a shared
> device around to work towards the common goal of a good user
> experience.
>
> A bit of context: The idea underlying this came from my experience
> with accessibility technology under Linux where uncoordinated fighting
> over the audio device among all kinds of processes led to countless
> ways in which things would break with no hope of ever fixing anything.
> It instilled in me the conviction that user-facing programs are broken
> if they are not written to talk to each other to coordinate access to
> shared resources for the goal of rendering a good user experience, but
> instead leave it to the distro maintainer/user to set things up into a
> static, brittle working order. Seeing a much-needed cultural shift
> begin somewhere would be nice. The Plymouth->X transition would lend
> itself well as a starting point since many building blocks are already
> there.

I might recommend picking a Wayland display server instead of Xorg
first. The thing is, Xorg is only a middle-man and it is some X11
client that decides what will be displayed and how. Therefore with the
X11 architecture, it's not just two processes that need to communicate
the hand-off, it's three: Plymouth, Xorg, and the X11 client that will
actually draw stuff.

In the Wayland architecture, you only need to communicate between
Plymouth and the Wayland compositor you picked. How that Wayland
compositor draws anything is an internal detail to it, so you can solve
that in-project.

You could also think that how Xorg gets the content is an internal
detail to the X11 desktop, but that might lead to needing some new X11
protocol extension to be able to control Xorg's actions on the KMS
device sufficiently.
It may also be hard to get any new feature code into Xorg and released.


Thanks,
pq