On Thu, 11 Apr 2024 16:33:57 -0400 Leo Li <sunpeng.li@xxxxxxx> wrote: > On 2024-04-04 10:22, Marius Vlad wrote: > > On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote: > >> > > Hi all, > >> > >> On 2024-04-04 06:24, Pekka Paalanen wrote: > >>> On Wed, 3 Apr 2024 17:32:46 -0400 > >>> Leo Li <sunpeng.li@xxxxxxx> wrote: > >>> > >>>> On 2024-03-28 10:33, Pekka Paalanen wrote: > >>>>> On Fri, 15 Mar 2024 13:09:56 -0400 > >>>>> <sunpeng.li@xxxxxxx> wrote: > >>>>> > >>>>>> From: Leo Li <sunpeng.li@xxxxxxx> > >>>>>> > >>>>>> These patches aim to make the amdgpgu KMS driver play nicer with compositors > >>>>>> when building multi-plane scanout configurations. They do so by: > >>>>>> > >>>>>> 1. Making cursor behavior more sensible. > >>>>>> 2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for > >>>>>> 'underlay' configurations (perhaps more of a RFC, see below). > >>>>>> > >>>>>> Please see the commit messages for details. > >>>>>> > >>>>>> > >>>>>> For #2, the simplest way to accomplish this was to increase the value of the > >>>>>> immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with > >>>>>> a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an > >>>>>> underlay scanout configuration. > >>>>>> > >>>>>> Technically speaking, DCN hardware does not have a concept of primary or overlay > >>>>>> planes - there are simply 4 general purpose hardware pipes that can be maped in > >>>>>> any configuration. So the immutable zpos restriction on the PRIMARY plane is > >>>>>> kind of arbitrary; it can have a mutable range of (0-254) just like the > >>>>>> OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat > >>>>>> arbitrary. We can interpret PRIMARY as the first plane that should be enabled on > >>>>>> a CRTC, but beyond that, it doesn't mean much for amdgpu. > >>>>>> > >>>>>> Therefore, I'm curious about how compositors devs understand KMS planes and > >>>>>> their zpos properties, and how we would like to use them. It isn't clear to me > >>>>>> how compositors wish to interpret and use the DRM zpos property, or > >>>>>> differentiate between OVERLAY and PRIMARY planes, when it comes to setting up > >>>>>> multi-plane scanout. > >>>>> > >>>>> You already quoted me on the Weston link, so I don't think I have > >>>>> anything to add. Sounds fine to me, and we don't have a standard plane > >>>>> arrangement algorithm that the kernel could optimize zpos ranges > >>>>> against, yet. > >>>>> > >>>>>> Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM > >>>>>> plane API side, that can make building multi-plane scanout configurations easier > >>>>>> for compositors?" I'm hoping we can converge on something, whether that be > >>>>>> updating the existing documentation to better define the usage, or update the > >>>>>> API to provide support for something that is lacking. > >>>>> > >>>>> I think there probably should be a standardised plane arrangement > >>>>> algorithm in userspace, because the search space suffers from > >>>>> permutational explosion. Either there needs to be very few planes (max > >>>>> 4 or 5 at-all-possible per CRTC, including shareable ones) for an > >>>>> exhaustive search to be feasible, or all planes should be more or less > >>>>> equal in capabilities and userspace employs some simplified or > >>>>> heuristic search. > >>>>> > >>>>> If the search algorithm is fixed, then drivers could optimize zpos > >>>>> ranges to have the algorithm find a solution faster. > >>>>> > >>>>> My worry is that userspace already has heuristic search algorithms that > >>>>> may start failing if drivers later change their zpos ranges to be more > >>>>> optimal for another algorithm. > >>>>> > >>>>> OTOH, as long as exhaustive search is feasible, then it does not matter > >>>>> how DRM drivers set up the zpos ranges. > >>>>> > >>>>> In any case, the zpos ranges should try to allow all possible plane > >>>>> arrangements while minimizing the number of arrangements that won't > >>>>> work. The absolute values of zpos are pretty much irrelevant, so I > >>>>> think setting one plane to have an immutable zpos is a good idea, even > >>>>> if it's not necessary by the driver. That is one less moving part, and > >>>>> only the relative ordering between the planes matters. > >>>>> > >>>>> > >>>>> Thanks, > >>>>> pq > >>>> > >>>> Right, thanks for your thoughts! I agree that there should be a common plane > >>>> arrangement algorithm. I think libliftoff is the most obvious candidate here. It > >>>> only handles overlay arrangements currently, but mixed-mode arrangements is > >>>> something I've been trying to look at. > >>>> > >>>> Taking the driver's reported zpos into account could narrow down the search > >>>> space for mixed arrangements. We could tell whether underlay, or overlay, or > >>>> both, is supported by looking at the allowed zpos ranges. > >>>> > >>>> I also wonder if it'll make underlay assignments easier. libliftoff has an > >>>> assumption that the PRIMARY plane has the lowest zpos (which now I realize, is > >>>> not always true). Therefore, the underlay buffer has to be placed on the > >>>> PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between > >>>> planes when testing mixed-arrangements is kind of awkward, and simply setting > >>>> the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler. > >>>> > >>>> Currently only gamescope makes use of libliftoff, but I'm curious if patches > >>>> hooking it up to Weston would be welcomed? If there are other ways to have a > >>>> common arrangement algorithm, I'd be happy to hear that as well. > >>> > >>> A natural thing would be to document such an algorithm with the KMS > >>> UAPI. > >>> > >>> I don't know libliftoff well enough to say how welcome it would be in > >>> Weston. I have no fundamental or policy reason to keep an independent > >>> implementation in Weston though, so it's plausible at least. > > Is it the case that different compositors may want different plane arrangement > behaviors? Like selecting which surfaces to offload, for example? It occurred to > me that prescribing an allocation algorithm via something like libliftoff might > be too restrictive. In which case, documenting the parts that can be nailed down > would be better. I don't know. Probably there shouldn't be, eventually, because it's hard to imagine how DE or end user style/taste/preferences would affect things. Usually the global goal would be optimising power consumption. There could be trade-offs though, when reduction in power consumption results in reduced image quality or increased latency. What to favour for which surfaces is definitely policy and preference. Such preferences could perhaps be designed into libliftoff API. > That begs the question of what can be nailed down and what can left to > independent implementation. I guess things like which plane should be enabled > first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > can be defined. How to handle atomic test failures could be as well. What room is there for the interpretation of zpos values? I thought they are unambiguous already: only the relative numerical order matters, and that uniquely defines the KMS plane ordering. Thanks, pq > I can start working on a draft for this. If anything, as a spark for discussions > for the display hackfest. > > >>> > >>> It would need investigation, and perhaps also extending Weston test > >>> suite a lot more towards VKMS to verify plane assignments. Currently > >>> all plane assignment testing is manual on real hardware. > >>> > >> > >> It looks like VKMS doesn't have explicit zpos yet, so someone would > >> probably need to add that. > >> > >> https://drmdb.emersion.fr/properties/4008636142/zpos > > Yes. If we look into adding that, maybe it should be done using with > > ConfigFS: https://patchwork.freedesktop.org/series/122618/ > > > > With that in and with zpos support, we could then run a batch of tests that > > can dynamically exercise on-the-fly all possible combinations. > > Using vkms to come up with a bunch of different hw plane configurations is a > good idea. It may come in handy for testing other compositors too. Thanks for > the suggestions. > > - Leo > > > >> > >> Harry > >> > >>>> Note that libliftoff's algorithm is more complex than weston, since it searches > >>>> harder, and suffers from that permutational explosion. But it solves that by > >>>> trying high benefit arrangements first (offloading surfaces that update > >>>> frequently), and bailing out once the search reaches a hard-coded deadline. > >>>> Since it's currently overlay-only, the goal could be to "simply" have no > >>>> regressions. > >>> > >>> Ensuring no regressions would indeed need to be taken care of by > >>> extending the VKMS-based automated testing. > >>> > >>> > >>> Thanks, > >>> pq > >>> > >>>>> > >>>>>> Some links to provide context and details: > >>>>>> * What is underlay?: https://gitlab.freedesktop.org/emersion/libliftoff/-/issues/76 > >>>>>> * Discussion on how to implement underlay on Weston: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2325164 > >>>>>> > >>>>>> Cc: Joshua Ashton <joshua@xxxxxxxxx> > >>>>>> Cc: Michel Dänzer <mdaenzer@xxxxxxxxxx> > >>>>>> Cc: Chao Guo <chao.guo@xxxxxxx> > >>>>>> Cc: Xaver Hugl <xaver.hugl@xxxxxxxxx> > >>>>>> Cc: Vikas Korjani <Vikas.Korjani@xxxxxxx> > >>>>>> Cc: Robert Mader <robert.mader@xxxxxxxxx> > >>>>>> Cc: Pekka Paalanen <pekka.paalanen@xxxxxxxxxxxxx> > >>>>>> Cc: Sean Paul <sean@xxxxxxxxxx> > >>>>>> Cc: Simon Ser <contact@xxxxxxxxxxx> > >>>>>> Cc: Shashank Sharma <shashank.sharma@xxxxxxx> > >>>>>> Cc: Harry Wentland <harry.wentland@xxxxxxx> > >>>>>> Cc: Sebastian Wick <sebastian.wick@xxxxxxxxxx> > >>>>>> > >>>>>> Leo Li (2): > >>>>>> drm/amd/display: Introduce overlay cursor mode > >>>>>> drm/amd/display: Move PRIMARY plane zpos higher > >>>>>> > >>>>>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 405 ++++++++++++++++-- > >>>>>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 + > >>>>>> .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 1 + > >>>>>> .../amd/display/amdgpu_dm/amdgpu_dm_plane.c | 28 +- > >>>>>> 4 files changed, 391 insertions(+), 50 deletions(-) > >>>>>> > >>>>> > >>> > >>
Attachment:
pgpARCMJFaIx5.pgp
Description: OpenPGP digital signature