Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

Leo Li <sunpeng.li@xxxxxxx> · Mon, 15 Apr 2024 18:33:39 -0400

On 2024-04-15 04:19, Pekka Paalanen wrote:
On Fri, 12 Apr 2024 16:14:28 -0400
Leo Li <sunpeng.li@xxxxxxx> wrote:

On 2024-04-12 11:31, Alex Deucher wrote:
On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
<pekka.paalanen@xxxxxxxxxxxxx> wrote:

On Fri, 12 Apr 2024 10:28:52 -0400
Leo Li <sunpeng.li@xxxxxxx> wrote:

On 2024-04-12 04:03, Pekka Paalanen wrote:
On Thu, 11 Apr 2024 16:33:57 -0400
Leo Li <sunpeng.li@xxxxxxx> wrote:

...

That begs the question of what can be nailed down and what can be left to
independent implementation. I guess things like which plane should be enabled
first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
can be defined. How to handle atomic test failures could be as well.

What room is there for the interpretation of zpos values?

I thought they are unambiguous already: only the relative numerical
order matters, and that uniquely defines the KMS plane ordering.

The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
for vendors to communicate overlay, underlay, or mixed-arrangement support. I
don't think allowing OVERLAYs to be placed under the PRIMARY is currently
documented as a way to support underlay.

I always thought it's obvious that the zpos numbers dictate the plane
order without any other rules. After all, we have the universal planes
concept, where the plane type is only informational to aid heuristics
rather than defining anything.

Only if the zpos property does not exist would the plane types come
into play.
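
A minimal sketch of that rule, assuming a simplified plane model (the Plane class, the TYPE_ORDER fallback table and stacking_order() are hypothetical names for illustration, not any KMS or libdrm API). When every plane has zpos, only the relative numerical order decides the stacking; the plane type is consulted only when zpos is missing:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Plane:
    name: str
    type: str            # "PRIMARY", "OVERLAY" or "CURSOR"
    zpos: Optional[int]  # None models a plane without the zpos property


# Assumed fallback order, used only when zpos is not available.
TYPE_ORDER = {"PRIMARY": 0, "OVERLAY": 1, "CURSOR": 2}


def stacking_order(planes: List[Plane]) -> List[str]:
    """Return plane names from bottom to top."""
    if all(p.zpos is not None for p in planes):
        # Plane type is ignored: only the relative zpos order matters.
        return [p.name for p in sorted(planes, key=lambda p: p.zpos)]
    # No usable zpos: fall back to the plane type.
    return [p.name for p in sorted(planes, key=lambda p: TYPE_ORDER[p.type])]


# An underlay arrangement: both OVERLAY planes sit below the PRIMARY.
underlay = [
    Plane("primary", "PRIMARY", 2),
    Plane("overlay0", "OVERLAY", 0),
    Plane("overlay1", "OVERLAY", 1),
]
print(stacking_order(underlay))
```

In this model a PRIMARY with a non-zero zpos is nothing special, which is the point being made above.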

Of course, if there actually exists userspace that fails if zpos allows
an overlay type plane to be placed below primary, or fails if primary
zpos is not zero, then DRM needs a new client cap.

Right, it wasn't immediately clear to me that the API allowed placement of
things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
there's nothing that forbids it.

libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
for the underlay view.

That's totally ok. It works, right? Plane type does not matter if the
KMS driver accepts the configuration.

What is a "scanout plane"? Aren't all KMS planes by definition scanout
planes?

Pardon my terminology, I thought the scanout plane was where weston rendered
non-offloadable surfaces to. I guess it's more correct to call it the "render
plane". On weston, it seems to be always assigned to the PRIMARY.

The assignment restriction is just technical design debt. It is
limiting. There is no good reason for it other than that, when lighting
up a CRTC for the first time, Weston should do it with the renderer FB
only, on the plane that is most likely to succeed i.e. PRIMARY. After
the CRTC is lit, there should be no built-in limitations in what can go
where.

The reason for this is that if a CRTC can be activated, it must always
be able to show the renderer FB without incurring a modeset. This is
important for ensuring that the fallback compositing (renderer) is
always possible. So we start with that configuration, and everything
else is optional bonus.

Genuinely curious - What exactly is limiting with keeping the renderer FB on
PRIMARY? IOW, what is the additional benefit of placing the renderer FB on
something other than PRIMARY?

For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
plane would work. But I think keeping the render plane on PRIMARY (a la weston)
makes underlay arrangements easier to allocate, and would be nice to incorporate
into a shared algorithm.

If zpos exists, I don't think such limitation is a good idea. It will
just limit the possible configurations for no reason.

With zpos, the KMS plane type should be irrelevant for their
z-ordering. Underlay vs. overlay completely loses its meaning at the
KMS level.

Right, the plane types lose their meaning. But at least with the way
libliftoff builds the plane arrangement, where we first allocate the renderer fb
matters.

libliftoff incrementally builds the atomic state by adding a single plane to the
atomic state, then testing it. It essentially does a depth-first-search of all
possible arrangements, pruning the search on atomic test fail. The state that
offloads the largest number of FBs will be the arrangement used.

Of course, it's unlikely that the entire DFS tree will be traversed in time for a
frame. So the key is to search the most probable and high-benefit branches
first, while minimizing the # of atomic tests needed, before a hard-coded
deadline is hit.
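
A rough sketch of such a search, not libliftoff's actual code: best_arrangement() and its parameters are hypothetical, atomic_test is a caller-supplied stand-in for a DRM_MODE_ATOMIC_TEST_ONLY commit, and a budget of tests stands in for the deadline:

```python
def best_arrangement(layers, planes, atomic_test, max_tests):
    """Depth-first search over plane -> layer assignments.

    The state is built one plane at a time; a branch is pruned as soon
    as atomic_test rejects it. Returns the largest accepted assignment
    found and the number of atomic tests spent.
    """
    best = {}
    tests = 0

    def search(state, plane_idx, remaining):
        nonlocal best, tests
        if len(state) > len(best):
            best = dict(state)
        if plane_idx == len(planes):
            return
        plane = planes[plane_idx]
        for layer in remaining:
            if tests >= max_tests:
                return  # budget (deadline) exhausted
            tests += 1
            candidate = {**state, plane: layer}
            if atomic_test(candidate):
                rest = [l for l in remaining if l != layer]
                search(candidate, plane_idx + 1, rest)
            # on failure the whole branch is pruned
        # Also consider leaving this plane unused.
        search(state, plane_idx + 1, remaining)

    search({}, 0, list(layers))
    return best, tests


# Hypothetical driver constraint: the video FB is rejected on the PRIMARY.
def reject_video_on_primary(state):
    return state.get("primary") != "video"


best, used = best_arrangement(
    layers=["composited", "video"],
    planes=["primary", "overlay0"],
    atomic_test=reject_video_on_primary,
    max_tests=10,
)
print(best, used)
```

Listing the PRIMARY first in planes is what makes it the first plane to be enabled, so the order of planes and layers decides which branches are reached before the budget runs out.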

Following this algorithm, the PRIMARY needs to be enabled first, followed by all
the secondary planes. After a plane is enabled, it's not preferred to change
its assigned FB, since that can cause the state to be rejected (in actuality,
not just the FB, but also any color and transformation state associated with
the surface). It is preferable to build on the state by enabling another
fb->plane. This is where changing a plane's zpos to be above/below the PRIMARY
is advantageous, rather than changing the FBs assigned, to accommodate
overlay/underlay arrangements.

I imagine that any algorithm which incrementally builds up the plane arrangement
will have a similar preference. Of course, it's entirely possible that such an
algorithm isn't the best, I admittedly have not thought much about other
possibilities, yet...

Thanks,
Leo

In an underlay arrangement, pushing down an OVERLAY's zpos below the PRIMARY's
zpos is simpler than swapping their surfaces. If such an arrangement fails
atomic_test, we won't have to worry about swapping the surfaces back. Of course,
it's not that we can't keep track of that in the algorithm, but I think it does
make things easier.

There is no "swapping" or "swapping back". The tentative configuration
is created as a new object that contains the complete CRTC+connector
state, and if it doesn't work, it's simply destroyed. In Weston at
least; I don't know about libliftoff.
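
As a sketch of that idea only (try_change() and the dictionary layout are made up for illustration and are not Weston code): the tentative state is a separate copy, so a failed test never needs to be undone.

```python
import copy


def try_change(current, change, atomic_test):
    """Apply `change` to a copy of `current` and keep the copy only if
    `atomic_test` accepts it. `current` is never modified."""
    tentative = copy.deepcopy(current)
    change(tentative)
    if atomic_test(tentative):
        return tentative
    return current  # the tentative state is simply dropped


current = {
    "primary": {"fb": "renderer", "zpos": 1},
    "overlay0": {"fb": None, "zpos": 2},
}


def video_as_underlay(state):
    # Put the video FB on the OVERLAY and push it below the PRIMARY.
    state["overlay0"]["fb"] = "video"
    state["overlay0"]["zpos"] = 0


# Hypothetical driver that rejects every tentative state.
rejected = try_change(current, video_as_underlay, lambda s: False)
# Hypothetical driver that accepts every tentative state.
accepted = try_change(current, video_as_underlay, lambda s: True)
print(rejected is current, accepted["overlay0"])
```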

One surface could also be assigned to multiple KMS planes for different
CRTCs, so there should be no 1:1 association in the first place.

It may help with reducing the amount of atomic tests. Assuming that the same DRM
plane provides the same format/color management/transformation support
regardless of its zpos,

I would definitely expect so.

we should be able to reasonably expect that changing
its z-ordering will not cause atomic_test failures (or at least, expect fewer
causes for failure). In other words, swapping the render plane from the PRIMARY
to an OVERLAY might have more causes for an atomic_test fail, versus changing
their z-ordering. The driver might have to do more things under-the-hood to
provide this consistent behavior, but I think that's the right place for it.
After all, drivers should know more about their hardware's behavior.

Indeed.

The assumption that the PRIMARY has the lowest zpos isn't always true. I
was made aware that the imx8mq platform places all of their OVERLAYS beneath the
PRIMARY. Granted, the KMS code for enabling OVERLAYS is not upstream yet, but it
is available from this thread:
https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2319898
I guess this is more of a bad assumption that should be fixed in libliftoff.

Weston needs fixing too, at least in case a renderer FB is used on the
CRTC. Weston has two problems: renderer FB is always on PRIMARY plane,
and renderer FB is always completely opaque.

Thanks,
pq

IOW, if the KMS client understands zpos and can do a proper KMS
configuration search, and all planes have zpos property, then there is
no need to look at the plane type at all. That is the goal of the
universal planes feature.

The optimal configuration with DCN hardware is using underlays.  E.g.,
the desktop plane would be at the top and would have holes cut out of
it for videos or windows that want their own plane.  If you do it the
other way around, there are lots of limitations.
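
A toy illustration of the hole-cutting idea, assuming one row of pixels per plane and using None for a fully transparent pixel (composite() is a hypothetical helper, not how DCN or any driver blends):

```python
def composite(planes_bottom_to_top):
    """Blend one row of pixels; the topmost non-transparent pixel wins.

    Each plane is a list of pixels, and None marks a fully transparent
    pixel (a "hole").
    """
    width = len(planes_bottom_to_top[0])
    out = ["background"] * width
    for plane in planes_bottom_to_top:
        for x, pixel in enumerate(plane):
            if pixel is not None:
                out[x] = pixel
    return out


video = ["video"] * 6
# The desktop plane is on top, with a hole cut out where the video shows.
desktop = ["desk", "desk", None, None, "desk", "desk"]

print(composite([video, desktop]))
```

The hole only works if the desktop plane carries transparency, which is why a renderer FB that is always completely opaque gets in the way of underlays.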

Alex

Right, patch 1/2 tries to work around one of these limitations (cursor-on-yuv).
Others have mentioned we can do the same for scaling.

Thanks,
Leo