Re: [PATCH v2] drm/amd/display: Fix two cursor duplication when using overlay

On 2021-08-24 9:59 a.m., Simon Ser wrote:
Hi Rodrigo!

Thanks a lot for your reply! Comments below, please bear with me: I'm
a bit familiar with the cursor issues, but my knowledge of AMD hw is
still severely lacking.

On Wednesday, August 18th, 2021 at 15:18, Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx> wrote:

On 08/18, Simon Ser wrote:
Hm. This patch causes a regression for me. I was using primary + overlay
not covering the whole primary plane + cursor before. This patch breaks it.

Which branch are you using? Recently, I reverted part of that patch,
see:

   Revert "drm/amd/display: Fix overlay validation by considering cursors"

Right. This revert actually makes things worse. Prior to the revert the
overlay could be enabled without the cursor. With the revert the overlay
cannot be enabled at all, even if the cursor is disabled.

This patch makes the overlay plane very useless for me, because the primary
plane is always under the overlay plane.

I'm curious about your use case with overlay planes. Could you help me
to understand it better? If possible, describe:

1. Context and scenario
2. Compositor
3. Kernel version
4. The IGT test that covers your use case, if you know of one

I'm investigating overlay issues in our driver, and a userspace
perspective might help me.

I'm working on gamescope [1], Valve's gaming compositor. Our use-cases include
displaying (from bottom to top) a game in the background, a notification popup
over it in the overlay plane, and a cursor in the cursor plane. All of the
planes might be rotated. The game's buffer might be scaled and might not cover
the whole CRTC.

libliftoff [2] is used to provide vendor-agnostic KMS plane offload. In other
words, I'd prefer to avoid relying too much on hardware specific details, e.g.
I'd prefer to avoid hole-punching via an underlay (it might work on AMD hw, but
will fail on many other drivers).

Hi Simon,

Siqueira explained a bit below, but the problem is that we don't have dedicated cursor planes in hardware.

It's easiest to understand the hardware cursor as being constrained within the DRM plane specifications. Each DRM plane maps to 1 (or 2) hardware pipes and the cursor has to be drawn along with it. The cursor will inherit the scale, bounds, and color management associated with the underlying pipes.

From the kernel display driver perspective that makes things quite difficult with the existing DRM API - we can only really guarantee you get HW cursor when the framebuffer covers the entire screen and it is unscaled or matches the scaling expected by the user.

Hole punching generally satisfies both of these since it's a transparent framebuffer that covers the entire screen.

The case that's slightly more complicated is when the overlay doesn't cover the entire screen but the primary plane does. We can still enable the cursor if the primary plane and overlay have a matching scale and color management - our display hardware can draw the cursor on multiple pipes. (Note: this statement only applies for DCN2.1+)

If the overlay plane does not cover the entire screen and the scale or the color management differs then we cannot enable the HW cursor plane. As you mouse over the bounds of the overlay you will see the cursor drawn differently on the primary and overlay pipe.

If the overlay plane and primary plane do not cover the entire screen then you will lose HW cursor outside of the union of their bounds.
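
The cases above can be summarized in a small model. This is only an illustrative sketch of the rules as stated in this mail, not amdgpu code: the names `Rect`, `Plane`, and `hw_cursor_supported` are hypothetical, scaling is reduced to a single factor per plane, and "matches the scaling expected by the user" is simplified to "unscaled".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def covers(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (self.x <= other.x and self.y <= other.y
                and self.x + self.w >= other.x + other.w
                and self.y + self.h >= other.y + other.h)


@dataclass(frozen=True)
class Plane:
    dst: Rect         # destination rectangle on the CRTC
    scale: float      # hypothetical single scale factor (1.0 = unscaled)
    color_mgmt: str   # opaque label for the color management setup


def hw_cursor_supported(crtc: Rect, primary: Plane, overlay: Plane) -> bool:
    """Model of the cases described above (assumption: overlay is on top)."""
    if overlay.dst.covers(crtc):
        # Hole-punching style setup: the cursor is only ever drawn on the
        # full-screen overlay pipe, so only that pipe's scale matters.
        return overlay.scale == 1.0
    if primary.dst.covers(crtc):
        # The cursor crosses both pipes, so both must render it identically.
        return (primary.scale == overlay.scale
                and primary.color_mgmt == overlay.color_mgmt)
    # Neither plane covers the screen: the cursor is lost outside the
    # union of their bounds.
    return False
```

For example, a full-screen primary plane with a scaled overlay that covers only part of the screen yields `False` in this model, while the same layout with a matching scale and color management yields `True`.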

Correct me if I'm wrong, but I think your usecase [1] falls under the category where:
1. Primary plane covers entire screen
2. Overlay plane does not cover the entire screen
3. Overlay plane is scaled

This isn't a supported configuration because HW cursor cannot be drawn in the same position on both pipes.

I think you can see a similar usecase to [1] on Windows, but the difference is that the cursor is drawn on the "primary plane" instead of on top of the primary and overlay. I don't remember if DRM has a requirement that the cursor plane must be topmost, but we can't enable [1] as long as it is.

I don't know if you have more usecases in mind than [1], but just as some general recommendations I think you should only really use overlays when they fall under one of two categories:

1. You want to save power:

You will burn additional power for the overlay pipe.

But you will save power in use cases like video playback - where the decoder produces the framebuffer and we can avoid a shader composited copy with its associated GFX engine overhead and memory traffic.

2. You want more performance:

You will burn additional power for the overlay pipe.

On bandwidth-constrained systems you can save significant memory bandwidth by skipping shader composition and scanning out game or other application buffers directly.

Your usecase [1] falls under this category, but as an aside I discourage trying to design usecases where the compositor requires the overlay for functional purposes.

Best regards,
Nicholas Kazlauskas


I'm usually using the latest kernel (at the time of writing, v5.13.10), but I
often test with drm-tip or agd5f's amd-staging-drm-next, especially when
working on amdgpu patches.

My primary hardware of interest is RDNA 2 based (the upcoming Steam Deck), but
of course it's better if gamescope can run on a wide range of hardware.

I don't know if there's an IGT covering my use-case.

[1]: https://github.com/Plagman/gamescope
[2]: https://github.com/emersion/libliftoff

Basically, we cannot draw the cursor at the same size and position on
two separate pipes, since that uses extra bandwidth and DML only runs
with one cursor.

I don't understand this limitation. Why would it be necessary to draw the
cursor on two separate pipes? Isn't it only necessary to draw it once on
the overlay pipe, and not draw it on the primary pipe?

I will try to provide some background. Harry and Nick, feel free to
correct me or add extra information.

In the amdgpu driver and from the DRM perspective, we expose cursors as
a plane, but we don't have a real plane dedicated to cursors from the
hardware perspective. We have part of our HUBP handling cursors (see
commit "drm/amd/display: Add DCN3.1 DCHHUB" for a hardware block
overview), which requires a different way to deal with the cursor plane
since cursors are not planes in the hardware.

What are DCHUBBUB and MMHUBBUB responsible for? Is one handling the primary
plane and the other handling the overlay plane? Or something else entirely?

As a result, we have some
limitations, such as not supporting multiple cursors with overlay; to
support this, we need to deal with these aspects:

Hm, but I don't want to draw multiple cursors. I want to draw a single
cursor. If all planes are enabled, can't we paint the cursor only on the
overlay plane and not paint the cursor on the primary plane?

Or maybe it's impossible to draw the cursor on the overlay plane outside
of the overlay plane bounds?

I'm also confused by the commit message in "drm/amd/display: Fix two cursor
duplication when using overlay", because an overlay which doesn't cover the
whole CRTC used to work perfectly fine, even with the cursor plane enabled.

1. We need to make multiple cursors match in the same position and size.
Again, keep in mind that cursors are handled in the HUBP, which is the
first part of our pipe, and it is not a plane.

2. FWIU, our Display Mode Library (DML) has gaps in its multiple-cursor
support, which can lead to bandwidth problems such as underflow. Part of
these limitations came from DCN1.0; newer ASICs can probably support
multiple cursors without issues.

Additionally, we fully support a strategy named underlay, which inverts
the logic around the overlay. The idea is to put the DE in the overlay
plane covering the entire screen and the other fb in the primary plane
behind the overlay (DE); this can be useful for video playback
scenarios.
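
As a rough illustration of what the underlay strategy asks of the compositor: the full-screen DE buffer needs a transparent region where the fb behind it should show through. The sketch below is not driver or compositor code; `punch_hole` is a hypothetical helper, and it assumes the DE buffer is a list of rows of RGBA tuples with an alpha channel.

```python
def punch_hole(fb, x, y, w, h):
    """Make the rectangle (x, y, w, h) fully transparent, in place.

    `fb` is a list of rows, each row a list of (r, g, b, a) tuples.
    The rectangle is clipped to the buffer bounds.
    """
    height = len(fb)
    width = len(fb[0]) if fb else 0
    for row in range(max(y, 0), min(y + h, height)):
        for col in range(max(x, 0), min(x + w, width)):
            # Transparent black, so the plane underneath shows through.
            fb[row][col] = (0, 0, 0, 0)
    return fb
```

The compositor would have to redo this whenever the underlying content moves or resizes.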

Yeah, as I said above this requires knowing a lot about the target hardware,
which is a bit unfortunate. This requires hole-punching and significantly
changes the composition logic.




