On 2021-06-07 2:19 p.m., Sean Paul wrote: > On Tue, May 18, 2021 at 2:58 PM Rodrigo Siqueira > <Rodrigo.Siqueira@xxxxxxx> wrote: >> >> On 05/14, Mark Yacoub wrote: >>> On Fri, May 14, 2021 at 12:31 PM Mark Yacoub <markyacoub@xxxxxxxxxx> wrote: >>>> >>>> On Fri, May 14, 2021 at 11:28 AM Harry Wentland <harry.wentland@xxxxxxx> wrote: >>>>> >>>>> On 2021-05-14 7:47 a.m., Rodrigo Siqueira wrote: >>>>>> A few weeks ago, we saw a two cursor issue in a ChromeOS system. We >>>>>> fixed it in the commit: >>>>>> >>>>>> drm/amd/display: Fix two cursor duplication when using overlay >>>>>> (read the commit message for more details) >>>>>> >>>>>> After this change, we noticed that some IGT subtests related to >>>>>> kms_plane and kms_plane_scaling started to fail. After investigating >>>>>> this issue, we noticed that all subtests that fail have a primary plane >>>>>> covering the overlay plane, which is currently rejected by amdgpu dm. >>>>>> Fail those IGT tests highlight that our verification was too broad and >>>>>> compromises the overlay usage in our drive. This patch fixes this issue >>> nit: s/drive/driver? >>>>>> by ensuring that we only reject commits where the primary plane is not >>>>>> fully covered by the overlay when the cursor hardware is enabled. With >>>>>> this fix, all IGT tests start to pass again, which means our overlay >>>>>> support works as expected. >>>>>> >>>>>> Cc: Tianci.Yin <tianci.yin@xxxxxxx> >>>>>> Cc: Harry Wentland <harry.wentland@xxxxxxx> >>>>>> Cc: Nicholas Choi <nicholas.choi@xxxxxxx> >>>>>> Cc: Bhawanpreet Lakha <bhawanpreet.lakha@xxxxxxx> >>>>>> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@xxxxxxx> >>>>>> Cc: Mark Yacoub <markyacoub@xxxxxxxxxx> >>>>>> Cc: Daniel Wheeler <daniel.wheeler@xxxxxxx> >>>>>> >>>>>> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx> >>>>>> --- >>>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +++++++++- >>>>>> 1 file changed, 9 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>> index ccd67003b120..9c2537a17a7b 100644 >>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>> @@ -10067,7 +10067,7 @@ static int validate_overlay(struct drm_atomic_state *state) >>>>>> int i; >>>>>> struct drm_plane *plane; >>>>>> struct drm_plane_state *old_plane_state, *new_plane_state; >>>>>> - struct drm_plane_state *primary_state, *overlay_state = NULL; >>>>>> + struct drm_plane_state *primary_state, *cursor_state, *overlay_state = NULL; >>>>>> >>>>>> /* Check if primary plane is contained inside overlay */ >>>>>> for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) { >>>>>> @@ -10097,6 +10097,14 @@ static int validate_overlay(struct drm_atomic_state *state) >>>>>> if (!primary_state->crtc) >>>>>> return 0; >>>>>> >>>>>> + /* check if cursor plane is enabled */ >>>>>> + cursor_state = drm_atomic_get_plane_state(state, overlay_state->crtc->cursor); >>>>>> + if (IS_ERR(cursor_state)) >>>>>> + return PTR_ERR(cursor_state); >>>>>> + >>>>>> + if (drm_atomic_plane_disabling(plane->state, cursor_state)) >>>>>> + return 0; >>>>>> + >>>>> >>>>> I thought this breaks Chrome's compositor since it can't handle an atomic commit rejection >>>>> based solely on whether a cursor is enabled or not. >>> For context: To use overlays (the old/current async way), Chrome tests >>> if an overlay strategy will work by doing an atomic commit with a TEST >>> flag to check if the combination would work. If it works, it flags >>> this planes configuration as valid. As it's valid, it performs an >>> atomic page flip (which could also include the cursor). >>> So to Harry's point, you would pass an atomic test but fail on an >>> atomic page flip with the exact same configuration that's been flagged >>> as valid. Failing a pageflip causes Chrome to crash (Reset the GPU >>> process cause something went horribly wrong when it shouldn't). >> >> Hi Mark and Sean, >> >> I don't know if I fully comprehended the scenario which in my patch >> might cause a ChromeOS crash, but this is what I understood: >> > > Chrome compositor only does an atomic test when the layout geometry > changes (ie: plane is added/removed/resized/moved). This does not take > into consideration the cursor. Once the atomic test has been validated > for a given layout geometry (set of planes/fbs along with their sizes > and locations), Chrome assumes this will continue to be valid. > > So the situation I'm envisioning is if the cursor is hidden, an > overlay-based strategy will pass in the kernel. If at any time the > cursor becomes visible, the kernel will start failing commits which > causes Chrome's GPU process to crash. > > In general I'm uneasy with using the cursor in the atomic check since > it feels like it could be racy independent of the issue above. I > haven't dove into the helper code enough to prove it, just a > spidey-sense. > Isn't the cursor supposed to be just another plane? If so, shouldn't adding/removing the cursor trigger an atomic test? Is Chrome's compositor doing cursor update through legacy IOCTLs or through the ATOMIC IOCTL? Thanks, Harry > Sean > >> There is a chance that we pass the atomic check >> (DRM_MODE_ATOMIC_TEST_ONLY - TEST) but fails in the atomic page flip >> because, after use TEST, the compositor might slightly change the commit >> config by adding the cursor. The reason behind that came from the >> assumption that adds the cursor in the atomic commit config after the >> test is harmless. Is my understand correct? Please, correct me if I'm >> wrong. >> >> If the above reasoning is correct, I think the compositor should not >> assume anything extra from what it got from the atomic check. >> >> Best Regards, >> Siqueira >> >>>>> >>>>> Harry >>>>> >>>>>> /* Perform the bounds check to ensure the overlay plane covers the primary */ >>>>>> if (primary_state->crtc_x < overlay_state->crtc_x || >>>>>> primary_state->crtc_y < overlay_state->crtc_y || >>>>>> >>>>> >> >> -- >> Rodrigo Siqueira >> https://siqueira.tech/> _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx