Re: [RFC PATCH] drm/amd/display: fix bandwidth validation failure on DCN 2.1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/29/23 11:25, Melissa Wen wrote:
IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of
the test execution, during atomic check, because DC rejects the
bandwidth state for a fb sizing 64x64. The test was previously working
with the deprecated dc_commit_state(). Now using
dc_validate_with_context() approach, the atomic check needs to perform a
full state validation. Therefore, set fast_validation to false in the
dc_validate_global_state call for atomic check.

Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams")
Signed-off-by: Melissa Wen <mwen@xxxxxxxxxx>
---

Hi,

It's a long story. I was inspecting this bug report:
- https://gitlab.freedesktop.org/drm/amd/-/issues/2016
and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy`
mentioned there wasn't even being executed on a laptop with DCN 2.1
(HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at
the beginning due to an atomic check rejection, as below:

Starting subtest: crtc-lut-accuracy
(amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530:
(amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0
(amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument
(amd_color:14772) igt_kms-CRITICAL: error: -22 != 0
Stack trace:
   #0 ../lib/igt_core.c:1989 __igt_fail_assert()
   #1 [igt_display_commit_atomic+0x44]
   #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395()
   #3 ../tests/amdgpu/amd_color.c:395 main()
   #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main()
   #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34()
   #6 [_start+0x21]
Subtest crtc-lut-accuracy failed.

Checking dmesg, we can see that a bandwidth validation failure causes
the atomic check rejection:

[  711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation.
[  711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13)
[  711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8
[  711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22

I also noticed that the atomic check doesn't fail if I change the fb
width and height used in the test from 64 to 66, and I can get the test
execution back (and with success). However, I recall that all test cases
of IGT `amd_color` were working in the past, so I bisected and found the
first bad commit:

b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams

Bringing the `dc_commit_state` machinery back also prevents the
bandwidth validation failure, but the commit above says
dc_commit_streams validation is more complete than dc_commit_state, so I
discarded this approach.

After some debugging and code inspection, I found out that avoiding fast
validation on dc_validate_global_state during atomic check solves the
issue, but I'm not sure if this change may affect performance. I
compared exec time of some IGT tests and didn't see any differences, but
I recognize it doesn't provide enough evidence.

What do you think about this change? Should I examine other things? Do
you see any potential issue that I should investigate? Could you
recommend a better approach to assess any side-effect of not enabling
fast validation in the atomic check?

Please, let me know your thoughts.

We shouldn't be doing fast updates when lock_and_validation_needed is
true, so this seems to be correct.

Which is to say, applied, thanks!

Cc: stable@xxxxxxxxxxxxxxx


Happy New Year!

Melissa

  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 2845c884398e..4f51a7ad7a3c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10745,7 +10745,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
  			DRM_DEBUG_DRIVER("drm_dp_mst_atomic_check() failed\n");
  			goto fail;
  		}
-		status = dc_validate_global_state(dc, dm_state->context, true);
+		status = dc_validate_global_state(dc, dm_state->context, false);
  		if (status != DC_OK) {
  			DRM_DEBUG_DRIVER("DC global validation failure: %s (%d)",
  				       dc_status_to_str(status), status);
--
Hamza





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux