On 2022-12-14 16:06, Alex Deucher wrote:
On Wed, Dec 14, 2022 at 5:56 PM Alex Hung <alex.hung@xxxxxxx> wrote:
On 2022-12-14 15:35, Alex Deucher wrote:
On Wed, Dec 14, 2022 at 5:25 PM Alex Hung <alex.hung@xxxxxxx> wrote:
On 2022-12-14 14:54, Alex Deucher wrote:
On Wed, Dec 14, 2022 at 4:50 PM Alex Hung <alex.hung@xxxxxxx> wrote:
On 2022-12-14 13:48, Alex Deucher wrote:
On Wed, Dec 14, 2022 at 3:22 PM Aurabindo Pillai
<aurabindo.pillai@xxxxxxx> wrote:
From: Alex Hung <alex.hung@xxxxxxx>
[Why]
When running IGT kms_bw test with DP monitor, some systems crash from
msleep no matter how long or short the time is.
[How]
To replace msleep with mdelay.
Can you provide a bit more info about the crash? A lot of platforms
don't support delay larger than 2-4ms so this change will generate
errors on ARM and possibly other platforms.
Alex
The msleep was introduced in eec3303de3378 for non-compliant display
port monitors but IGT's kms_bw test can cause a recent h/w to hang at
msleep(60) when calling "igt_remove_fb" in IGT
(https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Figt-gpu-tools%2F-%2Fblob%2Fmaster%2Ftests%2Fkms_bw.c%23L197&data=05%7C01%7Calex.hung%40amd.com%7C81664450189542a7bbc408dade27d0e9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638066559844526853%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=M%2BF4H2qddXItPoUZRVyhlc5N8UF1%2FHIh8PpfT%2BCmdZ4%3D&reserved=0)
It is possible to workaround this by reversing order of
igt_remove_fb(&buffer[i]), as the following example:
igt_create_color_fb with the order buffer[0], buffer[1], buffer[2]
Hangs:
igt_remove_fb with the order buffer[0], buffer[1], buffer[2]
No hangs:
igt_remove_fb with the reversed order buffer[2], buffer[1], buffer[0]
However, IGT simply exposes the problem and it makes more sense to stop
the hang from occurring.
I also tried to remove the msleep completely and it also work, but I
didn't want to break the fix for the original problematic hardware
configuration.
Why does sleep vs delay make a difference? Is there some race that we
are not locking against?
Alex
That was my original thought but I did not find any previously. I will
investigate it again.
If mdelay(>4) isn't usable on other platforms, is it an option to use
mdelay on x86_64 only and keep msleep on other platforms or just remove
the msleep for other platforms, something like
- msleep(60);
+#ifdef CONFIG_X86_64
+ mdelay(60);
+#endif
That's pretty ugly. I'd rather try and resolve the root cause. How
important is the IGT test? What does it do? Is the test itself
correct?
Agreed, and I didn't want to add conditions around the mdelay for the
same reason. I will assume this is not an option now.
As in the previous comment, IGT can be modified to avoid the crash by
reversing the order fb is removed - though I suspect I will receive
questions why this is not fixed in kernel.
I wanted to fix this in kernel because nothing stops other user-space
applications to use the same way to crash kernel, so fixing IGT is the
second option.
Apparently causing problems on other platforms isn't an option at all so
I will try to figure out an non-mdelay solution, and then maybe an IGT
solution instead.
What hangs? The test or the kernel or the hardware?
The system becomes completely unresponsive - no keyboard, mouse nor
remote accesses.
Alex
Alex
Acked-by: Aurabindo Pillai <aurabindo.pillai@xxxxxxx>
Signed-off-by: Alex Hung <alex.hung@xxxxxxx>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx>
---
drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 913a1fe6b3da..e6251ccadb70 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1215,7 +1215,7 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx)
* After output is idle pattern some sinks need time to recognize the stream
* has changed or they enter protection state and hang.
*/
- msleep(60);
+ mdelay(60);
} else if (pipe_ctx->stream->signal == SIGNAL_TYPE_EDP) {
if (!link->dc->config.edp_no_power_sequencing) {
/*
--
2.39.0