Re: [PATCH] drm/i915/selftests: Try to recover from a wedged GPU during reset tests

Quoting Tahvanainen, Jari (2017-09-19 15:24:22)
> -----Original Message-----
> From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx] 
> Sent: Tuesday, September 19, 2017 5:19 PM
> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Tahvanainen, Jari <jari.tahvanainen@xxxxxxxxx>; Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH] drm/i915/selftests: Try to recover from a wedged GPU during reset tests
> 
> Quoting Chris Wilson (2017-09-15 14:09:29)
> > If we see the seqno stop progressing, we abandon the test for fear 
> > that the GPU died following the reset. However, during test teardown 
> > we still wait for the GPU to idle before continuing, but we have 
> > already confirmed that the GPU is dead. Furthermore, since we are 
> > inside a reset test, we have disabled the hangchecker, and so there is 
> > no safety net and we wait indefinitely. Detect the stuck GPU and 
> > declare it wedged as a state of emergency so we can escape.
> > 
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Jari Tahvanainen <jari.tahvanainen@xxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> 
> > Ping?
> 
> Sorry for the late answer, Chris. I tried to get in touch with you earlier through IRC.
> I merged the series on top of drm-tip and ran it on HSW: no hang anymore, the subtest now ends in FAIL.
> 
> (drv_selftest:6304) igt-kmod-CRITICAL: Test assertion failure function igt_kselftest_execute, file igt_kmod.c:513:
> (drv_selftest:6304) igt-kmod-CRITICAL: Failed assertion: err == 0
> (drv_selftest:6304) igt-kmod-CRITICAL: kselftest "i915 igt__19__live_hangcheck=1 live_selftests=-1" failed: Input/output error [5]
> (drv_selftest:6304) igt-core-INFO: Stack trace:
> (drv_selftest:6304) igt-core-INFO:   #0 [__igt_fail_assert+0x101]
> (drv_selftest:6304) igt-core-INFO:   #1 [igt_kselftest_execute+0x296]
> (drv_selftest:6304) igt-core-INFO:   #2 [igt_kselftests+0x295]
> (drv_selftest:6304) igt-core-INFO:   #3 [main+0x5f]
> (drv_selftest:6304) igt-core-INFO:   #4 [__libc_start_main+0xf1]
> (drv_selftest:6304) igt-core-INFO:   #5 [_start+0x2a]
> (drv_selftest:6304) igt-core-INFO:   #6 [<unknown>+0x2a]
> ****  END  ****
> Stack trace:
>   #0 [__igt_fail_assert+0x101]
>   #1 [igt_kselftest_execute+0x296]
>   #2 [igt_kselftests+0x295]
>   #3 [main+0x5f]
>   #4 [__libc_start_main+0xf1]
>   #5 [_start+0x2a]
>   #6 [<unknown>+0x2a]
> Subtest live_hangcheck: FAIL (1.911s)

That's what it is meant to do: stop the failure from freezing the machine.
I'll take that as a
Tested-by: Jari Tahvanainen <jari.tahvanainen@xxxxxxxxx>
-Chris
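
For readers following the thread, the escape hatch described in the commit
message can be sketched in user space as below. This is only an illustration
of the idea (bounded wait, then declare the device wedged), not the patch
itself: struct fake_gpu, fake_gpu_tick() and wait_for_idle_or_wedge() are
hypothetical names, the poll limit is arbitrary, and returning -EIO is an
assumption chosen to mirror the "Input/output error [5]" in the log above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a GPU; not the i915 data structures. */
struct fake_gpu {
	uint32_t seqno;   /* last completed sequence number */
	uint32_t target;  /* seqno at which all submitted work is done */
	bool alive;       /* false models a GPU that died after the reset */
	bool wedged;      /* set once we give up on the device */
};

/* One polling step: a live GPU retires one request, a dead one does nothing. */
static void fake_gpu_tick(struct fake_gpu *gpu)
{
	if (gpu->alive && gpu->seqno < gpu->target)
		gpu->seqno++;
}

/*
 * Wait for the GPU to idle, but only tolerate a bounded number of polls
 * without seqno progress. With no hangcheck running there is no other
 * safety net, so on a stall we mark the device wedged and report an error
 * instead of waiting forever.
 */
static int wait_for_idle_or_wedge(struct fake_gpu *gpu,
				  unsigned int max_stalled_polls)
{
	unsigned int stalled = 0;

	assert(gpu);

	while (gpu->seqno < gpu->target) {
		uint32_t before = gpu->seqno;

		fake_gpu_tick(gpu);
		if (gpu->seqno != before) {
			stalled = 0; /* progress was made, keep waiting */
			continue;
		}

		if (++stalled >= max_stalled_polls) {
			gpu->wedged = true; /* state of emergency: escape */
			return -EIO;
		}
	}

	return 0;
}
```

In this model a live GPU drains its work and the wait returns 0, while a dead
one trips the stall limit, gets marked wedged, and the caller sees an error
rather than a hang. That matches the behaviour reported above: the subtest
fails, but the machine stays usable.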



