Quoting Tvrtko Ursulin (2018-02-19 09:57:20)
>
> On 19/02/2018 09:27, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-19 09:19:47)
> >>
> >> Do you have a link to BSW hang? Is that obviously related to PMU?
> >
> > It's only occurring in this test, just looks like an issue with the
> > spinner:
> >
> > [bsw] https://intel-gfx-ci.01.org/tree/drm-tip/kasan_2/fi-bsw-n3050/igt@perf_pmu@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> ...
> <0>[ 681.022677] perf_pmu-1516 1..s1 282520414us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 681.022838] perf_pmu-1516 1..s1 282520580us : execlists_submission_tasklet: bcs0 cs-irq head=5 [5?], tail=0 [0?]
> <0>[ 681.023001] perf_pmu-1516 1..s1 282520594us : execlists_submission_tasklet: bcs0 csb[0]: status=0x00000001:0x00000000, active=0x1
> <0>[ 681.023168] kworker/-338 1.... 298087910us : reset_common_ring: bcs0 seqno=a
> <0>[ 681.023321] ksoftirq-17 1..s. 298088483us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 681.023482] ksoftirq-17 1..s. 298088575us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=1 [1]
> <0>[ 681.023644] ksoftirq-17 1..s. 298088579us : execlists_submission_tasklet: bcs0 csb[1]: status=0x00000018:0x00000003, active=0x1
> <0>[ 681.023811] ksoftirq-17 1..s. 298088581us : execlists_submission_tasklet: bcs0 out[0]: ctx=3.1, seqno=a
>
> Everything stops.
>
> > [kbl] https://intel-gfx-ci.01.org/tree/drm-tip/kasan_2/fi-kbl-7560u/igt@perf_pmu@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> ...
> <0>[ 506.745332] perf_pmu-1544 3..s1 107905835us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 506.745397] <idle>-0 2..s1 107905980us : execlists_submission_tasklet: bcs0 cs-irq head=2 [1?], tail=3 [3?]
> <0>[ 506.745440] <idle>-0 2..s1 107905983us : execlists_submission_tasklet: bcs0 csb[3]: status=0x00000001:0x00000000, active=0x1
> <0>[ 506.745498] kworker/-30 3.... 120840583us : reset_common_ring: bcs0 seqno=a
> <0>[ 506.745547] ksoftirq-29 3..s. 120840688us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 506.745598] in:imklo-499 2..s1 120840710us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=1 [1]
> <0>[ 506.745637] in:imklo-499 2..s1 120840712us : execlists_submission_tasklet: bcs0 csb[1]: status=0x00000018:0x00000003, active=0x1
> <0>[ 506.745676] in:imklo-499 2..s1 120840713us : execlists_submission_tasklet: bcs0 out[0]: ctx=3.1, seqno=a
>
> Everything stops here.
>
> I have no idea what's happening here. In both cases I would expect the test
> to have exited after the GPU hang (or at least attempt to exit!), since it
> would detect it overran the timeout.
>
> Could it be stuck in gem_sync after the reset? Or somewhere else?

I think it's that we will be throwing the calibration off if it hangs.
If busy_ns = 10s, won't that generate a target idle time of 500s?
-Chris
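
For reference, the arithmetic behind that last question can be sketched as below. This is only an illustration, not the test's actual code: the function name is hypothetical, and the 2% busy target is an assumption inferred from the roughly 50x ratio between the 10s and 500s figures above. It assumes the idle time is chosen so that the measured busy time makes up a fixed percentage of each busy/idle cycle.

```python
NSEC_PER_SEC = 1_000_000_000

# Assumption: the test aims for a 2% busy duty cycle. This value is not
# stated in the thread; it is inferred from the 10s -> ~500s figures.
ASSUMED_BUSY_PCT = 2


def target_idle_ns(busy_ns: int, busy_pct: float) -> int:
    """Idle time needed so that busy_ns is busy_pct percent of one cycle.

    Hypothetical helper: busy / (busy + idle) == busy_pct / 100.
    """
    if not 0 < busy_pct < 100:
        raise ValueError("busy_pct must be strictly between 0 and 100")
    return round(busy_ns * (100 - busy_pct) / busy_pct)


# A short, healthy calibration sample: 2ms busy gives 98ms idle.
normal_idle = target_idle_ns(2_000_000, ASSUMED_BUSY_PCT)

# A sample inflated by a GPU hang: 10s busy gives 490s idle.
hung_idle = target_idle_ns(10 * NSEC_PER_SEC, ASSUMED_BUSY_PCT)

print(normal_idle / NSEC_PER_SEC, "s idle after a normal sample")
print(hung_idle / NSEC_PER_SEC, "s idle after a hung sample")
```

Under these assumptions a single hung spinner during calibration stretches the idle target to several minutes, which would look like the test never exiting.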