Quoting Tvrtko Ursulin (2018-11-19 19:18:52)
>
> On 19/11/2018 16:18, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-11-19 15:33:56)
> >>
> >> On 19/11/2018 15:28, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-11-19 15:22:28)
> >>>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>>>
> >>>> Add some nop instructions between recursive batch buffer start calls to
> >>>> give system some breathing room. Without these, especially when coupled
> >>>> with memory pressure, false GPU hangs can be observed caused by the
> >>>> inability of the chip to cope.
> >>>
> >>> Doesn't seem to be required. And the machines most susceptible to timer
> >>> errors due to busyspin have not show the issue.
> >>
> >> With the memory pressure subtest, the second patch in this series, it
> >> was make it or break it to have the nops. Without them it was GPU hangs
> >> all around, and with them so far all clean.
> >
> > First machine bsw, just applying patch 2/2,
> >
> > IGT-Version: 1.23-gb6b8d829 (x86_64) (Linux: 4.20.0-rc2+ x86_64)
> > Using Execlists submission
> > Ring size: 131 batches
> > Starting subtest: wide-all
> > wide: 420 cycles: 24121.034us
> > Subtest wide-all: SUCCESS (47.060s)
> > Starting subtest: wide-contexts
> > wide: 340 cycles: 24893.896us
> > Subtest wide-contexts: SUCCESS (22.265s)
> > Starting subtest: wide-contexts-mempressure
> > wide: 232 cycles: 25153.899us
> > Subtest wide-contexts-mempressure: SUCCESS (23.141s)
> >
> > :|
>
> Yes, I think what I had before I cleaned up the test case was more
> copy&paste of the memory pressure thread from gem_syslatency - including
> the rtprio and multithreadedness. So I was possibly starving the
> tasklets and who knows what not, as well as applying memory pressure.
> However, fact still is adding nops to the spinner made even that monster
> pass repeatedly. I'll play with it more tomorrow.

I am a bit nervous about using the noops to avoid the issue, as I
presume that there is a more realistic workload that could generate
similar system latencies, i.e. that there exists a pathological case
that users will hit for similar stalls (gem shrinker perchance).
-Chris