On 19/11/2018 16:18, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-11-19 15:33:56)
On 19/11/2018 15:28, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-11-19 15:22:28)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Add some nop instructions between recursive batch buffer start calls to
give the system some breathing room. Without these, especially when
coupled with memory pressure, false GPU hangs can be observed, caused by
the inability of the chip to cope.
Doesn't seem to be required. And the machines most susceptible to timer
errors due to busyspin have not shown the issue.
With the memory pressure subtest, the second patch in this series, having
the nops was make or break. Without them it was GPU hangs all around, and
with them so far all clean.
First machine bsw, just applying patch 2/2,
IGT-Version: 1.23-gb6b8d829 (x86_64) (Linux: 4.20.0-rc2+ x86_64)
Using Execlists submission
Ring size: 131 batches
Starting subtest: wide-all
wide: 420 cycles: 24121.034us
Subtest wide-all: SUCCESS (47.060s)
Starting subtest: wide-contexts
wide: 340 cycles: 24893.896us
Subtest wide-contexts: SUCCESS (22.265s)
Starting subtest: wide-contexts-mempressure
wide: 232 cycles: 25153.899us
Subtest wide-contexts-mempressure: SUCCESS (23.141s)
:|
Yes, I think what I had before I cleaned up the test case was more of a
copy&paste of the memory pressure thread from gem_syslatency, including
the rtprio and multithreadedness. So I was possibly starving the
tasklets and who knows what else, as well as applying memory pressure.
However, the fact remains that adding nops to the spinner made even that
monster pass repeatedly. I'll play with it more tomorrow.
Regards,
Tvrtko
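
For readers following the thread, here is a rough sketch of what "nops between recursive batch buffer start calls" could look like. It is not the code from the patch. The helper name emit_spinner, its parameters and the amount of padding are all hypothetical, and the MI_BATCH_BUFFER_START form shown is the gen8+ one (48-bit address, bit 8 selecting the PPGTT address space). The command encodings follow the usual i915/IGT definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* MI command encodings as commonly defined in i915/IGT (opcode in bits 28:23). */
#define MI_INSTR(opcode, flags) (((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP                 MI_INSTR(0x00, 0)
#define MI_BATCH_BUFFER_END     MI_INSTR(0x0a, 0)
#define MI_BATCH_BUFFER_START   MI_INSTR(0x31, 0)

/*
 * Hypothetical helper (not the IGT API): fill cs[] with a self-referencing
 * spinner batch. `nops` MI_NOOP dwords are emitted first, so the command
 * streamer executes them on every lap before jumping back to batch_addr,
 * which is assumed to be the GPU address of cs[0].
 *
 * Returns the number of dwords written, or 0 if the buffer is too small.
 */
static size_t emit_spinner(uint32_t *cs, size_t max_dwords,
                           uint64_t batch_addr, unsigned int nops)
{
    size_t n = 0;
    unsigned int i;

    /* nops + 3 dwords of BATCH_BUFFER_START + 1 dword of BATCH_BUFFER_END */
    if ((size_t)nops + 4 > max_dwords)
        return 0;

    /* Padding: the "breathing room" between consecutive jumps. */
    for (i = 0; i < nops; i++)
        cs[n++] = MI_NOOP;

    /* gen8+ form: length field 1, bit 8 selects the PPGTT address space. */
    cs[n++] = MI_BATCH_BUFFER_START | 1 << 8 | 1;
    cs[n++] = (uint32_t)batch_addr;         /* address, low 32 bits */
    cs[n++] = (uint32_t)(batch_addr >> 32); /* address, high bits */

    /* Not reached while spinning; present so the batch is well terminated. */
    cs[n++] = MI_BATCH_BUFFER_END;

    return n;
}
```

With nops == 0 this degenerates to the tight loop that the original commit message argues against. One way to stop such a spinner from the CPU, assuming the buffer is mapped coherently, is to overwrite cs[0] with MI_BATCH_BUFFER_END.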