Quoting Rodrigo Vivi (2017-12-19 20:49:54)
> On Tue, Dec 19, 2017 at 01:14:19PM +0000, Chris Wilson wrote:
> > Useful for verifying our bookkeeper when we encounter is knowing whether
> > we think the engine is idle at the time of the GPU hang.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=104305
>
> Here you mention the hang as "false positive"...
> if it is a false positive and we have this idle information
> shouldn't we handle this differently instead of throwing the error
> information and resetting the GPU?

I have contemplated skipping the reset if we think the GPU is idle, but
that assumes both that we have perfect knowledge and that skipping the
reset is a good thing. (Though we already differentiate between resets
to restore hw state and resets to fix a GPU hang, so maybe it's not so
bad, the caveat being an explicit request to reset the GPU.)

In this case, a cursory glance said the engine should be idle (RING_MODE
has the idle bit, RING_HEAD == RING_TAIL and the last seqno was
completed) and I wanted to confirm that the driver also thought the
engine should have been idle. That would leave the question as to why
hangcheck thought differently, i.e. I'm trying to narrow the cause to a
particular piece of code.
-Chris