Quoting Chang, Bruce (2020-06-10 05:25:39)
> On 6/9/2020 8:17 AM, Chris Wilson wrote:
> > In commit 5ba32c7be81e ("drm/i915/execlists: Always force a context
> > reload when rewinding RING_TAIL"), we placed the check for rewinding a
> > context on actually submitting the next request in that context. This
> > was so that we only had to check once, and could do so with precision
> > avoiding as many forced restores as possible. For example, to ensure
> > that we can resubmit the same request a couple of times, we include a
> > small wa_tail such that on the next submission, the ring->tail will
> > appear to move forwards when resubmitting the same request. This is very
> > common as it will happen for every lite-restore to fill the second port
> > after a context switch.
> >
> > However, intel_ring_direction() is limited in precision to movements of
> > up to half the ring size. The consequence being that if we tried to
> > unwind many requests, we could exceed half the ring and flip the sense
> > of the direction, so missing a force restore. As no request can be
> > greater than half the ring (i.e. 2048 bytes in the smallest case), we
> > can check for rollback incrementally. As we check against the tail that
> > would be submitted, we do not lose any sensitivity and allow lite
> > restores for the simple case. We still need to double check upon
> > submitting the context, to allow for multiple preemptions and
> > resubmissions.
> >
> > Fixes: 5ba32c7be81e ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx> # v5.4+
>
> Verified this has fixed the issue regarding the GPU hang with incomplete
> error state.

But it does not entirely...
tgl b0 still has the issue where a lite restore is processed while it is
doing an [implicit] semaphore wait and, at just the wrong time, it dies
(or something that looks suspiciously like that). That can be reproduced
without any preemption rollback, so I suspect a placebo effect.
-Chris
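
For readers following the quoted commit message, the half-ring precision limit can be illustrated with a small standalone sketch. This is not the i915 code: ring_direction() and any_rewind() are hypothetical helpers, and the 4096-byte ring and the tail values are assumptions chosen only to show the sign flipping once a rewind exceeds half the ring, and why checking step by step avoids it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption, not the kernel's implementation):
 * sign of the shortest move from prev to next in a power-of-two ring.
 * Returns 1 for forwards, -1 for backwards, 0 for no movement.
 * A move of half the ring or more is read as backwards.
 */
static int ring_direction(uint32_t size, uint32_t next, uint32_t prev)
{
    assert(size && (size & (size - 1)) == 0); /* power of two only */

    uint32_t delta = (next - prev) & (size - 1);

    if (delta == 0)
        return 0;
    return delta < size / 2 ? 1 : -1;
}

/*
 * Check a sequence of tails one step at a time. Each step is assumed to
 * be shorter than half the ring, so the sign of every step is reliable
 * even when the total rewind is not.
 */
static bool any_rewind(uint32_t size, const uint32_t *tails, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (ring_direction(size, tails[i], tails[i - 1]) < 0)
            return true;
    return false;
}
```

With a 4096-byte ring, rewinding from tail 3000 to 2000 is seen as backwards, but rewinding from 3000 straight to 0 is misread as a forward move of 1096 bytes. Walking the same rewind as 3000, 1500, 0 catches it on the first step.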