Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
>> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > If the request->wa_tail is 0 (because it landed exactly on the end of
>> > the ringbuffer), when we reconstruct request->tail following a reset we
>> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
>> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
>> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
>> > to catch the busy-looping as the ACTHD and seqno continually advance.
>>
>> Tail is past ring size (on hw) and the ring contents has seqno writes.
>> So we will replay the ring contents over and over and seqno advances
>> and wraps back to the first breadcrumbs in ring?
>
> Yup. It was most confusing to watch. The execlist_port[] was static,
> RING_START was static, yet the seqno kept changing. I felt like I was
> hallucinating. That or insomnia. /o\

When we reset_common_ring() it is always after a hw reset. So
the 'last' in sense of hardware's lrc contexts doesn't mean much.

So can we actually get rid of the tail trickery as for first request
after reset, as the lite restore can't happen and should not matter?

-Mika

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
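
For reference, the arithmetic behind the illegal tail value in the quoted
report can be sketched in a few lines of C. This is only an illustration
under stated assumptions, not the driver's code: the function names, the
ring size and the 8-byte adjustment are hypothetical stand-ins, and the
mask is assumed from the 0x001ffff8 value quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* All names and constants here are illustrative, not the driver's own. */
#define RING_SIZE      (32u * 4096u) /* assumed ring size in bytes, power of two */
#define WA_TAIL_BYTES  8u            /* the 8 bytes implied by "-8" in the report */
#define TAIL_ADDR_MASK 0x001ffff8u   /* assumed tail register mask, from the report */

/* Naive reconstruction: unsigned subtraction, then only the register mask. */
static uint32_t tail_naive(uint32_t wa_tail)
{
	return (wa_tail - WA_TAIL_BYTES) & TAIL_ADDR_MASK;
}

/* Wrapped reconstruction: reduce modulo the ring size so the tail stays in the ring. */
static uint32_t tail_wrapped(uint32_t wa_tail)
{
	/* The mask trick below is only valid for power-of-two ring sizes. */
	assert((RING_SIZE & (RING_SIZE - 1u)) == 0u);
	return (wa_tail - WA_TAIL_BYTES) & (RING_SIZE - 1u);
}
```

With wa_tail == 0 the naive form underflows to 0xfffffff8 and the mask
leaves 0x001ffff8, which lies beyond the assumed ring size, so a head
pointer that wraps at the ring size could never reach it. The wrapped form
gives RING_SIZE - 8 instead. For any wa_tail of 8 or more the two forms
agree.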