On Sat, 1 Dec 2012 21:35:21 +0100, Daniel Vetter <daniel at ffwll.ch> wrote: > On Sat, Dec 01, 2012 at 05:48:50PM +0000, Chris Wilson wrote: > > Before queuing the flip but crucially after attaching the unpin-work to > > the crtc, we continue to setup the unpin-work. However, should the > > hardware fire early, we see the connected unpin-work and queue the task. > > The task then promptly runs and unpins the fb before we finish taking > > the required references or even pinning it... Havoc. > > > > To close the race, we use the flip-pending atomic to indicate when the > > flip is finally setup and enqueued. So during the flip-done processing, > > we can check more accurately whether the flip was expected. > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> > > Hm, can't this logic race? > > - emit the MI_FLIP > > - flip irq happens because the gpu is idle and completes it right away > (or our thread is preempted), work->pending increments from 0 -> 1 > > - queue_flip sets work->pending to 1 -> write RING_TAIL, flush the commands to CS, begin execution of MI_FLIP I'm not happy with the explanation, but I could reliably (100%) hit the race whilst loading a 2+GiB image using eog under compiz on an 965gm with only 2GIB of ram. As soon as it hit kswapd, the system would OOPS with an unpin leak. Which means that was a flip pending/done prior to the pinning + MI_FLIP. This patch adds a strong defence against that spurious flip done, but doesn't explain where it came from. -Chris -- Chris Wilson, Intel Open Source Technology Centre