On Mon, Sep 08, 2014 at 09:15:50AM +0100, Chris Wilson wrote:
> On Mon, Sep 08, 2014 at 10:03:51AM +0200, Daniel Vetter wrote:
> > On Sun, Sep 07, 2014 at 09:08:31AM +0100, Chris Wilson wrote:
> > > Running igt, I was encountering the invalid TLB bug on my 845g, despite
> > > that it was using the CS workaround. Examining the w/a buffer in the
> > > error state, showed that the copy from the user batch into the
> > > workaround itself was suffering from the invalid TLB bug (the first
> > > cacheline was broken with the first two words reversed). Time to try a
> > > fresh approach. This extends the workaround to write into each page of
> > > our scratch buffer in order to overflow the TLB and evict the invalid
> > > entries. This could be refined to only do so after we update the GTT,
> > > but for simplicity, we do it before each batch.
> > >
> > > I suspect this supersedes our current workaround, but for safety keep
> > > doing both.
> >
> > I suspect that we might end up with just an elaborate delay
> > implementation, but if it works then it's good. One nitpick below, with
> > that addressed this is Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
>
> One way to test that is simply comparing 64x4096 byte writes in the same
> page vs 64x4 byte writes in 64 different pages. That should be roughly
> the same latency (though with TLB fetches you can never be too sure) and
> demonstrate that it is either the TLB or the delay that's the factor.

Quick update: I wrote 256k into one page (instead of a 4-byte write into
each of 64 pages), hopefully testing the delay theory, and found that it
did not prevent the corruption/hang. Since a comparable amount of write
traffic confined to a single page does not help, a pure delay looks
unlikely to be what makes the workaround effective.

Now trying to refine the estimate on the number of TLBs.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
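
For readers following the thread, the two access patterns being compared
(many writes confined to one page versus one small write in each of 64
pages) can be sketched on the CPU as below. This is only an illustration
of the patterns, not the i915 workaround itself, which emits its writes
from the command streamer. The names SCRATCH_PAGE_SIZE, NUM_PAGES,
write_same_page(), write_each_page() and count_touched_pages() are
hypothetical, and the 4096-byte page size and 64-page count are assumed
from the numbers quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define NUM_PAGES 64u           /* assumption: page count quoted in the thread */

/* Pattern A: 64 passes of 4096 bytes, every pass landing in page 0.
 * Moves 256 KiB in total but only ever touches a single page. */
static size_t write_same_page(uint8_t *buf)
{
	size_t bytes = 0;

	assert(buf != NULL);
	for (unsigned int pass = 0; pass < NUM_PAGES; pass++) {
		/* pass + 1 keeps the fill value non-zero */
		memset(buf, (int)(pass + 1), SCRATCH_PAGE_SIZE);
		bytes += SCRATCH_PAGE_SIZE;
	}
	return bytes;
}

/* Pattern B: one 4-byte write at the start of each of the 64 pages.
 * Moves only 256 bytes but touches every page once. */
static size_t write_each_page(uint8_t *buf)
{
	const uint32_t marker = 0xdeadbeefu; /* every byte is non-zero */
	size_t bytes = 0;

	assert(buf != NULL);
	for (unsigned int page = 0; page < NUM_PAGES; page++) {
		memcpy(buf + (size_t)page * SCRATCH_PAGE_SIZE, &marker,
		       sizeof(marker));
		bytes += sizeof(marker);
	}
	return bytes;
}

/* Count pages whose first byte is non-zero; expects a buffer that was
 * zero-filled before one of the patterns above was applied. */
static unsigned int count_touched_pages(const uint8_t *buf)
{
	unsigned int touched = 0;

	assert(buf != NULL);
	for (unsigned int page = 0; page < NUM_PAGES; page++)
		if (buf[(size_t)page * SCRATCH_PAGE_SIZE] != 0)
			touched++;
	return touched;
}
```

Applied to a zeroed buffer of NUM_PAGES * SCRATCH_PAGE_SIZE bytes,
pattern A moves far more data yet touches one page, while pattern B
moves very little and touches all 64. That separation is what lets the
experiment distinguish a delay effect from a TLB eviction effect.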