On 06/10/2015 08:26 AM, Imre Deak wrote: > On ke, 2015-06-10 at 08:10 -0700, Jesse Barnes wrote: >> On 06/10/2015 03:59 AM, Imre Deak wrote: >>> I think the discussion here is about two separate things: >>> 1. Possible ordering issue between the seqno store and the completion >>> interrupt >>> 2. Coherency issue that leaves the CPU with a stale view of the seqno >>> indefinitely, which this patch works around >>> >>> I'm confident that in my case the problem is not due to ordering. If it >>> was "only" ordering then the value would show up eventually. This is not >>> the case though, __wait_for_request will see the stale value >>> indefinitely even though it gets woken up periodically afterwards by the >>> lost IRQ logic (with hangcheck disabled). >> >> Yeah, based on your workaround it sounds like the write from the CS is >> landing in memory but failing to invalidate the associated CPU >> cacheline. I assume mapping the HWSP as uncached also works around this >> issue? > > I assume it would, but it would of course have a bigger overhead. Based > on my testing the coherency problem happens only occasionally, so for > the rest of the times we still would want to benefit from cached reads. > See especially __i915_spin_request(). Yeah, pretty sure we want it cached given how often we read from it. I was just curious if the UC mapping would address this just to narrow things down even further. >> I wonder if this is just an issue with GGTT mappings on BXT. If we had >> per context HSWPs using PPGTT (and maybe even 48 bit PPGTT) mappings, >> the snoop may be performed correctly... Looks like we don't have a >> store_dword variant for explicit coherent or incoherent buffer writes >> (though it does test PPGTT writes at least); that would make this easy >> to test. > > Well, I tried to play around with the GGTT PTE/PAT flags to see if we're > not setting something or there is a bit that is significant despite the > specification. The spec says it's only the snoop flag that matters and > everything maps to PAT index 0. That looks correct in our code. In any > case I would expect that the MI_STORE_DATA_IMM would be coherent since > this is explicitly stated in the spec. > > I also added a new testcase to gem_storedw_batches_loop that tests the > same thing via PPGTT mappings, it fails the same way. Ok interesting, so it sounds like a more general problem than just the GGTT. Thanks, Jesse _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx