Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> Quoting Mika Kuoppala (2018-12-05 13:46:12)
>>  static void nop_submission_tasklet(unsigned long data)
>> @@ -1015,6 +1025,19 @@ static void process_csb(struct intel_engine_cs *engine)
>>          } while (head != tail);
>>
>>          execlists->csb_head = head;
>> +
>> +        /*
>> +         * Gen11 has proven to fail wrt global observation point between
>> +         * entry and tail update, failing on the ordering and thus
>> +         * we see an old entry in the context status buffer.
>> +         *
>> +         * Forcibly evict out entries for the next gpu csb update,
>> +         * to increase the odds that we get a fresh entries with non
>> +         * working hardware. The cost for doing so comes out mostly with
>> +         * the wash as hardware, working or not, will need to do the
>> +         * invalidation before.
>> +         */
>> +        invalidate_csb_entries(&buf[0], &buf[GEN8_CSB_ENTRIES - 1]);
>
> If it works, this is a stroke of genius.
>
> If we hypothesize that the GPU did write the CSB entries before the head
> pointer and inserted a Global Observation point beforehand, then we
> theorize that they merely forgot the cc protocol: the writes to system
> memory are correct, but arrive unordered in the cpu cache.
>
> By using the clflush to evict our used cacheline, on the next pass we
> will pull that CSB entry cacheline back in from memory (ordered by
> the rmb used for the ringbuffer) and so, if the HW engineers'
> insistence that they did remember their wmb holds, the CSB entries will
> be coherent with the head pointer.
>
> So we remove one piece of the puzzle at what should be negligible cost,
> Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

Thank you for the review and kind words, pushed.
-Mika
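
For readers following the reasoning above, here is a minimal userspace sketch of the eviction idea: flush the cache lines backing a range of already-consumed status entries so that the next read has to fetch them from memory again. It is not the i915 helper. The quoted hunk does not show the body of invalidate_csb_entries(), so the name evict_csb_entries(), the 64-byte CACHELINE_BYTES and the return value are assumptions made for illustration. On builds without SSE2 the sketch degrades to a plain fence and evicts nothing.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_clflush(), _mm_mfence() */
#endif

/* Assumed cache line size; 64 bytes is typical on x86 but not guaranteed. */
#define CACHELINE_BYTES 64u

/*
 * Hypothetical helper: evict every cache line that overlaps the range
 * [first, last] and return how many lines were covered.
 * The data itself is untouched; only the cached copy is dropped, so the
 * next load is served from memory.
 */
static inline unsigned int
evict_csb_entries(const uint32_t *first, const uint32_t *last)
{
	const uintptr_t mask = ~(uintptr_t)(CACHELINE_BYTES - 1);
	uintptr_t line = (uintptr_t)first & mask;
	const uintptr_t end = (uintptr_t)last & mask;
	unsigned int count = 0;

	for (; line <= end; line += CACHELINE_BYTES) {
#if defined(__SSE2__)
		_mm_clflush((const void *)line);
#endif
		count++;
	}

#if defined(__SSE2__)
	/* Order the flushes against the loads that follow. */
	_mm_mfence();
#else
	/* No clflush available: fall back to a full fence, no eviction. */
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
	return count;
}
```

A caller would mirror the patch by passing the first and last entry of its buffer after updating the cached head, for example evict_csb_entries(&buf[0], &buf[n - 1]) with n standing in for the entry count. Flushing every line in the range, rather than only the two end points, is a choice made here so the sketch stays correct for buffers of any size.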