Re: [PATCH 02/21] drm/i915/gtt: Workaround for HW preload not flushing pdps

On 13/08/15 12:42, Dave Gordon wrote:
On 13/08/15 11:12, Michel Thierry wrote:
On 8/13/2015 5:08 PM, Zhiyuan Lv wrote:
Hi Michel,

Thanks for the reply!

I have yet another question: right now mark_tlbs_dirty() is called
if any level of the PPGTT table is changed. But for EXECLIST context
submission, we only need LRI commands if the L3 PDP root pointers
change, right? Thanks!

mark_tlbs_dirty is not only for execlists mode; we re-used it since it
was already there.

The update is only required when a PDP is allocated.

-Michel

Doesn't that depend on whether the context is running? The LRI reload
has the side effect of flushing all current knowledge of mappings, so
every level of PD gets refreshed from memory.
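
To make the dirty-tracking idea in this exchange concrete, here is a minimal sketch, not the driver's actual code. It assumes a per-PPGTT bitmask with one bit per ring: allocating a PDP marks every ring dirty, and the next submission on a ring consumes its bit and would emit the LRI reload at that point. All names (toy_ppgtt, toy_mark_tlbs_dirty, toy_needs_pdp_reload, NUM_RINGS) are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_RINGS      5u                       /* hypothetical ring count */
#define ALL_RINGS_MASK ((1u << NUM_RINGS) - 1u)

/* Hypothetical stand-in for the per-PPGTT state. */
struct toy_ppgtt {
	uint32_t pd_dirty_rings;   /* bit n set: ring n must reload its PDPs */
};

/* Called when a new PDP is allocated: every ring must reload. */
static void toy_mark_tlbs_dirty(struct toy_ppgtt *ppgtt)
{
	ppgtt->pd_dirty_rings = ALL_RINGS_MASK;
}

/*
 * Called at submission time for one ring. Returns true if the PDP
 * reload (the LRIs) should be emitted, and clears that ring's bit so
 * the reload happens only once per update.
 */
static bool toy_needs_pdp_reload(struct toy_ppgtt *ppgtt, unsigned int ring_id)
{
	uint32_t bit;

	assert(ring_id < NUM_RINGS);
	bit = 1u << ring_id;

	if (!(ppgtt->pd_dirty_rings & bit))
		return false;

	ppgtt->pd_dirty_rings &= ~bit;
	return true;
}
```

In this model a ring that never had a PDP allocated under it skips the LRIs entirely, which is the behaviour Zhiyuan is asking about; whether that is safe for lower-level PD changes is the open question in the rest of this mail.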

If we're not updating the top level PDPs, and we know the context isn't
active, then we *assume* that lower-level PDs will be refreshed when the
context is next loaded. (This hasn't been true on all hardware: some
chips cached previously-retrieved PDs across ctx save-and-reload, which
is one reason why there's a "Force PD Restore" bit, but we've been told
not to use it on current h/w.) AFAICT, current chips don't cache
previous PDs and don't need the "Force" bit for this case.

OTOH, if we don't know whether the context is running, then we can't be
sure when (or whether) any PD updates will be seen. As long as the
changes of interest are only ever *from* NULL *to* non-NULL, we *expect*
it to work, because (we *assume*) the GPU won't cache negative results
from PD lookups; so any lookup that previously hit an invalid mapping
will be re-fetched next time it's required (and may now be good).
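
The caching assumption in the paragraph above can be illustrated with a toy model. This is only a sketch of the assumed behaviour, not a description of real hardware: the walker caches valid PDEs only, so a lookup that hit an invalid (zero) entry goes back to memory next time. The types and functions (toy_walker, toy_lookup, toy_invalidate) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PDES 4u

/* Hypothetical page walker with a tiny PDE cache; 0 means "invalid". */
struct toy_walker {
	const uint64_t *pd;                 /* page directory in "memory" */
	uint64_t cached[TOY_NUM_PDES];
	bool cached_valid[TOY_NUM_PDES];
};

static void toy_walker_init(struct toy_walker *w, const uint64_t *pd)
{
	memset(w, 0, sizeof(*w));
	w->pd = pd;
}

/* Look up one PDE, caching positive results only (the assumption). */
static uint64_t toy_lookup(struct toy_walker *w, unsigned int idx)
{
	uint64_t pde;

	assert(idx < TOY_NUM_PDES);
	if (w->cached_valid[idx])
		return w->cached[idx];

	pde = w->pd[idx];
	if (pde != 0) {
		w->cached[idx] = pde;
		w->cached_valid[idx] = true;
	}
	return pde;
}

/* Models a TLB/PD invalidation: drop everything that was cached. */
static void toy_invalidate(struct toy_walker *w)
{
	memset(w->cached_valid, 0, sizeof(w->cached_valid));
}
```

Under this model a NULL to non-NULL change is picked up on the next lookup without any flush, while a change to an entry that has already been cached stays invisible until toy_invalidate() runs. If the hardware did cache negative results, the first half of that would no longer hold.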

If we don't reload the PDPs with LRIs, then perhaps to be safe we need
to inject some other instruction that will just force a re-fetch of the
lower-level PDs from memory, without altering any top-level PDPs? In
conjunction with preallocating the top-level entries, that ought to
guarantee that the updates would be seen just before the point where
they're about to be used?

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

I found the following comment in the BSpec:

"Pre-loading of Page Directory Entries (PD load) for 32b legacy mode is not supported from Gen9 onwards. PD entries are loaded on demand when there is a miss in the PDE cache of the corresponding page walker. Any new page additions by the driver are transparent to the HW, and the new page translations will be fetched on demand. However, any removal of the pages by the driver should initiate a TLB invalidation to remove the stale entries."

So, I think that confirms that we should inject some form of TLB invalidation into the ring before the next batch uses any updated PDs. Presumably an MI_FLUSH_DW with TLB_INVALIDATE would do?
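
For reference, a rough sketch of what such a command could look like when written into a ring, using the 4-dword gen8 form of MI_FLUSH_DW. The bit definitions below are as I recall them from i915_reg.h and should be checked against the tree before relying on them; the helper toy_emit_flush_dw_invalidate() and its parameters are hypothetical and only model writing dwords into an array. As far as I remember, the TLB invalidate bit is only honoured together with a post-sync operation, hence the dummy store to a scratch address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodings as recalled from i915_reg.h; verify against the tree. */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (flags))
#define MI_FLUSH_DW              MI_INSTR(0x26, 1)
#define MI_INVALIDATE_TLB        (1u << 18)
#define MI_FLUSH_DW_OP_STOREDW   (1u << 14)
#define MI_FLUSH_DW_USE_GTT      (1u << 2)

/*
 * Hypothetical helper: write a 4-dword MI_FLUSH_DW with TLB invalidation
 * at ring[tail] and return the new tail. scratch_addr is an assumed
 * qword-aligned GGTT address used as the post-sync write target.
 */
static size_t toy_emit_flush_dw_invalidate(uint32_t *ring, size_t tail,
					   size_t size, uint32_t scratch_addr)
{
	assert(tail + 4 <= size);
	assert((scratch_addr & 7u) == 0);

	/* "+ 1" extends the length field for the 64-bit address form. */
	ring[tail++] = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW |
		       MI_INVALIDATE_TLB;
	ring[tail++] = scratch_addr | MI_FLUSH_DW_USE_GTT;
	ring[tail++] = 0;   /* upper 32 bits of the address */
	ring[tail++] = 0;   /* value stored by the post-sync write */
	return tail;
}
```

This would have to land in the ring after the PD update is visible in memory and before the batch that depends on it; the sketch does not address that ordering.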

.Dave.