On Thu, Aug 13, 2015 at 01:03:30PM +0100, Dave Gordon wrote:
> On 13/08/15 12:42, Dave Gordon wrote:
> >On 13/08/15 11:12, Michel Thierry wrote:
> >>On 8/13/2015 5:08 PM, Zhiyuan Lv wrote:
> >>>Hi Michel,
> >>>
> >>>Thanks for the reply!
> >>>
> >>>I yet have another question: right now the mark_tlb_dirty() will be
> >>>called if any level of PPGTT table is changed. But for the EXECLIST
> >>>context submission, we only need LRI commands if there are L3 PDP root
> >>>pointer changes right? Thanks!
> >>
> >>mark_tlbs_dirty is not only for execlists mode, we re-used it since it
> >>was already there.
> >>
> >>The update is only required when a PDP is allocated.
> >>
> >>-Michel
> >
> >Doesn't that depend on whether the context is running? The LRI reload
> >has the side effect of flushing all current knowledge of mappings, so
> >every level of PD gets refreshed from memory.
> >
> >If we're not updating the top level PDPs, and we know the context isn't
> >active, then we *assume* that lower-level PDs will be refreshed when the
> >context is next loaded. (This hasn't been true on all hardware, some of
> >which cached previously-retrieved PDs across ctx save-and-reload, and
> >that's one reason why there's a "Force PD Restore" bit, but we've been
> >told not to use it on current h/w). AFAICT, current chips don't cache
> >previous PDs and don't need the "Force" bit for this case.
> >
> >OTOH, if we don't know whether the context is running, then we can't be
> >sure when (or whether) any PD updates will be seen. As long as the
> >changes of interest are only ever *from* NULL *to* non-NULL, we *expect*
> >it to work, because (we *assume*) the GPU won't cache negative results
> >from PD lookups; so any lookup that previously hit an invalid mapping
> >will be re-fetched next time it's required (and may now be good).
> >
> >If we don't reload the PDPs with LRIs, then perhaps to be safe we need
> >to inject some other instruction that will just force a re-fetch of the
> >lower-level PDs from memory, without altering any top-level PDPs? In
> >conjunction with preallocating the top-level entries, that ought to
> >guarantee that the updates would be seen just before the point where
> >they're about to be used?
> >
> >.Dave.
> >_______________________________________________
> >Intel-gfx mailing list
> >Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> I found the following comment in the BSpec:
>
> "Pre-loading of Page Directory Entries (PD load) for 32b legacy mode
> is not supported from Gen9 onwards. PD entries are loaded on demand
> when there is a miss in the PDE cache of the corresponding page
> walker. Any new page additions by the driver are transparent to the
> HW, and the new page translations will be fetched on demand.
> However, any removal of the pages by the driver should initiate a
> TLB invalidation to remove the stale entries."
>
> So, I think that confirms that we should inject some form of TLB
> invalidation into the ring before the next batch uses any updated
> PDs. Presumably an MI_FLUSH_DW with TLB_INVALIDATE would do?

Hi Dave and Michel,

So the conclusion is still the same: that for 32b legacy mode,
emit_pdps() is only needed for PDP changes. Other level page table
changes can be handled by TLB_INVALIDATE with ring buffer commands. Is
that correct? Thanks!

Regards,
-Zhiyuan

>
> .Dave.
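
To make the question concrete, here is a rough sketch of the rule I am
asking about, written as standalone C rather than driver code. All of
the names (pt_change, sync_action, pick_sync_action) are made up for
illustration and do not exist in i915. It only models the decision,
assuming the answer to the question above is "yes"; it does not emit
any real commands.

```c
#include <stdbool.h>

/* Hypothetical model of the proposal, not i915 code. */
enum pt_change {
	PT_CHANGE_NONE      = 0,
	PT_CHANGE_PDP       = 1 << 0, /* a top-level PDP root pointer changed */
	PT_CHANGE_LOWER_ADD = 1 << 1, /* lower-level entry: invalid -> valid */
	PT_CHANGE_LOWER_DEL = 1 << 2, /* lower-level entry removed */
};

enum sync_action {
	SYNC_NONE,           /* nothing to emit before the next batch */
	SYNC_TLB_INVALIDATE, /* e.g. MI_FLUSH_DW with TLB_INVALIDATE */
	SYNC_RELOAD_PDPS,    /* LRI reload of the PDPs, i.e. emit_pdps() */
};

/*
 * Pick what to put in the ring before the next batch, given a bitmask
 * of pt_change flags accumulated since the last submission.
 */
enum sync_action pick_sync_action(unsigned int changes)
{
	/* The LRI reload also refreshes every lower PD level. */
	if (changes & PT_CHANGE_PDP)
		return SYNC_RELOAD_PDPS;

	/*
	 * The BSpec text only demands an invalidation for removals;
	 * additions are treated the same way here purely to be safe,
	 * as suggested earlier in the thread.
	 */
	if (changes & (PT_CHANGE_LOWER_ADD | PT_CHANGE_LOWER_DEL))
		return SYNC_TLB_INVALIDATE;

	return SYNC_NONE;
}
```

With this model a PDP change always wins, so a submission that both
allocates a PDP and fills in lower-level entries would only need the
LRI reload.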