Quoting Chris Wilson (2019-08-23 16:10:48)
> Quoting Daniele Ceraolo Spurio (2019-08-23 16:05:45)
> >
> >
> > On 8/23/19 7:26 AM, Chris Wilson wrote:
> > > Quoting Chris Wilson (2019-08-23 08:27:25)
> > >> Quoting Daniele Ceraolo Spurio (2019-08-23 03:09:09)
> > >>> TGL has an improved CS pre-parser that can now pre-fetch commands across
> > >>> batch boundaries. This improves performances when lots of small batches
> > >>> are used, but has an impact on self-modifying code. If we want to modify
> > >>> the content of a batch from another ring/batch, we need to either
> > >>> guarantee that the memory location is updated before the pre-parser gets
> > >>> to it or we need to turn the pre-parser off around the modification.
> > >>> In i915, we use self-modifying code only for GPU relocations.
> > >>>
> > >>> The pre-parser fetches across memory synchronization commands as well,
> > >>> so the only way to guarantee that the writes land before the parser gets
> > >>> to it is to have more instructions between the sync and the destination
> > >>> than the parser FIFO depth, which is not an optimal solution.
> > >>
> > >> Well, our ABI is that memory is coherent before the breadcrumb of *each*
> > >> batch. That is a fundamental requirement for our signaling to userspace.
> > >> Please tell me that there is a context flag to turn this off, or we else
> > >> we need to emit 32x flushes or whatever it takes.
> > >
> > Are you referring to the specific case where we have a request modifying
> > an object that is then used as a batch in the next request? Because
> > coherency of objects that are not executed as batches is not impacted.
>
> "Fetches across memory sync" sounds like a major ABI break. The batches
> are a hard serialisation barrier, with memory coherency guaranteed prior
> to the signaling at the end of one batch and clear caches guaranteed at
> the start of the next.

We have relocs, oa and sseu all using self-modifying code.
I expect we will have PTE modifications and much more done via the GPU in
the near future. All rely on the CS_STALL doing exactly what it says on
the tin.
-Chris
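
The hazard described in the quoted text (a write to a later command racing a parser that has already fetched it) can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of the real hardware: the FIFO depth of 4, the command names ("store", "emit", "nop") and the helpers `run` and `make_stream` are all hypothetical and exist purely for illustration.

```python
from collections import deque


def make_stream(distance):
    """Build a toy stream: command 0 patches the command `distance` slots later.

    The target slot initially holds ("emit", "old"); the store rewrites it
    to ("emit", "new"). `distance` must be >= 1.
    """
    patched = ("emit", "new")
    return ([("store", distance, patched)]
            + [("nop",)] * (distance - 1)
            + [("emit", "old")])


def run(stream, fifo_depth):
    """Execute `stream` with a parser that prefetches up to `fifo_depth` commands.

    Commands are copied from memory into the FIFO when fetched, so a store
    that lands after its target was fetched has no effect on execution.
    `fifo_depth` must be >= 1; a depth of 1 behaves like no prefetching.
    """
    memory = list(stream)  # the batch as it sits in memory
    fifo = deque()         # commands already fetched by the parser
    fetch_ptr = 0
    emitted = []
    while True:
        # The parser runs ahead of execution until the FIFO is full.
        while len(fifo) < fifo_depth and fetch_ptr < len(memory):
            fifo.append(memory[fetch_ptr])
            fetch_ptr += 1
        if not fifo:
            break
        cmd = fifo.popleft()
        if cmd[0] == "store":
            _, target, new_cmd = cmd
            memory[target] = new_cmd  # too late if `target` is already in the FIFO
        elif cmd[0] == "emit":
            emitted.append(cmd[1])
    return emitted


if __name__ == "__main__":
    for distance in (1, 3, 4, 8):
        print(distance, run(make_stream(distance), fifo_depth=4))
```

In this model the patched command is only seen when the distance between the store and its target is at least the FIFO depth, or when prefetching is effectively off (depth 1), which mirrors the two options listed in the original patch description.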