On Wednesday, 25.04.2018 at 13:44 -0400, Alex Deucher wrote:
> On Wed, Apr 25, 2018 at 2:41 AM, Christoph Hellwig <hch at infradead.org> wrote:
> > On Wed, Apr 25, 2018 at 02:24:36AM -0400, Alex Deucher wrote:
> > > > It has a non-coherent transaction mode (which the chipset can opt
> > > > to not implement and still flush), to make sure the AGP horror
> > > > show doesn't happen again and GPU folks are happy with PCIe.
> > > > That's at least my understanding from digging around in amd the
> > > > last time we had coherency issues between intel and amd gpus.
> > > > GPUs have some bits somewhere (in the pagetables, or in the
> > > > buffer object description table created by userspace) to control
> > > > that stuff.
> > >
> > > Right.  We have a bit in the GPU page table entries that determines
> > > whether we snoop the CPU's cache or not.
> >
> > I can see how that works with the GPU on the same SOC or SOC set as
> > the CPU.  But how is that going to work for a GPU that is a plain old
> > PCIe card?  The cache snooping in that case is happening in the PCIe
> > root complex.
>
> I'm not a PCI expert, but as far as I know, the device sends either a
> snooped or non-snooped transaction on the bus.  I think the
> transaction descriptor supports a no-snoop attribute.  Our GPUs have
> supported this feature for probably 20 years, if not more, going back
> to PCI.  Non-snooped transactions have lower latency and higher
> throughput compared to snooped transactions.

PCI-X (as in the thing which was all the rage before PCIe) added an
attribute phase to each transaction, in which the requestor can enable
relaxed ordering and/or no-snoop on a per-transaction basis. As those
are strictly performance optimizations, the root complex isn't required
to honor those attributes, but implementations that care about
performance usually will.

Regards,
Lucas
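
A minimal sketch of the device-level enable being discussed, assuming a
Linux PCIe driver context: the PCIe Device Control register carries a
"No Snoop Enable" bit that gates whether the device may set the No Snoop
attribute in its TLPs at all; the per-transaction choice (e.g. via the
GPU page table bit Alex describes) is vendor-specific and not shown.
The helper name example_allow_no_snoop is made up for illustration.

    #include <linux/pci.h>

    /*
     * Illustrative only: toggle PCI_EXP_DEVCTL_NOSNOOP_EN in the PCIe
     * Device Control register.  This allows or forbids the device to
     * emit No Snoop TLPs; which transactions actually carry the
     * attribute is decided by the device itself (e.g. per GPU page
     * table entry) and is outside the scope of this sketch.
     */
    static void example_allow_no_snoop(struct pci_dev *pdev, bool allow)
    {
    	if (allow)
    		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL,
    					 PCI_EXP_DEVCTL_NOSNOOP_EN);
    	else
    		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
    					   PCI_EXP_DEVCTL_NOSNOOP_EN);
    }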