On Wed, May 09, 2018 at 03:41:44PM +0000, Stephen Bates wrote:
> Christian
>
> > Interesting point, give me a moment to check that. That finally makes
> > all the hardware I have standing around here valuable :)
>
> Yes. At the very least it provides an initial standards based path
> for P2P DMAs across RPs which is something we have discussed on this
> list in the past as being desirable.
>
> BTW I am trying to understand how an ATS capable EP function determines
> when to perform an ATS Translation Request (ATS TR). Is there an
> upstream example of the driver for your APU that uses ATS? If so, can
> you provide a pointer to it? Do you provide some type of entry in the
> submission queues for commands going to the APU to indicate if the
> address associated with a specific command should be translated using
> ATS or not? Or do you simply enable ATS and then all addresses passed
> to your APU that miss the local cache result in an ATS TR?

On GPUs, ATS is always tied to a PASID. You do not do the former without
the latter (AFAICT this is not doable, maybe through some JTAG but not
in normal operation).

GPUs are like CPUs: you have GPU threads that run against an address
space, and that address space uses a page table (very much like the CPU
page table). Inside that page table you can point a GPU virtual address
at GPU memory or at system memory. Those system memory entries can also
be marked as ATS against a given PASID.

On some GPUs you define a window of GPU virtual addresses that goes
through PASID & ATS (so accesses in that window do not go through the
page table but directly through PASID & ATS).

Jérôme
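
A minimal sketch of the lookup described above, as a toy C model and not
any real driver or hardware layout. Every name here (gpu_pte, gpu_vm,
gpu_resolve, the flat single-level table, the 4 KiB page size) is
hypothetical and chosen only for illustration. It also assumes that a
system memory entry without the ATS mark holds an address the driver has
already translated. In this model the choice between ATS and no ATS is
made per address, by the page table entry or by the window, and not by a
flag in the submitted command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: no real GPU uses this exact layout. */

#define GPU_PAGE_SHIFT 12 /* assumed 4 KiB pages */

enum gpu_path {
	GPU_PATH_FAULT,  /* no valid mapping for this address */
	GPU_PATH_VRAM,   /* entry points at GPU memory */
	GPU_PATH_SYSTEM, /* system memory, assumed pre-translated by driver */
	GPU_PATH_ATS,    /* device would issue an ATS request with the PASID */
};

struct gpu_pte {
	bool valid;
	bool system; /* false: GPU memory, true: system memory */
	bool ats;    /* only meaningful when system is true */
};

struct gpu_vm {
	uint32_t pasid;             /* PASID bound to this address space */
	uint64_t ats_window_start;  /* window that bypasses the page table */
	uint64_t ats_window_size;   /* 0 means this GPU has no such window */
	const struct gpu_pte *ptes; /* flat table, one entry per page */
	size_t n_ptes;
};

/* Decide which path an access to GPU virtual address va would take. */
static enum gpu_path gpu_resolve(const struct gpu_vm *vm, uint64_t va)
{
	assert(vm != NULL);

	/* Window case: the page table is not consulted at all. */
	if (vm->ats_window_size != 0 &&
	    va >= vm->ats_window_start &&
	    va - vm->ats_window_start < vm->ats_window_size)
		return GPU_PATH_ATS;

	uint64_t idx = va >> GPU_PAGE_SHIFT;
	if (idx >= vm->n_ptes || !vm->ptes[idx].valid)
		return GPU_PATH_FAULT;

	const struct gpu_pte *pte = &vm->ptes[idx];
	if (!pte->system)
		return GPU_PATH_VRAM;

	/* System memory: the ATS mark selects translation via vm->pasid. */
	return pte->ats ? GPU_PATH_ATS : GPU_PATH_SYSTEM;
}
```

The PASID lives in the address space and not in the entry, which mirrors
the point that ATS is never used without a PASID: any path that returns
GPU_PATH_ATS implicitly uses vm->pasid.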