On Tue, Jan 31, 2023 at 06:14:19PM -0600, Bjorn Helgaas wrote:

> > AMD GPU is one of those devices.
>
> I guess you mean the AMD GPU has ATS, PRI, and PASID Capabilities?
> And furthermore, that the GPU *always* uses Translated addresses with
> PASID?

I'm not versed in the spec lingo, but the GPU issues MemRd/Wrs with
the translated bit set and no PASID header - which is the correct form
for an address that was translated by ATS.

To get to that it issues ATS requests, and only the ATS related
requests will carry the PASID.

ATS related requests always route to the root port, which is why it is
functionally equivalent to ACS RR/UF in these cases.

Translated requests always route where they are supposed to go, even
with P2P and things.

> And this applies even if there is no ACS or ACS doesn't support
> PCI_ACS_RR and PCI_ACS_UF.
>
> The black screen happens because ... ?

AMD GPU driver bugs blow up if it cannot setup PASID.

> I couldn't figure out the NULL pointer dereference. I expected it to
> be from a BUG() or similar in report_iommu_fault(), but I don't see
> that.

IIRC it is a buggy error unwind handling in the AMD GPU driver.

Jason
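
For readers following the thread, the request flow described above can be
written down as a small model. This is only a sketch of the description in
this mail, not kernel or hardware code: the names req_kind, route,
req_carries_pasid() and req_route() are hypothetical, and the model covers
only the two request types discussed here (ATS translation requests and
already-translated memory requests), ignoring any ACS configuration.

```c
#include <stdbool.h>

/* Hypothetical names: the two request types discussed in this mail. */
enum req_kind {
	REQ_ATS_TRANSLATION,	/* ATS request asking for a translation */
	REQ_TRANSLATED_MEM,	/* MemRd/MemWr with the translated bit set */
};

/* Hypothetical names: where a request ends up. */
enum route {
	ROUTE_TO_ROOT_PORT,	/* always forwarded upstream to the root port */
	ROUTE_BY_ADDRESS,	/* routed to its target, possibly a P2P peer */
};

/* Per the description above, only ATS related requests carry the PASID. */
static bool req_carries_pasid(enum req_kind kind)
{
	return kind == REQ_ATS_TRANSLATION;
}

/*
 * ATS requests always go to the root port; translated requests go
 * wherever their (already translated) address points.
 */
static enum route req_route(enum req_kind kind)
{
	if (kind == REQ_ATS_TRANSLATION)
		return ROUTE_TO_ROOT_PORT;
	return ROUTE_BY_ADDRESS;
}
```

In this model the only request that carries a PASID is also the one that
always reaches the root port, which is the sense in which the behaviour is
described as functionally equivalent to ACS RR/UF.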