On Tue, Mar 28, 2017 at 08:18:26PM +0000, Deucher, Alexander wrote:
> > -----Original Message-----
> > From: Joerg Roedel [mailto:joro@xxxxxxxxxx]
> > Sent: Tuesday, March 28, 2017 8:17 AM
> > To: Bjorn Helgaas
> > Cc: linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Joerg Roedel;
> > Daniel Drake; Deucher, Alexander
> > Subject: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
> >
> > From: Joerg Roedel <jroedel@xxxxxxx>
> >
> > ATS is broken on these devices. Under invalidation load, the
> > GPU does not reply to invalidations anymore, causing
> > Completion-wait loop timeouts on the AMD IOMMU driver side.
> > Fix it by not enabling ATS on these devices.
> >
> > Note that below mentioned commit is not broken, it just
> > triggers the issue because it might cause invalidation
> > storms on devices.
> >
> > Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
> > Reported-by: Daniel Drake <drake@xxxxxxxxxxxx>
> > Cc: Daniel Drake <drake@xxxxxxxxxxxx>
> > Cc: Alexander Deucher <Alexander.Deucher@xxxxxxx>
> > Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
>
> Did you see Arindam's patch from yesterday[1]? Not sure which is the
> proper fix, maybe both?

Arindam's patch makes sense on its own, but not as a fix for this
issue. It lowers the invalidation load on the GPU, but there are still
ways to trigger a high invalidation rate on the device. So it might
hide the issue, but not fix it.

We need to disable ATS on the device if it doesn't work reliably.


	Joerg