Hi Robin,

On Tue, Jun 1, 2021 at 2:39 PM Robin Murphy <robin.murphy@xxxxxxx> wrote:
> >> The regression shows as a significant drop in throughput as measured
> >> with "super_netperf" [0],
> >> with measured bandwidth of ~95Gbps before and ~35Gbps after:
>
> I guess that must be the difference between using the flush queue
> vs. strict invalidation. On closer inspection, it seems to me that
> there's a subtle pre-existing bug in the AMD IOMMU driver, in that
> amd_iommu_init_dma_ops() actually runs *after* amd_iommu_init_api()
> has called bus_set_iommu(). Does the patch below work?

Thanks for the quick response & patch. I tried it out and indeed it
does solve the issue:

# uname -a
Linux zh-lab-node-3 5.13.0-rc3-amd-iommu+ #31 SMP Tue Jun 1 17:12:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
root@zh-lab-node-3:~# ./super_netperf 32 -H 172.18.0.2
95341.2

root@zh-lab-node-3:~# uname -a
Linux zh-lab-node-3 5.13.0-rc3-amd-iommu-unpatched #32 SMP Tue Jun 1 17:29:34 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
root@zh-lab-node-3:~# ./super_netperf 32 -H 172.18.0.2
33989.5