On Thu, Oct 19, 2023 at 12:58:29PM +0100, Joao Martins wrote:

> Sigh, I realized that Intel's pfn_to_dma_pte() (main lookup function for
> map/unmap/iova_to_phys) does something a little off when it finds a non-present
> PTE. It allocates a page table to it; which is not OK in this specific case (I
> would argue it's neither for iova_to_phys but well maybe I misunderstand the
> expectation of that API).

Oh :(

> AMD has no such behaviour, though that driver per your earlier suggestion might
> need to wait until -rc1 for some of the refactorings get merged. Hopefully we
> don't need to wait for the last 3 series of AMD Driver refactoring (?) to be
> done as that looks to be more SVA related; Unless there's something more
> specific you are looking for prior to introducing AMD's domain_alloc_user().

I don't think we need to wait, it just needs to go on the cleaning list.

> Anyhow, let me fix this, and post an update. Perhaps it's best I target this for
> -rc1 and have improved page-table walkers all at once [the iommufd_log_perf
> thingie below unlikely to be part of this set right away]. I have been playing
> with the AMD driver a lot more on baremetal, so I am getting confident on the
> snippet below (even with big IOVA ranges). I'm also retrying to see in-house if
> there's now a rev3.0 Intel machine that I can post results for -rc1 (last time
> in v2 I didn't; but things could have changed).

I'd rather you keep it simple and send the walkers as followups to the
driver maintainers directly.

> > for themselves; so more and more I need to work on something like
> > iommufd_log_perf tool under tools/testing that is similar to the gup_perf to make all
> > performance work obvious and 'standardized'

We have a mlx5 vfio driver in rdma-core and I have been thinking it
would be a nice basis for building an iommufd tester/benchmarker as it
has a wide set of "easily" triggered functionality.

Jason
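
To illustrate the concern raised about a lookup that allocates when it hits a non-present entry, here is a minimal userspace sketch. It is a toy two-level table, not the Intel driver's code: every `toy_*` name, the 4K page size, and the 9-bit level width are assumptions made for the example. The walker takes an explicit `alloc` flag so that only the map path can populate missing levels, while a read-only path such as an iova_to_phys-style lookup leaves the table untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical geometry: 4K pages, 512 entries per level. */
#define TOY_PAGE_SHIFT 12
#define TOY_LEVEL_BITS 9
#define TOY_ENTRIES    (1u << TOY_LEVEL_BITS)
#define TOY_PAGE_MASK  ((1ull << TOY_PAGE_SHIFT) - 1)
#define TOY_PRESENT    1ull

/* Leaf level: one 64-bit PTE per page. */
struct toy_leaf {
	uint64_t pte[TOY_ENTRIES];
};

/* Top level: pointers to leaf tables, NULL meaning "not present". */
struct toy_root {
	struct toy_leaf *leaf[TOY_ENTRIES];
	unsigned int nr_allocs;	/* leaf tables allocated by walks so far */
};

/*
 * Return a pointer to the PTE slot for @iova, or NULL.
 * A missing leaf table is only allocated when @alloc is true.
 */
uint64_t *toy_walk(struct toy_root *root, uint64_t iova, bool alloc)
{
	unsigned int hi = (iova >> (TOY_PAGE_SHIFT + TOY_LEVEL_BITS)) &
			  (TOY_ENTRIES - 1);
	unsigned int lo = (iova >> TOY_PAGE_SHIFT) & (TOY_ENTRIES - 1);

	assert(root);
	if (!root->leaf[hi]) {
		if (!alloc)
			return NULL;	/* lookup only: report "not present" */
		root->leaf[hi] = calloc(1, sizeof(*root->leaf[hi]));
		if (!root->leaf[hi])
			return NULL;	/* allocation failed */
		root->nr_allocs++;
	}
	return &root->leaf[hi]->pte[lo];
}

/* Map path: allowed to allocate intermediate tables. */
int toy_map(struct toy_root *root, uint64_t iova, uint64_t paddr)
{
	uint64_t *pte = toy_walk(root, iova, true);

	if (!pte)
		return -1;
	*pte = (paddr & ~TOY_PAGE_MASK) | TOY_PRESENT;
	return 0;
}

/* Lookup path: never allocates; returns 0 when nothing is mapped. */
uint64_t toy_iova_to_phys(struct toy_root *root, uint64_t iova)
{
	uint64_t *pte = toy_walk(root, iova, false);

	if (!pte || !(*pte & TOY_PRESENT))
		return 0;
	return (*pte & ~TOY_PAGE_MASK) | (iova & TOY_PAGE_MASK);
}
```

In this sketch, calling `toy_iova_to_phys()` on an unmapped IOVA leaves `nr_allocs` unchanged, which is the property a dirty-tracking or lookup-only walker would want. Whether the real drivers should take a flag or use a separate walker is a design choice for their maintainers.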