On Wed, Aug 14, 2024 at 09:37:15AM -0300, Jason Gunthorpe wrote:
> > Currently, only x86_64 (1G+2M) and arm64 (2M) are supported.
>
> There is definitely interest here in extending ARM to support the 1G
> size too, what is missing?

Currently PUD pfnmap relies on the THP_PUD config option:

config ARCH_SUPPORTS_PUD_PFNMAP
        def_bool y
        depends on ARCH_SUPPORTS_HUGE_PFNMAP && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD

Arm64 unfortunately doesn't yet support dax 1G, so it's not applicable yet.

Ideally, pfnmaps are much simpler than real THPs and shouldn't need to
depend on THP at all, but we'd need things like the below to land first:

https://lore.kernel.org/r/20240717220219.3743374-1-peterx@xxxxxxxxxx

I sent that a while ago, but I didn't collect enough input, and I decided
to unblock this series from it, so x86_64 shouldn't be affected, and
arm64 will at least start to have 2M.

> > The other trick is how to allow gup-fast working for such huge mappings
> > even if there's no direct sign of knowing whether it's a normal page or
> > MMIO mapping.  This series chose to keep the pte_special solution, so
> > that it reuses similar idea on setting a special bit to pfnmap PMDs/PUDs
> > so that gup-fast will be able to identify them and fail properly.
>
> Makes sense
>
> > More architectures / More page sizes
> > ------------------------------------
> >
> > Currently only x86_64 (2M+1G) and arm64 (2M) are supported.
> >
> > For example, if arm64 can start to support THP_PUD one day, the huge
> > pfnmap on 1G will be automatically enabled.
>
> Oh that sounds like a bigger step..

Just to mention, no real THP 1G is needed here for pfnmaps.  The real gap
is only the pud helpers, which so far only exist under CONFIG_THP_PUD in
huge_memory.c.

> > VFIO is so far the only consumer for the huge pfnmaps after this series
> > applied.
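For reference, the gup-fast side is conceptually the same check we already
do for ptes, just lifted to the PMD/PUD leaf level.  A rough sketch (the
function name here is illustrative, not the exact code in the series):

```
/* Sketch only: gup-fast leaf handling for a huge pfnmap.  A pfnmap
 * PMD/PUD carries the special bit, so gup-fast can tell it apart from
 * a normal THP without touching any struct page, and bail out. */
static int gup_fast_pmd_leaf(pmd_t pmd, ...)
{
	if (pmd_special(pmd))
		return 0;	/* no refcountable pages behind it; fail gup-fast */
	/* ... otherwise the normal THP path: pin the folio, etc. ... */
}
```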
> > Besides above remap_pfn_range() generic optimization, device driver
> > can also try to optimize its mmap() on a better VA alignment for
> > either PMD/PUD sizes.  This may, iiuc, normally require userspace
> > changes, as the driver doesn't normally decide the VA to map a bar.
> > But I don't think I know all the drivers to know the full picture.
>
> How does alignment work?  In most cases I'm aware of the userspace does
> not use MAP_FIXED so the expectation would be for the kernel to
> automatically select a high alignment.  I suppose your cases are
> working because qemu uses MAP_FIXED and naturally aligns the BAR
> addresses?
>
> > - x86_64 + AMD GPU
> >   - Needs Alex's modified QEMU to guarantee proper VA alignment to
> >     make sure all pages to be mapped with PUDs
>
> Oh :( So I suppose this answers above.

:) Yes, alignment is needed.

-- 
Peter Xu