On Mon, May 15, 2023 at 04:07:01PM +0000, David Laight wrote: > > From: Ruihan Li > > Sent: 15 May 2023 14:10 > > > > When hcd->localmem_pool is non-null, localmem_pool is used to allocate > > DMA memory. In this case, the dma address will be properly returned (in > > dma_handle), and dma_mmap_coherent should be used to map this memory > > into the user space. However, the current implementation uses > > pfn_remap_range, which is supposed to map normal pages. > > I've an (out of tree) driver that does the same. > Am I right in thinking that this does still work? Generally, it still works most of the time, but it can break sometimes. I am going to quote commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch"), which introduces dma_mmap_coherent in usbdev_mmap, and says [1]: On some architectures (e.g. arm64) requests for IO coherent memory may use non-cachable attributes if the relevant device isn't cache coherent. If these pages are then remapped into userspace as cacheable, they may not be coherent with the non-cacheable mappings. [1] https://lore.kernel.org/all/20200504201348.1183246-1-jeremy.linton@xxxxxxx/ I think it means that if your driver deals with devices that aren't cache-coherent on arm64, using pfn_remap_range directly may cause problems. Otherwise, you may need to check the arch-specific dma mmap operation and see if it performs additional things that pfn_remap_range does not (for the arm example, arm_iommu_mmap_attrs updates the vm_page_prot field to make the pages non-cacheable if the device is not cache-coherent [2]). [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mm/dma-mapping.c?id=f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6#n1129 > > I can't change the driver to use dma_map_coherent() because it > doesn't let me mmap from a page offset within a 16k allocation. > > In this case the memory area is an 8MB shared transfer area to an > FPGA PCIe target sparsely filled with 16kB allocation (max 512 allocs). > The discontinuous physical memory blocks appear as logically > contiguous to both the FPGA logic and when mapped to userspace. > (But not to driver code.) > > I don't really want to expose the 16k allocation size to userspace. > If we need more than 8MB then the allocation size would need > changing. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales) Thanks, Ruihan Li