On Thu, Mar 20, 2025 at 02:35:41PM +0100, Arnd Bergmann wrote: > On Thu, Mar 20, 2025, at 13:29, Mark Brown wrote: > > On a lot of systems most transfers are short and won't be DMAed at all > > since PIO ends up being more efficient, and most hardware is perfectly > > happy to DMA to/from wherever so *shrug*. SPI_BUFSIZ is a maximum of 32 > > bytes which is going to be under the copybreak limit for quite a few > > controllers, though admittedly 16 is also a popular number, so a lot of > > the time we don't actually DMA out of it at all. > I saw the same thing looked at it the other day and got confused > about why 'local_buf' is allocated with GFP_DMA and 'buf' > uses plain GFP_KERNEL when they are both used in the same place. Really we just don't expect the small buffer to be DMAed. > It also seems that the copy happens in the regmap_bulk_read() > path but not the regmap_bulk_write(), which just passes down > the original buffer without copying, as far as I can tell. Yes, writes don't need to do anything. > From what I found, there are two scenarios that may depend on > GFP_DMA today: > > a) a performance optimization where allocating from GFP_DMA avoids > the swiotlb bounce buffering. This would be the normal case on > any 64-bit machine with more than 4GB of RAM and an SPI > controller with a 32-bit DMA mask. > b) An SPI controller on a 32-bit machine without swiotlb and an > effective DMA mask that covers less than the lowmem area. > E.g. on Raspberry Pi 4, the brcm,bcm2835-spi lives on a > bus with an 1GB dma-ranges translation, but there may be more > than 1GB of lowmem with CONFIG_VMSPLIT_2G=y and CONFIG_SWIOTLB=n. > Without GFP_DMA that would just end up causing data corruption. That sounds about right. > I think we have some corner cases where a driver allocates > a GFP_DMA buffer, calls spi_write_then_read through regmap, > which copies the data to the non-GFP_DMA global buffer, > and then the SPI controller driver calls dma_map_single() > on that, ending up with a third bounce buffer from > swiotlb. I suspect that practically speaking almost everything will be under the copybreak limit. Probably we should just make regmap use spi_sync() with the supplied buffers here and avoid spi_write_then_read(). > I don't know what a good replacement interface would be, but > ideally there should never be more than one copy here, > which means that any temporary buffer would need to be > allocated according to the dma_mask of the underlying > bus master (dmaengine, spi controller, ...). Which is a pain because even if you've got the device for the SPI controller there's no way to discover if it does it's own DMA.
Attachment:
signature.asc
Description: PGP signature