On Mon, Jun 06, 2022 at 06:15:13PM +0200, Ard Biesheuvel wrote:
> On Mon, 6 Jun 2022 at 17:36, Russell King (Oracle)
> <linux@xxxxxxxxxxxxxxx> wrote:
> > On Mon, Jun 06, 2022 at 04:21:50PM +0100, Will Deacon wrote:
> > > Finally, on arm(64), the DMA mapping code tries to deal with buffers
> > > that are not cacheline aligned by issuing clean-and-invalidate
> > > operations for the overlapping portions at each end of the buffer. I
> > > don't think this makes a tonne of sense, as inevitably one of the
> > > writers (either the CPU or the DMA) is going to win and somebody is
> > > going to run into silent data loss. Having the caller receive
> > > DMA_MAPPING_ERROR in this case would probably be better.
> >
> > Sadly unavoidable - people really like passing unaligned buffers to the
> > DMA API, sometimes those buffers contain information that needs to be
> > preserved. I really wish it wasn't that way, because it would make life
> > a lot better, but it's what we've had to deal with over the years with
> > the likes of the SCSI subsystem (and e.g. it's sense buffer that was
> > embedded non-cacheline aligned into other structures that had to be
> > DMA'd to.)
>
> As discussed in the thread related to Catalin's DMA granularity
> series, this is something I think we should be addressing with bounce
> buffering for inbound DMA. This would allow us to reduce the kmalloc
> alignment as well.

It depends on the size. My plan was to do bouncing only if the size is
below ARCH_DMA_MINALIGN. For larger buffers, kmalloc() gives us
alignment to a power of two (well, other than 96 and 192) and no
bouncing needed. If some buggy driver allocates a large structure only
to hope it can do DMA into unaligned parts of it while modifying the
adjacent bytes, we should just mark it as broken, we can't fix it.
However, if the driver doesn't modify the cache line while the DMA takes
place (only expecting it to be preserved), the fix from Will to clean on
__dma_map_area() is sufficient.

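To make the size rule above concrete, here is a rough userspace sketch of
the decision, not kernel code and not taken from any posted patch. The
names sketch_dma_policy() and sketch_shares_cache_line() are hypothetical,
and SKETCH_DMA_MINALIGN is an assumed stand-in for ARCH_DMA_MINALIGN,
whose real value is architecture-specific.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; the real ARCH_DMA_MINALIGN
 * is defined per architecture. */
#define SKETCH_DMA_MINALIGN 128u

enum sketch_dma_action {
	SKETCH_DMA_DIRECT,	/* map the buffer in place */
	SKETCH_DMA_BOUNCE,	/* go through an aligned bounce buffer */
};

/* Hypothetical helper: bounce only when the buffer is smaller than
 * the minimum DMA alignment; larger buffers are mapped directly. */
static enum sketch_dma_action sketch_dma_policy(size_t size)
{
	return size < SKETCH_DMA_MINALIGN ? SKETCH_DMA_BOUNCE
					  : SKETCH_DMA_DIRECT;
}

/* Hypothetical helper: true if either end of the buffer falls in the
 * middle of a SKETCH_DMA_MINALIGN-sized line, i.e. the buffer shares
 * a line with adjacent data. */
static bool sketch_shares_cache_line(uintptr_t addr, size_t size)
{
	return (addr % SKETCH_DMA_MINALIGN) != 0 ||
	       ((addr + size) % SKETCH_DMA_MINALIGN) != 0;
}
```

With these assumptions a 64-byte buffer would be bounced, while a 4096-byte
buffer would be mapped directly. The second helper only flags the case where
an end of the buffer shares a line with neighbouring data. It cannot tell
whether the driver modifies those neighbouring bytes during the DMA.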
-- Catalin