On Fri, Mar 31, 2023, at 11:35, Russell King (Oracle) wrote: > On Fri, Mar 31, 2023 at 10:07:28AM +0100, Russell King (Oracle) wrote: >> On Mon, Mar 27, 2023 at 02:13:11PM +0200, Arnd Bergmann wrote: >> > From: Arnd Bergmann <arnd@xxxxxxxx> >> > >> > Most ARM CPUs can have write-back caches and that require >> > cache management to be done in the dma_sync_*_for_device() >> > operation. This is typically done in both writeback and >> > writethrough mode. >> > >> > The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S >> > (arm920t, arm940t) implementations are the exception here, >> > and only do the cache management after the DMA is complete, >> > in the dma_sync_*_for_cpu() operation. >> > >> > Change this for consistency with the other platforms. This >> > should have no user visible effect. >> >> NAK... >> >> The reason we do cache management _after_ is to ensure that there >> is no stale data. The kernel _has_ (at the very least in the past) >> performed DMA to data structures that are embedded within other >> data structures, resulting in cache lines being shared. If one of >> those cache lines is touched while DMA is progressing, then we >> must to cache management _after_ the DMA operation has completed. >> Doing it before is no good. What I'm trying to address here is the inconsistency between implementations. If we decide that we always want to invalidate after FROM_DEVICE, I can do that as part of the series, but then I have to change most of the other arm implementations. Right now, the only WT cache implementations that do the the invalidation after the DMA are cache-v4.S (arm720 integrator and clps711x), cache-v4wt.S (arm920/arm922 at91rm9200, clps711x, ep93xx, omap15xx, imx1 and integrator), some sparc32 leon3 and early xtensa. Most architectures that have write-through caches (m68k, microblaze) or write-back caches but no speculation (all other armv4/armv5, hexagon, openrisc, sh, most mips, later xtensa) only invalidate before DMA but not after. OTOH, most machines that are actually in use today (armv6+, powerpc, later mips, microblaze, riscv, nios2) also have to deal with speculative accesses, so they end up having to invalidate or flush both before and after a DMA_FROM_DEVICE and DMA_BIDIRECTIONAL. > It looks like the main offender of "touching cache lines shared > with DMA" has now been resolved - that was the SCSI sense buffer, > and was fixed some time ago: > > commit de25deb18016f66dcdede165d07654559bb332bc > Author: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > Date: Wed Jan 16 13:32:17 2008 +0900 > > /if/ that is the one and only case, then we're probably fine, but > having been through an era where this kind of thing was the norm > and requests to fix it did not get great responses from subsystem > maintainers, I just don't trust the kernel not to want to DMA to > overlapping cache lines. Thanks for digging that out, that is very useful. It looks like this was around the same time as 03d70617b8a7 ("powerpc: Prevent memory corruption due to cache invalidation of unaligned DMA buffer"), so it may well have been related. I know we also had more recent problems with USB drivers trying to DMA to stack, which would also cause problems on non-coherent machines, but some of these were only found after we introduced VMAP_STACK. It would be nice to use KASAN prevent reads on cache lines that have in-flight DMA. Arnd