Re: [PATCH 00/21] dma-mapping: unify support for cache flushes

Hi Arnd,

On Mon, Mar 27, 2023 at 1:14 PM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:
>
> From: Arnd Bergmann <arnd@xxxxxxxx>
>
> After a long discussion about adding SoC specific semantics for when
> to flush caches in drivers/soc/ drivers that we determined to be
> fundamentally flawed[1], I volunteered to try to move that logic into
> architecture-independent code and make all existing architectures do
> the same thing.
>
> As we had determined earlier, the behavior is wildly different across
> architectures, but most of the differences come down to either bugs
> (when required flushes are missing) or extra flushes that are harmless
> but might hurt performance.
>
> I finally found the time to come up with an implementation of this, which
> starts by replacing every outlier with one of the three common options:
>
>  1. architectures without speculative prefetching (hexagon, m68k,
>     openrisc, sh, sparc, and certain armv4 and xtensa implementations)
>     only flush their caches before a DMA, by cleaning write-back caches
>     (if any) before a DMA to the device, and by invalidating the caches
>     before a DMA from a device
>
>  2. arc, microblaze, mips, nios2, sh and later xtensa now follow the
>     normal 32-bit arm model and invalidate their writeback caches
>     again after a DMA from the device, to remove stale cache lines
>     that got prefetched during the DMA. arc, csky and mips used to
>     invalidate buffers also before the bidirectional DMA, but this
>     is now skipped whenever we know it gets invalidated again
>     after the DMA.
>
>  3. parisc, powerpc and riscv already flushed buffers before
>     a DMA_FROM_DEVICE, and these get moved to the arm64 behavior
>     that does the writeback before and invalidate after both
>     DMA_FROM_DEVICE and DMA_BIDIRECTIONAL in order to avoid the
>     problem of accidentally leaking stale data if the DMA does
>     not actually happen[2].
>
> The last patch in the series replaces the architecture specific code
> with a shared version that implements all three based on architecture
> specific parameters that are almost always determined at compile time.
>
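For readers following along, the three options above can be modelled as a small decision table. This is only a sketch under stated assumptions, not the code from the series: every identifier is hypothetical, and the bidirectional case for option 1 (clean plus invalidate before the DMA) is my reading, since the cover letter does not spell it out.

```c
#include <assert.h>

/* All names below are hypothetical and chosen for illustration only. */
enum dma_dir { DIR_TO_DEVICE, DIR_FROM_DEVICE, DIR_BIDIRECTIONAL };
enum cache_op { OP_NONE, OP_CLEAN, OP_INVALIDATE, OP_FLUSH /* clean + invalidate */ };

enum sync_policy {
    POLICY_NO_PREFETCH = 1, /* option 1: no speculative prefetching */
    POLICY_ARM32 = 2,       /* option 2: invalidate again after DMA */
    POLICY_ARM64 = 3,       /* option 3: always write back before DMA */
};

/* Cache operation performed before the device starts the transfer. */
static enum cache_op op_before_dma(enum sync_policy p, enum dma_dir d)
{
    if (d == DIR_TO_DEVICE)
        return OP_CLEAN; /* all three options write back dirty lines first */

    switch (p) {
    case POLICY_NO_PREFETCH:
        /* Assumption: bidirectional buffers get clean + invalidate here. */
        return d == DIR_FROM_DEVICE ? OP_INVALIDATE : OP_FLUSH;
    case POLICY_ARM32:
        /* Bidirectional: clean only, the invalidate happens after the DMA. */
        return d == DIR_FROM_DEVICE ? OP_INVALIDATE : OP_CLEAN;
    case POLICY_ARM64:
        /* Write back rather than discard, so stale data cannot leak. */
        return OP_CLEAN;
    }
    assert(0 && "unknown policy");
    return OP_NONE;
}

/* Cache operation performed after the transfer, before the CPU reads. */
static enum cache_op op_after_dma(enum sync_policy p, enum dma_dir d)
{
    if (p == POLICY_NO_PREFETCH || d == DIR_TO_DEVICE)
        return OP_NONE; /* nothing can have been prefetched or changed */
    return OP_INVALIDATE; /* drop lines prefetched during the DMA */
}
```

In this model, options 2 and 3 differ only in what happens before a DMA_FROM_DEVICE transfer (invalidate versus writeback), which is the semantic question the cover letter leaves open.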
> The difference between cases 1. and 2. is hardware specific, while between
> 2. and 3. we need to decide which semantics we want, but I explicitly
> avoid this question in my series and leave it to be decided later.
>
> Another difference that I do not address here is what cache invalidation
> does for partial cache lines. On arm32, arm64 and powerpc, a partial
> cache line always gets written back before invalidation in order to
> ensure that data before or after the buffer is not discarded. On all
> other architectures, the assumption is that cache lines are never shared
> between DMA buffer and data that is accessed by the CPU. If we end up
> always writing back dirty cache lines before a DMA (option 3 above),
> then this point becomes moot, otherwise we should probably address this
> in a follow-up series to document one behavior or the other and implement
> it consistently.
>
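To make the partial cache line point concrete, here is a rough sketch of how an invalidate request could be split into partial head/tail lines (written back first, as described for arm32, arm64 and powerpc) and fully covered lines (safe to discard). The struct and function names are hypothetical, and the line size is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of how to invalidate [start, start + size). */
struct inv_plan {
    bool head_partial;    /* first line also holds data before the buffer */
    bool tail_partial;    /* last line also holds data after the buffer */
    uintptr_t full_start; /* first address of the fully covered lines */
    uintptr_t full_end;   /* one past the last fully covered line */
};

/* Assumes line is a power of two (for example 32 or 64 bytes). */
static struct inv_plan plan_invalidate(uintptr_t start, size_t size,
                                       uintptr_t line)
{
    uintptr_t mask = line - 1;
    uintptr_t end = start + size;
    struct inv_plan p;

    /* An unaligned edge means the line is shared with neighbouring data. */
    p.head_partial = (start & mask) != 0;
    p.tail_partial = (end & mask) != 0;

    /* Lines that lie entirely inside the buffer. */
    p.full_start = (start + mask) & ~mask; /* round up */
    p.full_end = end & ~mask;              /* round down */
    if (p.full_end < p.full_start)
        p.full_end = p.full_start;         /* buffer covers no full line */

    return p;
}
```

If either partial flag is set, a caller that follows the arm32/arm64/powerpc behaviour would write back that line before invalidating it. The other architectures effectively assume both flags are always false.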
> Please review!
>
>       Arnd
>
> [1] https://lore.kernel.org/all/20221212115505.36770-1-prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/
>
> Arnd Bergmann (21):
>   openrisc: dma-mapping: flush bidirectional mappings
>   xtensa: dma-mapping: use normal cache invalidation rules
>   sparc32: flush caches in dma_sync_*for_device
>   microblaze: dma-mapping: skip extra DMA flushes
>   powerpc: dma-mapping: split out cache operation logic
>   powerpc: dma-mapping: minimize for_cpu flushing
>   powerpc: dma-mapping: always clean cache in _for_device() op
>   riscv: dma-mapping: only invalidate after DMA, not flush
>   riscv: dma-mapping: skip invalidation before bidirectional DMA
>   csky: dma-mapping: skip invalidating before DMA from device
>   mips: dma-mapping: skip invalidating before bidirectional DMA
>   mips: dma-mapping: split out cache operation logic
>   arc: dma-mapping: skip invalidating before bidirectional DMA
>   parisc: dma-mapping: use regular flush/invalidate ops
>   ARM: dma-mapping: always invalidate WT caches before DMA
>   ARM: dma-mapping: bring back dmac_{clean,inv}_range
>   ARM: dma-mapping: use arch_sync_dma_for_{device,cpu}() internally
>   ARM: drop SMP support for ARM11MPCore
>   ARM: dma-mapping: use generic form of arch_sync_dma_* helpers
>   ARM: dma-mapping: split out arch_dma_mark_clean() helper
>   dma-mapping: replace custom code with generic implementation
>
Do you plan to send a v2 of this series?

Cheers,
Prabhakar

>  arch/arc/mm/dma.c                          |  66 ++------
>  arch/arm/Kconfig                           |   4 +
>  arch/arm/include/asm/cacheflush.h          |  21 +++
>  arch/arm/include/asm/glue-cache.h          |   4 +
>  arch/arm/mach-oxnas/Kconfig                |   4 -
>  arch/arm/mach-oxnas/Makefile               |   1 -
>  arch/arm/mach-oxnas/headsmp.S              |  23 ---
>  arch/arm/mach-oxnas/platsmp.c              |  96 -----------
>  arch/arm/mach-versatile/platsmp-realview.c |   4 -
>  arch/arm/mm/Kconfig                        |  19 ---
>  arch/arm/mm/cache-fa.S                     |   4 +-
>  arch/arm/mm/cache-nop.S                    |   6 +
>  arch/arm/mm/cache-v4.S                     |  13 +-
>  arch/arm/mm/cache-v4wb.S                   |   4 +-
>  arch/arm/mm/cache-v4wt.S                   |  22 ++-
>  arch/arm/mm/cache-v6.S                     |  35 +---
>  arch/arm/mm/cache-v7.S                     |   6 +-
>  arch/arm/mm/cache-v7m.S                    |   4 +-
>  arch/arm/mm/dma-mapping-nommu.c            |  36 ++--
>  arch/arm/mm/dma-mapping.c                  | 181 ++++++++++-----------
>  arch/arm/mm/proc-arm1020.S                 |   4 +-
>  arch/arm/mm/proc-arm1020e.S                |   4 +-
>  arch/arm/mm/proc-arm1022.S                 |   4 +-
>  arch/arm/mm/proc-arm1026.S                 |   4 +-
>  arch/arm/mm/proc-arm920.S                  |   4 +-
>  arch/arm/mm/proc-arm922.S                  |   4 +-
>  arch/arm/mm/proc-arm925.S                  |   4 +-
>  arch/arm/mm/proc-arm926.S                  |   4 +-
>  arch/arm/mm/proc-arm940.S                  |   4 +-
>  arch/arm/mm/proc-arm946.S                  |   4 +-
>  arch/arm/mm/proc-feroceon.S                |   8 +-
>  arch/arm/mm/proc-macros.S                  |   2 +
>  arch/arm/mm/proc-mohawk.S                  |   4 +-
>  arch/arm/mm/proc-xsc3.S                    |   4 +-
>  arch/arm/mm/proc-xscale.S                  |   6 +-
>  arch/arm64/mm/dma-mapping.c                |  28 ++--
>  arch/csky/mm/dma-mapping.c                 |  46 +++---
>  arch/hexagon/kernel/dma.c                  |  44 ++---
>  arch/m68k/kernel/dma.c                     |  43 +++--
>  arch/microblaze/kernel/dma.c               |  38 ++---
>  arch/mips/mm/dma-noncoherent.c             |  75 +++------
>  arch/nios2/mm/dma-mapping.c                |  57 +++----
>  arch/openrisc/kernel/dma.c                 |  62 ++++---
>  arch/parisc/include/asm/cacheflush.h       |   6 +-
>  arch/parisc/kernel/pci-dma.c               |  33 +++-
>  arch/powerpc/mm/dma-noncoherent.c          |  76 +++++----
>  arch/riscv/mm/dma-noncoherent.c            |  51 +++---
>  arch/sh/kernel/dma-coherent.c              |  43 +++--
>  arch/sparc/Kconfig                         |   2 +-
>  arch/sparc/kernel/ioport.c                 |  38 +++--
>  arch/xtensa/Kconfig                        |   1 -
>  arch/xtensa/include/asm/cacheflush.h       |   6 +-
>  arch/xtensa/kernel/pci-dma.c               |  47 +++---
>  include/linux/dma-sync.h                   | 107 ++++++++++++
>  54 files changed, 721 insertions(+), 699 deletions(-)
>  delete mode 100644 arch/arm/mach-oxnas/headsmp.S
>  delete mode 100644 arch/arm/mach-oxnas/platsmp.c
>  create mode 100644 include/linux/dma-sync.h
>
> --
> 2.39.2
>
> Cc: Vineet Gupta <vgupta@xxxxxxxxxx>
> Cc: Russell King <linux@xxxxxxxxxxxxxxx>
> Cc: Neil Armstrong <neil.armstrong@xxxxxxxxxx>
> Cc: Linus Walleij <linus.walleij@xxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Guo Ren <guoren@xxxxxxxxxx>
> Cc: Brian Cain <bcain@xxxxxxxxxxx>
> Cc: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> Cc: Michal Simek <monstr@xxxxxxxxx>
> Cc: Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>
> Cc: Dinh Nguyen <dinguyen@xxxxxxxxxx>
> Cc: Stafford Horne <shorne@xxxxxxxxx>
> Cc: Helge Deller <deller@xxxxxx>
> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
> Cc: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
> Cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
> Cc: Palmer Dabbelt <palmer@xxxxxxxxxxx>
> Cc: Rich Felker <dalias@xxxxxxxx>
> Cc: John Paul Adrian Glaubitz <glaubitz@xxxxxxxxxxxxxxxxxxx>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: Max Filippov <jcmvbkbc@xxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Robin Murphy <robin.murphy@xxxxxxx>
> Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
> Cc: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> Cc: linux-snps-arc@xxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: linux-oxnas@xxxxxxxxx
> Cc: linux-csky@xxxxxxxxxxxxxxx
> Cc: linux-hexagon@xxxxxxxxxxxxxxx
> Cc: linux-m68k@xxxxxxxxxxxxxxxxxxxx
> Cc: linux-mips@xxxxxxxxxxxxxxx
> Cc: linux-openrisc@xxxxxxxxxxxxxxx
> Cc: linux-parisc@xxxxxxxxxxxxxxx
> Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
> Cc: linux-riscv@xxxxxxxxxxxxxxxxxxx
> Cc: linux-sh@xxxxxxxxxxxxxxx
> Cc: sparclinux@xxxxxxxxxxxxxxx
> Cc: linux-xtensa@xxxxxxxxxxxxxxxx