RE: [PATCH 21/21] dma-mapping: replace custom code with generic implementation

Hi all,

FYI, this patch breaks booting on the RZ/G2L SMARC EVK board; Arnd will send a v2 to fix the issue.

[    3.384408] Unable to handle kernel paging request at virtual address 000000004afb0080
[    3.392755] Mem abort info:
[    3.395883]   ESR = 0x0000000096000144
[    3.399957]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.405674]   SET = 0, FnV = 0
[    3.408978]   EA = 0, S1PTW = 0
[    3.412442]   FSC = 0x04: level 0 translation fault
[    3.417825] Data abort info:
[    3.420959]   ISV = 0, ISS = 0x00000144
[    3.425115]   CM = 1, WnR = 1
[    3.428521] [000000004afb0080] user address but active_mm is swapper
[    3.435135] Internal error: Oops: 0000000096000144 [#1] PREEMPT SMP
[    3.441501] Modules linked in:
[    3.444644] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc6-next-20230412-g2936e9299572 #712
[    3.453537] Hardware name: Renesas SMARC EVK based on r9a07g054l2 (DT)
[    3.460130] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    3.467184] pc : dcache_clean_poc+0x20/0x38
[    3.471488] lr : arch_sync_dma_for_device+0x1c/0x2c
[    3.476463] sp : ffff80000a70b970
[    3.479834] x29: ffff80000a70b970 x28: 0000000000000000 x27: ffff00000aef7c10
[    3.487118] x26: ffff00000afb0080 x25: ffff00000b710000 x24: ffff00000b710a40
[    3.494397] x23: 0000000000002000 x22: 0000000000000000 x21: 0000000000000002
[    3.501670] x20: ffff00000aef7c10 x19: 000000004afb0080 x18: 0000000000000000
[    3.508943] x17: 0000000000000100 x16: fffffc0001efc008 x15: 0000000000000000
[    3.516216] x14: 0000000000000100 x13: 0000000000000068 x12: ffff00007fc0aa50
[    3.523488] x11: ffff00007fc0a9c0 x10: 0000000000000000 x9 : ffff00000aef7f08
[    3.530761] x8 : 0000000000000000 x7 : fffffc00002bec00 x6 : 0000000000000000
[    3.538028] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 000000000000003f
[    3.545297] x2 : 0000000000000040 x1 : 000000004afb2080 x0 : 000000004afb0080
[    3.552569] Call trace:
[    3.555074]  dcache_clean_poc+0x20/0x38
[    3.559014]  dma_map_page_attrs+0x1b4/0x248
[    3.563289]  ravb_rx_ring_format_gbeth+0xd8/0x198
[    3.568095]  ravb_ring_format+0x5c/0x108
[    3.572108]  ravb_dmac_init_gbeth+0x30/0xe4
[    3.576382]  ravb_dmac_init+0x80/0x104
[    3.580222]  ravb_open+0x84/0x78c
[    3.583626]  __dev_open+0xec/0x1d8
[    3.587138]  __dev_change_flags+0x190/0x208
[    3.591406]  dev_change_flags+0x24/0x6c
[    3.595324]  ip_auto_config+0x248/0x10ac
[    3.599345]  do_one_initcall+0x6c/0x1b0
[    3.603268]  kernel_init_freeable+0x1c0/0x294

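In case it helps with the v2: the faulting address above (x0 = x19 = 000000004afb0080) looks like the physical address itself, and the new arm64 arch_dma_cache_wback() in the patch below passes paddr straight to dcache_clean_poc(), which expects virtual addresses; the old code converted with phys_to_virt() first. So my guess (untested, purely from reading the diff) is that the fix will be along these lines:

```diff
 static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-	dcache_clean_poc(paddr, paddr + size);
+	unsigned long start = (unsigned long)phys_to_virt(paddr);
+
+	dcache_clean_poc(start, start + size);
 }
```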

Cheers,
Biju

-----Original Message-----
From: linux-arm-kernel <linux-arm-kernel-bounces@xxxxxxxxxxxxxxxxxxx> On
Behalf Of Arnd Bergmann
Sent: Monday, March 27, 2023 1:13 PM
To: linux-kernel@xxxxxxxxxxxxxxx
Cc: Arnd Bergmann <arnd@xxxxxxxx>; Vineet Gupta <vgupta@xxxxxxxxxx>; Russell
King <linux@xxxxxxxxxxxxxxx>; Neil Armstrong <neil.armstrong@xxxxxxxxxx>;
Linus Walleij <linus.walleij@xxxxxxxxxx>; Catalin Marinas
<catalin.marinas@xxxxxxx>; Will Deacon <will@xxxxxxxxxx>; Guo Ren
<guoren@xxxxxxxxxx>; Brian Cain <bcain@xxxxxxxxxxx>; Geert Uytterhoeven
<geert@xxxxxxxxxxxxxx>; Michal Simek <monstr@xxxxxxxxx>; Thomas Bogendoerfer
<tsbogend@xxxxxxxxxxxxxxxx>; Dinh Nguyen <dinguyen@xxxxxxxxxx>; Stafford
Horne <shorne@xxxxxxxxx>; Helge Deller <deller@xxxxxx>; Michael Ellerman
<mpe@xxxxxxxxxxxxxx>; Christophe Leroy <christophe.leroy@xxxxxxxxxx>; Paul
Walmsley <paul.walmsley@xxxxxxxxxx>; Palmer Dabbelt <palmer@xxxxxxxxxxx>;
Rich Felker <dalias@xxxxxxxx>; John Paul Adrian Glaubitz
<glaubitz@xxxxxxxxxxxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>; Max
Filippov <jcmvbkbc@xxxxxxxxx>; Christoph Hellwig <hch@xxxxxx>; Robin Murphy
<robin.murphy@xxxxxxx>; Prabhakar Mahadev Lad <prabhakar.mahadev-
lad.rj@xxxxxxxxxxxxxx>; Conor Dooley <conor.dooley@xxxxxxxxxxxxx>; linux-
snps-arc@xxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-
oxnas@xxxxxxxxx; linux-csky@xxxxxxxxxxxxxxx; linux-hexagon@xxxxxxxxxxxxxxx;
linux-m68k@xxxxxxxxxxxxxxxxxxxx; linux-mips@xxxxxxxxxxxxxxx; linux-
openrisc@xxxxxxxxxxxxxxx; linux-parisc@xxxxxxxxxxxxxxx; linuxppc-
dev@xxxxxxxxxxxxxxxx; linux-riscv@xxxxxxxxxxxxxxxxxxx; linux-
sh@xxxxxxxxxxxxxxx; sparclinux@xxxxxxxxxxxxxxx; linux-xtensa@linux-
xtensa.org
Subject: [PATCH 21/21] dma-mapping: replace custom code with generic implementation

From: Arnd Bergmann <arnd@xxxxxxxx>

Now that all of these have consistent behavior, replace them with a single
shared implementation of arch_sync_dma_for_device() and
arch_sync_dma_for_cpu() and three parameters to pick how they should
operate:

 - If the CPU has speculative prefetching, then the cache
   has to be invalidated after a transfer from the device.
   On the rarer CPUs without prefetching, this can be skipped,
   with all cache management happening before the transfer.
   This flag can be runtime detected, but is usually fixed
   per architecture.

 - Some architectures currently clean the caches before DMA
   from a device, while others invalidate it. There has not
   been a conclusion regarding whether we should change all
   architectures to use clean instead, so this adds an
   architecture specific flag that we can change later on.

 - On 32-bit Arm, the arch_sync_dma_for_cpu() function keeps
   track of pages that are marked clean in the page cache, to
   avoid flushing them again. The implementation for this is
   generic enough to work on all architectures that use the
   PG_dcache_clean page flag, but a Kconfig symbol is used
   to only enable it on Arm to preserve the existing behavior.

For the function naming, I picked 'wback' over 'clean', and 'wback_inv'
over 'flush', to avoid any ambiguity of what the helper functions are
supposed to do.

Moving the global functions into a header file is usually a bad idea as it
prevents the header from being included more than once, but it helps keep
the behavior as close as possible to the previous state, including the
possibility of inlining most of it into these functions where that was done
before. This also helps keep the global namespace clean, by hiding the new
arch_dma_cache{_wback,_inv,_wback_inv} from device drivers that might use
them incorrectly.

It would be possible to do this one architecture at a time, but as the
change is the same everywhere, the combined patch helps explain it better
in one place.

Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
---
 arch/arc/mm/dma.c                 |  66 +++++-------------
 arch/arm/Kconfig                  |   3 +
 arch/arm/mm/dma-mapping-nommu.c   |  39 ++++++-----
 arch/arm/mm/dma-mapping.c         |  64 +++++++-----------
 arch/arm64/mm/dma-mapping.c       |  28 +++++---
 arch/csky/mm/dma-mapping.c        |  44 ++++++------
 arch/hexagon/kernel/dma.c         |  44 ++++++------
 arch/m68k/kernel/dma.c            |  43 +++++++-----
 arch/microblaze/kernel/dma.c      |  48 +++++++-------
 arch/mips/mm/dma-noncoherent.c    |  60 +++++++----------
 arch/nios2/mm/dma-mapping.c       |  57 +++++++---------
 arch/openrisc/kernel/dma.c        |  63 +++++++++++-------
 arch/parisc/kernel/pci-dma.c      |  46 ++++++-------
 arch/powerpc/mm/dma-noncoherent.c |  34 ++++++----
 arch/riscv/mm/dma-noncoherent.c   |  51 +++++++-------
 arch/sh/kernel/dma-coherent.c     |  43 +++++++-----
 arch/sparc/kernel/ioport.c        |  38 ++++++++---
 arch/xtensa/kernel/pci-dma.c      |  40 ++++++-----
 include/linux/dma-sync.h          | 107 ++++++++++++++++++++++++++++++
 19 files changed, 527 insertions(+), 391 deletions(-)
 create mode 100644 include/linux/dma-sync.h

diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index ddb96786f765..61cd01646222 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -30,63 +30,33 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
      dma_cache_wback_inv(page_to_phys(page), size);
 }

-/*
- * Cache operations depending on function and direction argument, inspired by
- * https://lore.kernel.org/lkml/20180518175004.GF17671@n2100.armlinux.org.uk
- * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
- * dma-mapping: provide a generic dma-noncoherent implementation)"
- *
- *          |   map          ==  for_device     |   unmap     ==  for_cpu
- *          |----------------------------------------------------------------
- * TO_DEV   |   writeback        writeback      |   none          none
- * FROM_DEV |   invalidate       invalidate     |   invalidate*   invalidate*
- * BIDIR    |   writeback        writeback      |   invalidate    invalidate
- *
- *     [*] needed for CPU speculative prefetches
- *
- * NOTE: we don't check the validity of direction argument as it is done in
- * upper layer functions (in include/linux/dma-mapping.h)
- */
-
-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             dma_cache_wback(paddr, size);
-             break;
-
-     case DMA_FROM_DEVICE:
-             dma_cache_inv(paddr, size);
-             break;
-
-     case DMA_BIDIRECTIONAL:
-             dma_cache_wback(paddr, size);
-             break;
+     dma_cache_wback(paddr, size);
+}

-     default:
-             break;
-     }
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     dma_cache_inv(paddr, size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             break;
+     dma_cache_wback_inv(paddr, size);
+}

-     /* FROM_DEVICE invalidate needed if speculative CPU prefetch only */
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             dma_cache_inv(paddr, size);
-             break;
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}

-     default:
-             break;
-     }
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
 }

+#include <linux/dma-sync.h>
+
 /*
  * Plug in direct dma map ops.
  */
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 125d58c54ab1..0de84e861027 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -212,6 +212,9 @@ config LOCKDEP_SUPPORT
      bool
      default y

+config ARCH_DMA_MARK_DCACHE_CLEAN
+     def_bool y
+
 config ARCH_HAS_ILOG2_U32
      bool

diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index 12b5c6ae93fc..0817274aed15 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -13,27 +13,36 @@

 #include "dma.h"

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     if (dir == DMA_FROM_DEVICE) {
-             dmac_inv_range(__va(paddr), __va(paddr + size));
-             outer_inv_range(paddr, paddr + size);
-     } else {
-             dmac_clean_range(__va(paddr), __va(paddr + size));
-             outer_clean_range(paddr, paddr + size);
-     }
+     dmac_clean_range(__va(paddr), __va(paddr + size));
+     outer_clean_range(paddr, paddr + size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-     if (dir != DMA_TO_DEVICE) {
-             outer_inv_range(paddr, paddr + size);
-             dmac_inv_range(__va(paddr), __va(paddr + size));
-     }
+     dmac_inv_range(__va(paddr), __va(paddr + size));
+     outer_inv_range(paddr, paddr + size);
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     dmac_flush_range(__va(paddr), __va(paddr + size));
+     outer_flush_range(paddr, paddr + size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
+
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
                      const struct iommu_ops *iommu, bool coherent)
 {
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index b703cb83d27e..aa6ee820a0ab 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -687,6 +687,30 @@ void arch_dma_mark_clean(phys_addr_t paddr, size_t size)
      }
 }

+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
+{
+     dma_cache_maint(paddr, size, dmac_clean_range);
+     outer_clean_range(paddr, paddr + size);
+}
+
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     dma_cache_maint(paddr, size, dmac_inv_range);
+     outer_inv_range(paddr, paddr + size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     dma_cache_maint(paddr, size, dmac_flush_range);
+     outer_flush_range(paddr, paddr + size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
 static bool arch_sync_dma_cpu_needs_post_dma_flush(void)
 {
      if (IS_ENABLED(CONFIG_CPU_V6) ||
@@ -699,45 +723,7 @@ static bool arch_sync_dma_cpu_needs_post_dma_flush(void)
      return false;
 }

-/*
- * Make an area consistent for devices.
- * Note: Drivers should NOT use this function directly.
- * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
- */
-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
-{
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             dma_cache_maint(paddr, size, dmac_clean_range);
-             outer_clean_range(paddr, paddr + size);
-             break;
-     case DMA_FROM_DEVICE:
-             dma_cache_maint(paddr, size, dmac_inv_range);
-             outer_inv_range(paddr, paddr + size);
-             break;
-     case DMA_BIDIRECTIONAL:
-             if (arch_sync_dma_cpu_needs_post_dma_flush()) {
-                     dma_cache_maint(paddr, size, dmac_clean_range);
-                     outer_clean_range(paddr, paddr + size);
-             } else {
-                     dma_cache_maint(paddr, size, dmac_flush_range);
-                     outer_flush_range(paddr, paddr + size);
-             }
-             break;
-     default:
-             break;
-     }
-}
-
-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
-{
-     if (dir != DMA_TO_DEVICE && arch_sync_dma_cpu_needs_post_dma_flush()) {
-             outer_inv_range(paddr, paddr + size);
-             dma_cache_maint(paddr, size, dmac_inv_range);
-     }
-}
+#include <linux/dma-sync.h>

 #ifdef CONFIG_ARM_DMA_USE_IOMMU

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 5240f6acad64..bae741aa65e9 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -13,25 +13,33 @@
 #include <asm/cacheflush.h>
 #include <asm/xen/xen-ops.h>

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-                           enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     unsigned long start = (unsigned long)phys_to_virt(paddr);
+     dcache_clean_poc(paddr, paddr + size);
+}

-     dcache_clean_poc(start, start + size);
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     dcache_inval_poc(paddr, paddr + size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-                        enum dma_data_direction dir)
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
-     unsigned long start = (unsigned long)phys_to_virt(paddr);
+     dcache_clean_inval_poc(paddr, paddr + size);
+}

-     if (dir == DMA_TO_DEVICE)
-             return;
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}

-     dcache_inval_poc(start, start + size);
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
 }

+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
      unsigned long start = (unsigned long)page_address(page);
diff --git a/arch/csky/mm/dma-mapping.c b/arch/csky/mm/dma-mapping.c
index c90f912e2822..9402e101b363 100644
--- a/arch/csky/mm/dma-mapping.c
+++ b/arch/csky/mm/dma-mapping.c
@@ -55,31 +55,29 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
      cache_op(page_to_phys(page), size, dma_wbinv_set_zero_range);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             cache_op(paddr, size, dma_wb_range);
-             break;
-     default:
-             BUG();
-     }
+     cache_op(paddr, size, dma_wb_range);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             return;
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             cache_op(paddr, size, dma_inv_range);
-             break;
-     default:
-             BUG();
-     }
+     cache_op(paddr, size, dma_inv_range);
 }
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     cache_op(paddr, size, dma_wbinv_range);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index 882680e81a30..e6538128a75b 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -9,29 +9,33 @@
 #include <linux/memblock.h>
 #include <asm/page.h>

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     void *addr = phys_to_virt(paddr);
-
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             hexagon_clean_dcache_range((unsigned long) addr,
-             (unsigned long) addr + size);
-             break;
-     case DMA_FROM_DEVICE:
-             hexagon_inv_dcache_range((unsigned long) addr,
-             (unsigned long) addr + size);
-             break;
-     case DMA_BIDIRECTIONAL:
-             flush_dcache_range((unsigned long) addr,
-             (unsigned long) addr + size);
-             break;
-     default:
-             BUG();
-     }
+     hexagon_clean_dcache_range(paddr, paddr + size);
 }

+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     hexagon_inv_dcache_range(paddr, paddr + size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     hexagon_flush_dcache_range(paddr, paddr + size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
+
 /*
  * Our max_low_pfn should have been backed off by 16MB in mm/init.c to create
  * DMA coherent space.  Use that for the pool.
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index 2e192a5df949..aa9b434e6df8 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -58,20 +58,33 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,

 #endif /* CONFIG_MMU && !CONFIG_COLDFIRE */

-void arch_sync_dma_for_device(phys_addr_t handle, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_BIDIRECTIONAL:
-     case DMA_TO_DEVICE:
-             cache_push(handle, size);
-             break;
-     case DMA_FROM_DEVICE:
-             cache_clear(handle, size);
-             break;
-     default:
-             pr_err_ratelimited("dma_sync_single_for_device: unsupported dir %u\n",
-                                dir);
-             break;
-     }
+     /*
+      * cache_push() always invalidates in addition to cleaning
+      * write-back caches.
+      */
+     cache_push(paddr, size);
+}
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     cache_clear(paddr, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     cache_push(paddr, size);
 }
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index b4c4e45fd45e..01110d4aa5b0 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -14,32 +14,30 @@
 #include <linux/bug.h>
 #include <asm/cacheflush.h>

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (direction) {
-     case DMA_TO_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             flush_dcache_range(paddr, paddr + size);
-             break;
-     case DMA_FROM_DEVICE:
-             invalidate_dcache_range(paddr, paddr + size);
-             break;
-     default:
-             BUG();
-     }
+     /* writeback plus invalidate, could be a nop on WT caches */
+     flush_dcache_range(paddr, paddr + size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-     switch (direction) {
-     case DMA_TO_DEVICE:
-             break;
-     case DMA_BIDIRECTIONAL:
-     case DMA_FROM_DEVICE:
-             invalidate_dcache_range(paddr, paddr + size);
-             break;
-     default:
-             BUG();
-     }
+     invalidate_dcache_range(paddr, paddr + size);
 }
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     flush_dcache_range(paddr, paddr + size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index b9d68bcc5d53..902d4b7c1f85 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -85,50 +85,38 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size,
      } while (left);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             dma_sync_phys(paddr, size, _dma_cache_wback);
-             break;
-     case DMA_FROM_DEVICE:
-             dma_sync_phys(paddr, size, _dma_cache_inv);
-             break;
-     case DMA_BIDIRECTIONAL:
-             if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
-                 cpu_needs_post_dma_flush())
-                     dma_sync_phys(paddr, size, _dma_cache_wback);
-             else
-                     dma_sync_phys(paddr, size, _dma_cache_wback_inv);
-             break;
-     default:
-             break;
-     }
+     dma_sync_phys(paddr, size, _dma_cache_wback);
 }

-#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             break;
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             if (cpu_needs_post_dma_flush())
-                     dma_sync_phys(paddr, size, _dma_cache_inv);
-             break;
-     default:
-             break;
-     }
+     dma_sync_phys(paddr, size, _dma_cache_inv);
 }
-#endif
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     dma_sync_phys(paddr, size, _dma_cache_wback_inv);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
+            cpu_needs_post_dma_flush();
+}
+
+#include <linux/dma-sync.h>

 #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-             const struct iommu_ops *iommu, bool coherent)
+               const struct iommu_ops *iommu, bool coherent)
 {
-     dev->dma_coherent = coherent;
+       dev->dma_coherent = coherent;
 }
 #endif
diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index fd887d5f3f9a..29978970955e 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -13,53 +13,46 @@
 #include <linux/types.h>
 #include <linux/mm.h>
 #include <linux/string.h>
+#include <linux/dma-map-ops.h>
 #include <linux/dma-mapping.h>
 #include <linux/io.h>
 #include <linux/cache.h>
 #include <asm/cacheflush.h>

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
+     /*
+      * We just need to write back the caches here, but Nios2 flush
+      * instruction will do both writeback and invalidate.
+      */
      void *vaddr = phys_to_virt(paddr);
+     flush_dcache_range((unsigned long)vaddr, (unsigned long)(vaddr + size));
+}

-     switch (dir) {
-     case DMA_FROM_DEVICE:
-             invalidate_dcache_range((unsigned long)vaddr,
-                     (unsigned long)(vaddr + size));
-             break;
-     case DMA_TO_DEVICE:
-             /*
-              * We just need to flush the caches here , but Nios2 flush
-              * instruction will do both writeback and invalidate.
-              */
-     case DMA_BIDIRECTIONAL: /* flush and invalidate */
-             flush_dcache_range((unsigned long)vaddr,
-                     (unsigned long)(vaddr + size));
-             break;
-     default:
-             BUG();
-     }
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     unsigned long vaddr = (unsigned long)phys_to_virt(paddr);
+     invalidate_dcache_range(vaddr, (unsigned long)(vaddr + size));
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
      void *vaddr = phys_to_virt(paddr);
+     flush_dcache_range((unsigned long)vaddr, (unsigned long)(vaddr + size));
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}

-     switch (dir) {
-     case DMA_BIDIRECTIONAL:
-     case DMA_FROM_DEVICE:
-             invalidate_dcache_range((unsigned long)vaddr,
-                     (unsigned long)(vaddr + size));
-             break;
-     case DMA_TO_DEVICE:
-             break;
-     default:
-             BUG();
-     }
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
 }

+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
      unsigned long start = (unsigned long)page_address(page);
diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 91a00d09ffad..aba2258e62eb 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -95,32 +95,47 @@ void arch_dma_clear_uncached(void *cpu_addr, size_t size)
      mmap_write_unlock(&init_mm);
 }

-void arch_sync_dma_for_device(phys_addr_t addr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
      unsigned long cl;
      struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];

-     switch (dir) {
-     case DMA_TO_DEVICE:
-             /* Write back the dcache for the requested range */
-             for (cl = addr; cl < addr + size;
-                  cl += cpuinfo->dcache_block_size)
-                     mtspr(SPR_DCBWR, cl);
-             break;
-     case DMA_FROM_DEVICE:
-             /* Invalidate the dcache for the requested range */
-             for (cl = addr; cl < addr + size;
-                  cl += cpuinfo->dcache_block_size)
-                     mtspr(SPR_DCBIR, cl);
-             break;
-     case DMA_BIDIRECTIONAL:
-             /* Flush the dcache for the requested range */
-             for (cl = addr; cl < addr + size;
-                  cl += cpuinfo->dcache_block_size)
-                     mtspr(SPR_DCBFR, cl);
-             break;
-     default:
-             break;
-     }
+     /* Write back the dcache for the requested range */
+     for (cl = paddr; cl < paddr + size;
+          cl += cpuinfo->dcache_block_size)
+             mtspr(SPR_DCBWR, cl);
 }
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     unsigned long cl;
+     struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+
+     /* Invalidate the dcache for the requested range */
+     for (cl = paddr; cl < paddr + size;
+          cl += cpuinfo->dcache_block_size)
+             mtspr(SPR_DCBIR, cl);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     unsigned long cl;
+     struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+
+     /* Flush the dcache for the requested range */
+     for (cl = paddr; cl < paddr + size;
+          cl += cpuinfo->dcache_block_size)
+             mtspr(SPR_DCBFR, cl);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 6d3d3cffb316..a7955aab8ce2 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -443,35 +443,35 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
      free_pages((unsigned long)__va(dma_handle), order);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
      unsigned long virt = (unsigned long)phys_to_virt(paddr);

-     switch (dir) {
-     case DMA_TO_DEVICE:
-             clean_kernel_dcache_range(virt, size);
-             break;
-     case DMA_FROM_DEVICE:
-             clean_kernel_dcache_range(virt, size);
-             break;
-     case DMA_BIDIRECTIONAL:
-             flush_kernel_dcache_range(virt, size);
-             break;
-     }
+     clean_kernel_dcache_range(virt, size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
      unsigned long virt = (unsigned long)phys_to_virt(paddr);

-     switch (dir) {
-     case DMA_TO_DEVICE:
-             break;
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             purge_kernel_dcache_range(virt, size);
-             break;
-     }
+     purge_kernel_dcache_range(virt, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     unsigned long virt = (unsigned long)phys_to_virt(paddr);
+
+     flush_kernel_dcache_range(virt, size);
 }
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 00e59a4faa2b..268510c71156 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -101,27 +101,33 @@ static void __dma_phys_op(phys_addr_t paddr, size_t size, enum dma_cache_op op)
 #endif
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
      __dma_phys_op(paddr, size, DMA_CACHE_CLEAN);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
-     switch (direction) {
-     case DMA_NONE:
-             BUG();
-     case DMA_TO_DEVICE:
-             break;
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             __dma_phys_op(start, end, DMA_CACHE_INVAL);
-             break;
-     }
+     __dma_phys_op(paddr, size, DMA_CACHE_INVAL);
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     __dma_phys_op(paddr, size, DMA_CACHE_FLUSH);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
      unsigned long kaddr = (unsigned long)page_address(page);
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index 69c80b2155a1..b9a9f57e02be 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -12,43 +12,40 @@

 static bool noncoherent_supported;

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-                           enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
      void *vaddr = phys_to_virt(paddr);

-     switch (dir) {
-     case DMA_TO_DEVICE:
-             ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-             break;
-     case DMA_FROM_DEVICE:
-             ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-             break;
-     case DMA_BIDIRECTIONAL:
-             ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-             break;
-     default:
-             break;
-     }
+     ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 }

-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-                        enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
      void *vaddr = phys_to_virt(paddr);

-     switch (dir) {
-     case DMA_TO_DEVICE:
-             break;
-     case DMA_FROM_DEVICE:
-     case DMA_BIDIRECTIONAL:
-             ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
-             break;
-     default:
-             break;
-     }
+     ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     void *vaddr = phys_to_virt(paddr);
+
+     ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return true;
+}
+
+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
      void *flush_addr = page_address(page);
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index 6a44c0e7ba40..41f031ae7609 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -12,22 +12,35 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
      __flush_purge_region(page_address(page), size);
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
      void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));

-     switch (dir) {
-     case DMA_FROM_DEVICE:           /* invalidate only */
-             __flush_invalidate_region(addr, size);
-             break;
-     case DMA_TO_DEVICE:             /* writeback only */
-             __flush_wback_region(addr, size);
-             break;
-     case DMA_BIDIRECTIONAL:         /* writeback and invalidate */
-             __flush_purge_region(addr, size);
-             break;
-     default:
-             BUG();
-     }
+     __flush_wback_region(addr, size);
 }
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));
+
+     __flush_invalidate_region(addr, size);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));
+
+     __flush_purge_region(addr, size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 4f3d26066ec2..6926ead2f208 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -300,21 +300,39 @@ arch_initcall(sparc_register_ioport);

 #endif /* CONFIG_SBUS */

-/*
- * IIep is write-through, not flushing on cpu to device transfer.
- *
- * On LEON systems without cache snooping, the entire D-CACHE must be flushed to
- * make DMA to cacheable memory coherent.
- */
-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     if (dir != DMA_TO_DEVICE &&
-         sparc_cpu_model == sparc_leon &&
+     /* IIep is write-through, not flushing on cpu to device transfer. */
+}
+
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     /*
+      * On LEON systems without cache snooping, the entire D-CACHE must be
+      * flushed to make DMA to cacheable memory coherent.
+      */
+     if (sparc_cpu_model == sparc_leon &&
          !sparc_leon3_snooping_enabled())
              leon_flush_dcache_all();
 }

+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     arch_dma_cache_inv(paddr, size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
+
 #ifdef CONFIG_PROC_FS

 static int sparc_io_proc_show(struct seq_file *m, void *v)
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index ff3bf015eca4..d4ff96585545 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -43,24 +43,34 @@ static void do_cache_op(phys_addr_t paddr, size_t size,
              }
 }

-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-             enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
-     switch (dir) {
-     case DMA_TO_DEVICE:
-             do_cache_op(paddr, size, __flush_dcache_range);
-             break;
-     case DMA_FROM_DEVICE:
-             do_cache_op(paddr, size, __invalidate_dcache_range);
-             break;
-     case DMA_BIDIRECTIONAL:
-             do_cache_op(paddr, size, __flush_invalidate_dcache_range);
-             break;
-     default:
-             break;
-     }
+     do_cache_op(paddr, size, __flush_dcache_range);
 }

+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
+{
+     do_cache_op(paddr, size, __invalidate_dcache_range);
+}
+
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+     do_cache_op(paddr, size, __flush_invalidate_dcache_range);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+     return false;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+     return false;
+}
+
+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
      __invalidate_dcache_range((unsigned long)page_address(page), size);
diff --git a/include/linux/dma-sync.h b/include/linux/dma-sync.h
new file mode 100644
index 000000000000..18e33d5e8eaf
--- /dev/null
+++ b/include/linux/dma-sync.h
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Cache operations depending on function and direction argument, inspired by
+ * https://lore.kernel.org/lkml/20180518175004.GF17671@n2100.armlinux.org.uk/
+ * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
+ * dma-mapping: provide a generic dma-noncoherent implementation)"
+ *
+ *          |   map          ==  for_device     |   unmap     ==  for_cpu
+ *          |----------------------------------------------------------------
+ * TO_DEV   |   writeback        writeback      |   none          none
+ * FROM_DEV |   invalidate       invalidate     |   invalidate*   invalidate*
+ * BIDIR    |   writeback        writeback      |   invalidate    invalidate
+ *
+ *     [*] needed for CPU speculative prefetches
+ *
+ * NOTE: we don't check the validity of direction argument as it is done in
+ * upper layer functions (in include/linux/dma-mapping.h)
+ *
+ * This file can be included by arch/.../kernel/dma-noncoherent.c to provide
+ * the respective high-level operations without having to expose the
+ * cache management ops to drivers.
+ */
+
+void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
+             enum dma_data_direction dir)
+{
+     switch (dir) {
+     case DMA_TO_DEVICE:
+             /*
+              * This may be an empty function on write-through caches,
+              * and it might invalidate the cache if an architecture has
+              * a write-back cache but no way to write it back without
+              * invalidating
+              */
+             arch_dma_cache_wback(paddr, size);
+             break;
+
+     case DMA_FROM_DEVICE:
+             /*
+              * FIXME: this should be handled the same across all
+              * architectures, see
+              *
+              * https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/
+              */
+             if (!arch_sync_dma_clean_before_fromdevice()) {
+                     arch_dma_cache_inv(paddr, size);
+                     break;
+             }
+             fallthrough;
+
+     case DMA_BIDIRECTIONAL:
+             /* Skip the invalidate here if it's done later */
+             if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
+                 arch_sync_dma_cpu_needs_post_dma_flush())
+                     arch_dma_cache_wback(paddr, size);
+             else
+                     arch_dma_cache_wback_inv(paddr, size);
+             break;
+
+     default:
+             break;
+     }
+}
+
+#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
+/*
+ * Mark the D-cache clean for these pages to avoid extra flushing.
+ */
+static void arch_dma_mark_dcache_clean(phys_addr_t paddr, size_t size)
+{
+#ifdef CONFIG_ARCH_DMA_MARK_DCACHE_CLEAN
+     unsigned long pfn = PFN_UP(paddr);
+     unsigned long off = paddr & (PAGE_SIZE - 1);
+     size_t left = size;
+
+     if (off)
+             left -= PAGE_SIZE - off;
+
+     while (left >= PAGE_SIZE) {
+             struct page *page = pfn_to_page(pfn++);
+             set_bit(PG_dcache_clean, &page->flags);
+             left -= PAGE_SIZE;
+     }
+#endif
+}
+
+void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
+             enum dma_data_direction dir)
+{
+     switch (dir) {
+     case DMA_TO_DEVICE:
+             break;
+
+     case DMA_FROM_DEVICE:
+     case DMA_BIDIRECTIONAL:
+             /* FROM_DEVICE invalidate needed if speculative CPU prefetch only */
+             if (arch_sync_dma_cpu_needs_post_dma_flush())
+                     arch_dma_cache_inv(paddr, size);
+
+             if (size > PAGE_SIZE)
+                     arch_dma_mark_dcache_clean(paddr, size);
+             break;
+
+     default:
+             break;
+     }
+}
+#endif
--
2.39.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



