On Fri, May 12, 2023 at 08:30:01PM +0200, Ahmad Fatoum wrote:
> On 12.05.23 13:09, Sascha Hauer wrote:
> > The next patch merges the mmu.c files with their corresponding
> > mmu-early.c files. Before doing that move functions which can't
> > be compiled for PBL out to extra files.
> >
> > Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> > ---
> >  arch/arm/cpu/Makefile |  1 +
> >  arch/arm/cpu/dma_32.c | 20 ++++++++++++++++++++
> >  arch/arm/cpu/dma_64.c | 16 ++++++++++++++++
> >  arch/arm/cpu/mmu_32.c | 18 ------------------
> >  arch/arm/cpu/mmu_64.c | 13 -------------
> >  5 files changed, 37 insertions(+), 31 deletions(-)
> >  create mode 100644 arch/arm/cpu/dma_32.c
> >  create mode 100644 arch/arm/cpu/dma_64.c
> >
> > diff --git a/arch/arm/cpu/Makefile b/arch/arm/cpu/Makefile
> > index fef2026da5..cd5f36eb49 100644
> > --- a/arch/arm/cpu/Makefile
> > +++ b/arch/arm/cpu/Makefile
> > @@ -4,6 +4,7 @@ obj-y += cpu.o
> >
> >  obj-$(CONFIG_ARM_EXCEPTIONS) += exceptions_$(S64_32).o interrupts_$(S64_32).o
> >  obj-$(CONFIG_MMU) += mmu_$(S64_32).o mmu-common.o
> > +obj-$(CONFIG_MMU) += dma_$(S64_32).o
> >  obj-pbl-y += lowlevel_$(S64_32).o
> >  obj-pbl-$(CONFIG_MMU) += mmu-early_$(S64_32).o
> >  obj-pbl-$(CONFIG_CPU_32v7) += hyp.o
> > diff --git a/arch/arm/cpu/dma_32.c b/arch/arm/cpu/dma_32.c
> > new file mode 100644
> > index 0000000000..a66aa26b9b
> > --- /dev/null
> > +++ b/arch/arm/cpu/dma_32.c
> > @@ -0,0 +1,20 @@
> > +#include <dma.h>
> > +#include <asm/mmu.h>
> > +
> > +void dma_sync_single_for_device(dma_addr_t address, size_t size,
> > +				enum dma_data_direction dir)
> > +{
> > +	/*
> > +	 * FIXME: This function needs a device argument to support non 1:1 mappings
> > +	 */
> > +
> > +	if (dir == DMA_FROM_DEVICE) {
> > +		__dma_inv_range(address, address + size);
> > +		if (outer_cache.inv_range)
> > +			outer_cache.inv_range(address, address + size);
>
> I know this is unrelated to your series, but this is wrong. The outermost
> cache must be invalidated before L1.
> Otherwise we could have this
> unlucky constellation:
>
> - CPU is invalidating L1
> - HW prefetcher wants to load something into L1
> - Stale data in L2 is loaded into L1
> - Only now CPU invalidates L2

L1 is invalidated after the DMA transfer in dma_sync_single_for_cpu(), so
stale data in L1 shouldn't be a problem. However, the prefetcher could
cause stale entries in L2 during the DMA transfer, so we have to
invalidate that as well after the transfer.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |