Hi Ahmad, On Mon, May 17, 2021 at 12:42:21PM +0200, Ahmad Fatoum wrote: > Hello Jules, Yann, > > On 05.03.21 19:33, Jules Maselbas wrote: > > From: Yann Sionneau <ysionneau@xxxxxxxxx> > > > > Signed-off-by: Jules Maselbas <jmaselbas@xxxxxxxxx> > > [snip] > > > diff --git a/arch/kvx/lib/dma-default.c b/arch/kvx/lib/dma-default.c > > new file mode 100644 > > index 000000000..2a4144696 > > --- /dev/null > > +++ b/arch/kvx/lib/dma-default.c > > @@ -0,0 +1,94 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +// SPDX-FileCopyrightText: 2021 Yann Sionneau <ysionneau@xxxxxxxxx>, Kalray Inc. > > + > > +#include <dma.h> > > +#include <asm/barrier.h> > > +#include <asm/io.h> > > +#include <asm/cache.h> > > +#include <asm/sfr.h> > > +#include <asm/sys_arch.h> > > + > > +/* > > + * The implementation of arch should follow the following rules: > > + * map for_cpu for_device unmap > > + * TO_DEV writeback none writeback none > > + * FROM_DEV invalidate invalidate(*) invalidate invalidate(*) > > + * BIDIR writeback invalidate writeback invalidate > > + * > > + * (*) - only necessary if the CPU speculatively prefetches. > > + * > > + * (see https://lkml.org/lkml/2018/5/18/979) > > + */ > > + > > +void dma_sync_single_for_device(dma_addr_t addr, size_t size, > > + enum dma_data_direction dir) > > +{ > > + /* dcache is Write-Through: no need to flush to force writeback */ > > + switch (dir) { > > + case DMA_FROM_DEVICE: > > + invalidate_dcache_range(addr, addr + size); > > + break; > > + case DMA_TO_DEVICE: > > + case DMA_BIDIRECTIONAL: > > + /* allow device to read buffer written by CPU */ > > + wmb(); > > + break; > > + default: > > + BUG(); > > + } > > +} > > + > > +void dma_sync_single_for_cpu(dma_addr_t addr, size_t size, > > + enum dma_data_direction dir) > > +{ > > + /* CPU does not speculatively prefetches */ > > + switch (dir) { > > + case DMA_FROM_DEVICE: > > + /* invalidate has been done during map/for_device */ > > + case DMA_TO_DEVICE: > > + break; > > + case DMA_BIDIRECTIONAL: > > + invalidate_dcache_range(addr, addr + size); > > + break; > > + default: > > + BUG(); > > + } > > +} > > Both of these work on CPU pointers. > > > + > > +#define KVX_DDR_ALIAS_OFFSET \ > > + (KVX_DDR_64BIT_RAM_WINDOW_BA - KVX_DDR_32BIT_RAM_WINDOW_BA) > > +#define KVX_DDR_ALIAS_WINDOW \ > > + (KVX_DDR_64BIT_RAM_WINDOW_BA + KVX_DDR_ALIAS_OFFSET) > > + > > +/* Local smem is aliased between 0 and 16MB */ > > +#define KVX_SMEM_LOCAL_ALIAS 0x1000000ULL > > + > > +dma_addr_t dma_map_single(struct device_d *dev, void *ptr, size_t size, > > + enum dma_data_direction dir) > > +{ > > + uintptr_t addr = (uintptr_t) ptr; > > + > > + dma_sync_single_for_device(addr, size, dir); > > So this is correct. > > > + > > + /* Local smem alias should never be used for dma */ > > + if (addr < KVX_SMEM_LOCAL_ALIAS) > > + return addr + (1 + kvx_cluster_id()) * KVX_SMEM_LOCAL_ALIAS; > > + > > + if (dev->dma_mask && addr <= dev->dma_mask) > > + return addr; > > + > > + if (addr >= KVX_DDR_ALIAS_WINDOW) > > + return DMA_ERROR_CODE; > > + > > + addr -= KVX_DDR_ALIAS_OFFSET; > > + if (dev->dma_mask && addr > dev->dma_mask) > > + return DMA_ERROR_CODE; > > + > > + return addr; > > +} > > Now you compute a dma_addr_t as CPU pointer - KVX_DDR_ALIAS_OFFSET. > > > + > > +void dma_unmap_single(struct device_d *dev, dma_addr_t addr, size_t size, > > + enum dma_data_direction dir) > > +{ > > + dma_sync_single_for_cpu(addr, size, dir); > > And then that dma_addr_t is passed here without + KVX_DDR_ALIAS_OFFSET > to get a CPU pointer out. > So for DMA_BIDIRECTIONAL you'd invalidate the wrong cache region. Yes, this is bogus but I believe in our case dma_unmap_single is never called with DMA_BIDIRECTIONAL. > > I stumbled upon this, because I noticed that that kvx whould've failed to > build starting with v2021.04.0, because following commits conflict with each other: > 3f975f810bd3 ("dma: move dma_map/unmap_single from ARM to common code") commited on 2021-03-03 > 4808d6f80073 ("kvx: Implement dma handling primitives") commited on 2021-03-05 > > Now, dma_map_single() is defined twice for kvx. Yes, I noticed this as well when rebasing on barebox v2021.04. > As dma_(un?)map_single is always called with a dev argument, couldn't you define the > DMA offsets in the device tree and use the common drivers/dma/map.c implementation > for these two functions? Our custom implementation will be removed in favor of the common implementation and the uses of dma-range. I had to add a dma-range to the dwc2 device nodes in our device tree, so when the driver does dma_map/unmap the common implementation will do the correct remap to the aliased region (so addresses fits in 32bit). However I do not know how Linux handle a mix of dma-range and iommu. Since I didn't had much time to test this we are still using v2021.03. Thanks, Jules _______________________________________________ barebox mailing list barebox@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/barebox