Hi Jules,

On Tuesday, 02.03.2021 at 11:58 +0100, Jules Maselbas wrote:
> Hi Lucas and Ahmad,
>
> On Tue, Mar 02, 2021 at 11:14:09AM +0100, Lucas Stach wrote:
> > On Tuesday, 02.03.2021 at 09:37 +0100, Ahmad Fatoum wrote:
> > > Hello Jules, Yann,
> > >
> > > On 01.03.21 16:58, Jules Maselbas wrote:
> > > > From: Yann Sionneau <ysionneau@xxxxxxxxx>
> > >
> > > Some comments inline. I am not a cache coherency expert, so take
> > > it with a grain of salt.
> > >
> > > > +static inline void *dma_alloc_coherent(size_t size, dma_addr_t *dma_handle)
> > > > +{
> > > > +        void *ret = xmemalign(PAGE_SIZE, size);
> > > > +
> > > > +        if (dma_handle)
> > > > +                *dma_handle = (dma_addr_t)(uintptr_t)ret;
> > > > +
> > > > +        return ret;
> > > > +}
> > >
> > > This would imply that the CPU barebox is booting on is coherent with
> > > all devices that barebox needs to access.
> > > (See below)
>
> This is bogus: memory is not coherent with all devices, this should be
> handled by the MMU, which is currently not supported in our barebox port.
> Using this can lead to coherency issues. We can either drop this
> function, so that it leads to an error at link time, or add a call to
> BUG for a runtime error.
>
> Right now we aren't using any driver that requires dma_alloc_coherent,
> but we use drivers that require dma_alloc and dma_map_single instead.

I would vote for a BUILD_BUG_ON_MSG in this function, so you get a
compile-time error and you can state what needs to be done in order to
get rid of the failure.

> > > > +/*
> > > > + * The implementation of arch should follow the following rules:
> > > > + *
> > > > + *           map           for_cpu       for_device    unmap
> > > > + * TO_DEV    writeback     none          writeback     none
> > > > + * FROM_DEV  invalidate    invalidate(*) invalidate    invalidate(*)
> > > > + * BIDIR     writeback     invalidate    writeback     invalidate
> > > > + *
> > > > + * (*) - only necessary if the CPU speculatively prefetches.
> > > > + *
> > > > + * (see https://lkml.org/lkml/2018/5/18/979)
> > > > + */
> > > > +
> > > > +void dma_sync_single_for_device(dma_addr_t addr, size_t size,
> > > > +                                enum dma_data_direction dir)
> > > > +{
> > > > +        switch (dir) {
> > > > +        case DMA_FROM_DEVICE:
> > > > +                kvx_dcache_invalidate_mem_area(addr, size);
> >
> > Why do you need to explicitly invalidate, but not flush? Even if the
> > CPU speculatively prefetches, the coherency protocol should make sure
> > to invalidate the speculatively loaded lines, right?
>
> Since we don't have coherent memory, here we need to invalidate the L1
> dcache to let the CPU see the device's writes in memory.
> Also, every write goes through the cache, so a flush is not required.

Ah, if all your caches are write-through that makes sense. Can you add
a comment somewhere stating that this implementation assumes WT caches
on KVX? This way we can avoid the confusion Ahmad and I fell into when
glancing over the code.

Regards,
Lucas
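
For illustration, a minimal sketch of the compile-time failure Lucas
suggests above. It assumes barebox provides BUILD_BUG_ON_MSG() with the
same semantics as Linux (the error only fires once a caller actually
instantiates the inline) and that the stub stays in the KVX dma header;
the message wording and the NULL return are made up for this sketch, and
the header's includes are omitted:

static inline void *dma_alloc_coherent(size_t size, dma_addr_t *dma_handle)
{
        /*
         * Fail the build as soon as any driver pulls this in: without
         * MMU support the KVX port has no way to hand out uncached
         * (coherent) memory yet, so silently returning cached memory
         * would be wrong.
         */
        BUILD_BUG_ON_MSG(1, "dma_alloc_coherent() needs MMU-backed uncached mappings, which the KVX port does not implement yet");

        return NULL;
}

Compared to dropping the function (link-time error) or calling BUG()
(runtime error), this reports the problem at compile time and can carry
an explanatory message, which is the property Lucas is after.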
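
Along the same lines, a sketch of the write-through comment Lucas asks
for, layered on the quoted dma_sync_single_for_device() fragment. The
function signature and kvx_dcache_invalidate_mem_area() are taken from
the patch as quoted; the comment text, the remaining switch cases and
the BUG() fallback are assumptions based on Jules' explanation, not the
actual patch:

/*
 * The KVX L1 dcache is write-through and not coherent with DMA masters:
 * CPU stores reach memory without an explicit writeback, so the
 * "writeback" cases from the table above degenerate to no-ops here and
 * only invalidation is needed for the CPU to observe device writes.
 */
void dma_sync_single_for_device(dma_addr_t addr, size_t size,
                                enum dma_data_direction dir)
{
        switch (dir) {
        case DMA_FROM_DEVICE:
                /* drop stale lines so the CPU sees the device's data */
                kvx_dcache_invalidate_mem_area(addr, size);
                break;
        case DMA_TO_DEVICE:
        case DMA_BIDIRECTIONAL:
                /* write-through cache: the data is already in memory */
                break;
        default:
                BUG();
        }
}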