On Tue, Jul 15, 2014 at 5:38 PM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> On Tuesday 15 July 2014 16:45:40 Ley Foon Tan wrote:
>
>> +static inline void __dma_sync(void *vaddr, size_t size,
>> +			enum dma_data_direction direction)
>> +{
>> +	switch (direction) {
>> +	case DMA_FROM_DEVICE:	/* invalidate cache */
>> +		invalidate_dcache_range((unsigned long)vaddr,
>> +			(unsigned long)(vaddr + size));
>> +		break;
>> +	case DMA_TO_DEVICE:	/* flush and invalidate cache */
>> +	case DMA_BIDIRECTIONAL:
>> +		flush_dcache_range((unsigned long)vaddr,
>> +			(unsigned long)(vaddr + size));
>> +		break;
>> +	default:
>> +		BUG();
>> +	}
>> +}
>
> This seems strange. More on that below.
>
>> +#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
>> +#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
>> +
> ...
>> +static inline void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
>> +			enum dma_data_direction direction)
>> +{
>> +	__dma_sync(vaddr, size, direction);
>> +}
>
> IIRC dma_cache_sync should be empty if you define dma_alloc_noncoherent
> to be the same as dma_alloc_coherent: It's already coherent, so no sync
> should be needed. What does the CPU do if you try to invalidate the cache
> on a coherent mapping?

Okay, I see what you mean here. I will leave the dma_cache_sync() function
empty. The CPU just does nothing if we try to invalidate the cache on a
coherent region.

BTW, I found that many other architectures still provide dma_cache_sync()
even though they define dma_alloc_noncoherent to be the same as
dma_alloc_coherent, e.g. blackfin, x86 or xtensa.

>
>> +void dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
>> +			size_t size, enum dma_data_direction direction)
>> +{
>> +	BUG_ON(!valid_dma_direction(direction));
>> +
>> +	__dma_sync(phys_to_virt(dma_handle), size, direction);
>> +}
>> +EXPORT_SYMBOL(dma_sync_single_for_cpu);
>> +
>> +void dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle,
>> +			size_t size, enum dma_data_direction direction)
>> +{
>> +	BUG_ON(!valid_dma_direction(direction));
>> +
>> +	__dma_sync(phys_to_virt(dma_handle), size, direction);
>> +}
>> +EXPORT_SYMBOL(dma_sync_single_for_device);
>
> More importantly: you do the same operation for both _for_cpu and _for_device.
> I assume your CPU can never do speculative cache prefetches, so it's not
> incorrect, but you do twice the number of invalidations and flushes that
> you need.
>
> Why would you do anything for _for_cpu here?

I am a bit confused about _for_cpu and _for_device here. I found that some
architectures, like c6x and hexagon, also do the same operation for both
_for_cpu and _for_device.

I have spent some time looking at other architectures, and below is what I
found. Please correct me if I am wrong, especially for the DMA_FROM_DEVICE
case in _for_device().

_for_cpu():
	case DMA_BIDIRECTIONAL:
	case DMA_FROM_DEVICE:
		/* invalidate cache */
		break;
	case DMA_TO_DEVICE:
		/* do nothing */
		break;

-------------------------

_for_device():
	case DMA_BIDIRECTIONAL:
	case DMA_TO_DEVICE:
		/* flush and invalidate cache */
		break;
	case DMA_FROM_DEVICE:
		/* should we invalidate cache or do nothing? */
		break;

Thanks for the review.

Regards
Ley Foon
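
P.S. To make the _for_cpu()/_for_device() split above concrete, here is a
minimal user-space sketch. It is not the patch itself. The names
sketch_sync_for_cpu() and sketch_sync_for_device() are hypothetical, and the
two cache helpers are stand-ins that only count calls so the logic can be
exercised outside the kernel. The DMA_FROM_DEVICE case in _for_device() is
still the open question; the sketch invalidates there purely as an assumption.

```c
#include <stddef.h>

/* Same names and values as the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Stand-ins for the arch cache helpers: they only count calls. */
static unsigned int nr_invalidate;
static unsigned int nr_flush;

static void invalidate_dcache_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	nr_invalidate++;
}

static void flush_dcache_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	nr_flush++;
}

/* Hypothetical helper: the CPU is about to access the buffer. */
static void sketch_sync_for_cpu(void *vaddr, size_t size,
				enum dma_data_direction direction)
{
	unsigned long start = (unsigned long)vaddr;

	switch (direction) {
	case DMA_BIDIRECTIONAL:
	case DMA_FROM_DEVICE:
		/* drop stale lines so the CPU sees what the device wrote */
		invalidate_dcache_range(start, start + size);
		break;
	case DMA_TO_DEVICE:
		/* the device only read the buffer: nothing to do */
		break;
	default:
		/* the kernel version would BUG() here */
		break;
	}
}

/* Hypothetical helper: the device is about to access the buffer. */
static void sketch_sync_for_device(void *vaddr, size_t size,
				   enum dma_data_direction direction)
{
	unsigned long start = (unsigned long)vaddr;

	switch (direction) {
	case DMA_BIDIRECTIONAL:
	case DMA_TO_DEVICE:
		/* write dirty lines back so the device sees the CPU's data */
		flush_dcache_range(start, start + size);
		break;
	case DMA_FROM_DEVICE:
		/* ASSUMPTION: invalidate; "do nothing" is the other option */
		invalidate_dcache_range(start, start + size);
		break;
	default:
		/* the kernel version would BUG() here */
		break;
	}
}
```

With this split, a DMA_TO_DEVICE buffer costs one flush in _for_device() and
nothing in _for_cpu(), instead of a flush in both as in the posted patch.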