On Fri, Jan 20, 2023 at 06:04:37PM +0100, Arnd Bergmann wrote:
> Having looked at this some more, I see that the powerpc
> version is a bit problematic here as well: this one
> flushes the partial cache lines before and after the
> DMA transfer, while only invalidating the full
> cache lines.

That feels really odd, and might be worth a bug report to the
PPC maintainers.

> Obviously there is no winning either way if the same
> cache line gets written by both CPU and device, I'm
> just trying to figure out what behavior we actually
> want here.

There isn't, and that's why we require DMAed regions to be cache line
aligned.

> Aside from the question for how to handle flush vs invalidate
> on DMA_FROM_DEVICE, I'm still trying to figure out how to
> best handle highmem with architecture specific cache management
> operations. The easy approach would be to leave that up
> to the architecture, passing only a physical address to
> the flush function.

I suspect that is a good enough first step.  Especially as I remember
that some architectures have physical address based cache management
anyway (unless we removed them in the meantime).

> A nicer interface might be to move the
> loop over highmem pages out into common code, flush
> lowmem pages by virtual address, and have a separate
> callback for highmem pages that takes a page pointer,
> like

I'd rather avoid multiple callbacks if we can.  But maybe solve the
simple problem first and just pass the paddr and then iterate from
there.
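To make the "pass only a physical address" idea a bit more concrete,
here is a minimal userspace sketch of walking a physical range one
page-bounded chunk at a time, so that each chunk could be mapped and
flushed on its own if it lives in highmem.  This is not code from any
tree: every sketch_ name, the 4 KiB page size and the 64-bit address
type are assumptions made for illustration, and the per-chunk hook only
counts what it is handed instead of touching any cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages and a 64-bit physical address type. */
#define SKETCH_PAGE_SIZE 4096u
typedef uint64_t sketch_paddr_t;	/* hypothetical stand-in for phys_addr_t */

/* Bookkeeping so the walk can be observed from userspace. */
static unsigned int sketch_chunks;
static size_t sketch_bytes;

/*
 * Hypothetical per-chunk hook.  A real architecture would do the cache
 * maintenance here, mapping the page temporarily if it is in highmem.
 */
static void sketch_arch_wback_chunk(sketch_paddr_t paddr, size_t len)
{
	/* Each chunk must be non-empty and stay within a single page. */
	assert(len > 0);
	assert((paddr & (SKETCH_PAGE_SIZE - 1)) + len <= SKETCH_PAGE_SIZE);

	sketch_chunks++;
	sketch_bytes += len;
}

/* Walk [paddr, paddr + size) one page-bounded chunk at a time. */
static void sketch_dma_cache_wback(sketch_paddr_t paddr, size_t size)
{
	while (size > 0) {
		size_t offset = (size_t)(paddr & (SKETCH_PAGE_SIZE - 1));
		size_t len = SKETCH_PAGE_SIZE - offset;

		if (len > size)
			len = size;

		sketch_arch_wback_chunk(paddr, len);
		paddr += len;
		size -= len;
	}
}
```

With these assumptions, a range starting at 0x1F00 with length 0x2200
is split into four chunks of 0x100, 0x1000, 0x1000 and 0x100 bytes.
Whether such a loop ends up in the architecture code or in common code
is exactly the open question above; the sketch does not take a side.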
>
> struct dma_cache_ops {
>         void (*dma_cache_wback_inv)(void *start, unsigned long sz);
>         void (*dma_cache_inv)(void *start, unsigned long sz);
>         void (*dma_cache_wback)(void *start, unsigned long sz);
> #ifdef CONFIG_HIGHMEM
>         void (*dma_cache_wback_inv_high_page)(struct page *, size_t start, unsigned long sz);
>         void (*dma_cache_inv_high_page)(struct page *, size_t start, unsigned long sz);
>         void (*dma_cache_wback_high_page)(struct page *, size_t start, unsigned long sz);

Btw, I really don't think these should be indirect calls.  For sane
architectures there should be exactly one way to call them, and the
ones that have different implementations really should be using
alternatives instead of expensive indirect calls.
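As a rough illustration of the direct-call shape, here is a minimal
userspace sketch in which each cache operation is an ordinary function
with exactly one definition per build, and common code picks one with a
switch on the transfer direction instead of going through a table of
function pointers.  All sketch_ names are hypothetical, the operations
only record which one ran, and the mapping from direction to operation
is an assumption for illustration; in particular it is not meant to
settle the flush vs invalidate question for DMA_FROM_DEVICE.  Runtime
patching via alternatives cannot be shown in portable C, so
architectures with several implementations are not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_paddr_t;	/* hypothetical stand-in for phys_addr_t */

/* Hypothetical direction and operation tags, local to this sketch. */
enum sketch_dma_dir {
	SKETCH_BIDIRECTIONAL,
	SKETCH_TO_DEVICE,
	SKETCH_FROM_DEVICE,
};

enum sketch_op {
	SKETCH_OP_NONE,
	SKETCH_OP_WBACK,
	SKETCH_OP_INV,
	SKETCH_OP_WBACK_INV,
};

/* Records the most recent operation so the dispatch can be observed. */
static enum sketch_op sketch_last_op = SKETCH_OP_NONE;

/* One plain function per operation; a real port would do the cache work. */
static inline void sketch_arch_wback(sketch_paddr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
	sketch_last_op = SKETCH_OP_WBACK;
}

static inline void sketch_arch_inv(sketch_paddr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
	sketch_last_op = SKETCH_OP_INV;
}

static inline void sketch_arch_wback_inv(sketch_paddr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
	sketch_last_op = SKETCH_OP_WBACK_INV;
}

/* Common code: direct calls only, resolved at compile and link time. */
static void sketch_sync_for_device(sketch_paddr_t paddr, size_t size,
				   enum sketch_dma_dir dir)
{
	switch (dir) {
	case SKETCH_TO_DEVICE:
		sketch_arch_wback(paddr, size);
		break;
	case SKETCH_FROM_DEVICE:
		/* Assumed policy for this sketch only. */
		sketch_arch_inv(paddr, size);
		break;
	case SKETCH_BIDIRECTIONAL:
		sketch_arch_wback_inv(paddr, size);
		break;
	}
}
```

Because the callee is known at compile time, the compiler is free to
inline these calls, which a function-pointer table generally prevents.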