On Mon, Apr 9, 2012 at 9:13 AM, Steven J. Hill <sjhill@xxxxxxxx> wrote:
> This reverts commit d759e6bf49a41edd66de7c6480b34cceb88ee29a.

It may be clearer to say "refactors" instead of "reverts," as this patch
reimplements the same feature in a different way. If the change were
completely reverting the MIPS *_kernel_vmap_range functions, that would
make me very uneasy, as XFS will not run reliably on my systems without
them.

Also, the commit hash is d9cdc901af0f92da7f90c750d8c187f5500be067 in
Linus' tree.

> The flush_kernel_vmap_range() and invalidate_kernel_vmap_range() are
> DMA cache flushing operations and should be done via
> _dma_cache_wback_inv() and _dma_cache_inv().

I think there is some ambiguity here. Documentation/cachetlb.txt says:

  "Since kernel I/O goes via physical pages, the I/O subsystem assumes
  that the user mapping and kernel offset mapping are the only aliases.
  This isn't true for vmap aliases, so anything in the kernel trying to
  do I/O to vmap areas must manually manage coherency. It must do this
  by flushing the vmap range before doing I/O and invalidating it after
  the I/O returns."

  "The design is to make this area safe to perform I/O on."

This appears to support your statement. However, actual usage suggests
that the *_kernel_vmap_range functions are used way up at the filesystem
layer just to get the data out of the "wrong colored" D$ lines on
systems with cache aliases:

	if (xfs_buf_is_vmapped(bp)) {
		flush_kernel_vmap_range(bp->b_addr,
					xfs_buf_vmap_len(bp));
	}
	submit_bio(rw, bio);

Then eventually, something like dma_map_sg() is called from a lower
level driver (a la ata_sg_setup()) to perform the full L1+L2 cache
flush on each page involved in the transaction.

I would also note that on ARM these operations turn into NOPs on VIPT
non-aliasing CPUs, even if their caches are noncoherent:

	static inline void flush_kernel_vmap_range(void *addr, int size)
	{
		if ((cache_is_vivt() || cache_is_vipt_aliasing()))
			__cpuc_flush_dcache_area(addr, (size_t)size);
	}

I suspect that reimplementing the *_kernel_vmap_range functions using
_dma_cache_* would result in a double L2 flush on the same memory
regions on systems with cache aliases, and an unnecessary L1+L2 flush
on systems without aliases.

I don't see any way that the *_kernel_vmap_range functions could be
called in lieu of the dma_map_* functions, as they are missing critical
elements such as returning the mapped PA and the calls to the plat_*
functions. So any memory getting flushed through *_kernel_vmap_range
will still need to go through dma_map_* for DMA.

Is there a reason why Ralf's original approach was not workable?
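
For what it's worth, here is a rough sketch of the kind of alias-guarded
D$ writeback I have in mind for the MIPS side, mirroring the ARM
behaviour quoted above. This is only my illustration, not Ralf's actual
patch; I am assuming cpu_has_dc_aliases (asm/cpu-features.h) and
blast_dcache_range() (asm/r4kcache.h) are the right building blocks:

	/*
	 * Illustration only: write back the primary D$ for a vmap range,
	 * but only on CPUs where cache aliases can actually leave stale
	 * lines behind. Non-aliasing CPUs fall through to a NOP here and
	 * rely on dma_map_*() for the real DMA flush later on.
	 */
	static inline void mips_flush_kernel_vmap_range(unsigned long vaddr,
							int size)
	{
		if (cpu_has_dc_aliases)
			blast_dcache_range(vaddr, vaddr + size);
	}

That keeps the filesystem-level flush cheap on non-aliasing systems and
leaves the L1+L2 maintenance to the dma_map_* path, which is the split
I was arguing for above.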