Btw, this seems to be causing iommu faults for me for what (according to a sw pgtable walk) should be a valid mapping, indicating missing/incomplete tlb invalidation. This is with drm/msm (which probably matters, since it implements its own iommu_flush_ops) on x1e80100 (which probably doesn't matter, but it is an mmu-500 in case it does).

I _think_ what is causing this is the change in ordering of __arm_lpae_clear_pte() (dma_sync_single_for_device() on the pgtable memory) and io_pgtable_tlb_flush_walk(). I'm not entirely sure how this patch is supposed to work correctly in the face of other concurrent translations (to buffers unrelated to the one being unmapped), because if the cleared PTE only becomes visible to the walker after the io_pgtable_tlb_flush_walk(), stale data can be read back into the tlb in between.

How is this supposed to work?

BR,
-R
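
P.S. To make the ordering concern concrete, here is a toy model in plain C. It is only a sketch of my understanding, not the kernel code: every name in it (toy_iommu, toy_clear_pte(), toy_flush(), toy_translate(), ...) is made up for illustration, the "pte" is a single word standing in for whatever entry the walker can cache, and the "concurrent translation" is just a function call placed between the two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_iommu {
	uint64_t pte;       /* entry as the hw walker sees it (0 = invalid) */
	uint64_t tlb;       /* cached copy of that entry */
	bool     tlb_valid; /* whether the cached copy is in use */
};

/* Stand-in for clearing the PTE and syncing it out to the device. */
static void toy_clear_pte(struct toy_iommu *m)
{
	m->pte = 0;
}

/* Stand-in for the tlb / walk-cache invalidation. */
static void toy_flush(struct toy_iommu *m)
{
	m->tlb_valid = false;
}

/* A concurrent translation: on a miss the walker refills from memory. */
static void toy_translate(struct toy_iommu *m)
{
	if (!m->tlb_valid) {
		m->tlb = m->pte;
		m->tlb_valid = true;
	}
}

/* True if the tlb holds an entry that no longer matches memory. */
static bool toy_tlb_is_stale(const struct toy_iommu *m)
{
	return m->tlb_valid && m->tlb != m->pte;
}

/* Clear first, flush second: any later refill sees the cleared entry. */
static bool toy_unmap_clear_then_flush(void)
{
	struct toy_iommu m = { .pte = 0x1001, .tlb = 0, .tlb_valid = false };

	toy_translate(&m);  /* tlb caches the live entry */
	toy_clear_pte(&m);
	toy_translate(&m);  /* hit on the old entry, still cached */
	toy_flush(&m);      /* old entry dropped */
	toy_translate(&m);  /* refill from the cleared entry */
	return toy_tlb_is_stale(&m);
}

/* Flush first, clear second: a translation in between re-caches old data. */
static bool toy_unmap_flush_then_clear(void)
{
	struct toy_iommu m = { .pte = 0x1001, .tlb = 0, .tlb_valid = false };

	toy_translate(&m);  /* tlb caches the live entry */
	toy_flush(&m);
	toy_translate(&m);  /* refill from the not-yet-cleared entry */
	toy_clear_pte(&m);  /* memory updated, tlb never told */
	return toy_tlb_is_stale(&m);
}

/* Runs both orderings and checks which one leaves a stale tlb entry. */
static void toy_demo(void)
{
	assert(!toy_unmap_clear_then_flush());
	assert(toy_unmap_flush_then_clear());
}
```

In this model only the flush-then-clear ordering ends with a stale entry in the tlb, which is the window I am worried about. Whether the real hardware and the drm/msm flush ops behave this way is exactly what I am asking.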