On Wed, Sep 04, 2024 at 09:24:34AM -0700, Rob Clark wrote: > Btw, this seems to be causing iommu faults for me for what (according > to a sw pgtable walk) should be a valid mapping, indicating > missing/incomplete tlb invalidation. This is with drm/msm (which > probably matters, since it implements it's own iommu_flush_ops) on > x1e80100 (which probably doesn't matter.. but it is an mmu-500 in case > it does). > > I _think_ what is causing this is the change in ordering of > __arm_lpae_clear_pte() (dma_sync_single_for_device() on the pgtable > memory) and io_pgtable_tlb_flush_walk(). I'm not entirely sure how > this patch is supposed to work correctly in the face of other > concurrent translations (to buffers unrelated to the one being > unmapped(), because after the io_pgtable_tlb_flush_walk() we can have > stale data read back into the tlb. You mean this? if (!iopte_leaf(pte, lvl, iop->fmt)) { + __arm_lpae_clear_pte(&ptep[i], &iop->cfg, 1); + /* Also flush any partial walks */ io_pgtable_tlb_flush_walk(iop, iova + i * size, size, ARM_LPAE_GRANULE(data)); __arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data)); I would say it should work because 1) The pte is cleared and cache flushed before the iotlb is cleared by the added __arm_lpae_clear_pte() 2) This is not a 'shared table' since it is fully covered by the size being unmapped. The caller must ensure there are no inersecting concurrent map/unmaps. 3) The double zeroing doesn't matter because of #2, no races are permitted. Jason