On Fri, Oct 18, 2024 at 02:19:26PM -0300, Jason Gunthorpe wrote:
> Of the page table implementations (AMD v1/2, VT-D SS, ARM32, DART)
> arm_lpae is unique in how it handles partial unmap of large IOPTEs.
>
> All other drivers will unmap the large IOPTE and return it's length. For
> example if a 2M IOPTE is present and the first 4K is requested to be
> unmapped then unmap will remove the whole 2M and report 2M as the result.
>
> arm_lpae instead replaces the IOPTE with a table of smaller IOPTEs, unmaps
> the 4K and returns 4k. This is actually an illegal/non-hitless operation
> on at least SMMUv3 because of the BBM level 0 rules.
>
> Long ago VFIO could trigger a path like this, today I know of no user of
> this functionality.
>
> Given it doesn't work fully correctly on SMMUv3 and would create
> portability problems if any user depends on it, remove the unique support
> in arm_lpae and align with the expected iommu interface.
>
> Outside the iommu users, this will potentially effect io_pgtable users of
> ARM_32_LPAE_S1, ARM_32_LPAE_S2, ARM_64_LPAE_S1, ARM_64_LPAE_S2, and
> ARM_MALI_LPAE formats.
>
> Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> Cc: Steven Price <steven.price@xxxxxxx>
> Cc: Liviu Dudau <liviu.dudau@xxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> ---
>  drivers/iommu/io-pgtable-arm.c | 72 +++-------------------------------
>  1 file changed, 6 insertions(+), 66 deletions(-)
>
> I don't know anything in the iommu space that needs this, and this is the only
> page table implementation in iommu that does it.

I think the v7s code does it as well, so please can you apply the same
treatment to arm_v7s_split_blk_unmap()?

> @@ -678,12 +618,12 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>
>  		return i * size;
>  	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
> -		/*
> -		 * Insert a table at the next level to map the old region,
> -		 * minus the part we want to unmap
> -		 */
> -		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
> -						lvl + 1, ptep, pgcount);
> +		/* Unmap the entire large IOPTE and return its size */
> +		size = ARM_LPAE_BLOCK_SIZE(lvl, data);

If I understand your other message correctly, we shouldn't actually get
into this situation any more, right? In which case, can we WARN_ONCE()
and return 0 instead? Over-unmapping is filthy!

Will
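
For reference, the three partial-unmap behaviours discussed in this thread
can be compared with a small userspace toy model. This is only a sketch:
toy_unmap(), the POLICY_* names and the SZ_* macros are hypothetical and
are not kernel identifiers, and the model tracks nothing but the size that
unmap would report for a single large IOPTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SZ_4K 0x1000UL
#define SZ_2M 0x200000UL

/* Hypothetical policy names for illustration only. */
enum partial_unmap_policy {
	POLICY_SPLIT,      /* old arm_lpae: split the block, unmap only the request */
	POLICY_OVER_UNMAP, /* patch as posted: drop the whole block, report its size */
	POLICY_REJECT,     /* suggestion in the reply: warn and report 0 */
};

/*
 * Toy model: a single leaf entry of block_size covers the IOVA.
 * Returns the number of bytes the unmap call would report.
 */
size_t toy_unmap(size_t block_size, size_t req_size,
		 enum partial_unmap_policy policy)
{
	assert(block_size != 0 && req_size != 0);

	/* Request covers the whole block: every policy agrees. */
	if (req_size >= block_size)
		return block_size;

	switch (policy) {
	case POLICY_SPLIT:
		return req_size;
	case POLICY_OVER_UNMAP:
		return block_size;
	case POLICY_REJECT:
		/* Stand-in for a kernel WARN_ONCE(); nothing is unmapped. */
		fprintf(stderr, "partial unmap of large IOPTE rejected\n");
		return 0;
	}
	return 0;
}
```

With a 2M block and a 4K request, the model reports 4K for POLICY_SPLIT,
2M for POLICY_OVER_UNMAP and 0 for POLICY_REJECT. A caller that assumes
the returned size equals the requested size only sees what it expects in
the first case.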