Jason Gunthorpe <jgg@xxxxxxxxxx> writes:

> On Wed, May 31, 2023 at 12:46:06PM +1000, Alistair Popple wrote:
>>
>> Jason Gunthorpe <jgg@xxxxxxxxxx> writes:
>>
>> > On Wed, May 31, 2023 at 10:30:48AM +1000, Alistair Popple wrote:
>> >
>> >> So I'd rather keep the invalidate in ptep_set_access_flags(). Would
>> >> renaming invalidate_range() to invalidate_arch_secondary_tlb() along
>> >> with clearing up the documentation make that more acceptable, at least
>> >> in the short term?
>> >
>> > Then we need to go through removing kvm first I think.
>>
>> Why? I don't think we need to hold up a fix for something that is an
>> issue today so we can rework a fix for an unrelated problem.
>
> I'm nervous about affecting KVM's weird usage if we go in and start
> making changes. Getting rid of it first is much safer

Fair enough. In this case though I think we're safe because we won't be
affecting KVM's usage of it - my change only affects ARM64 and KVM only
really uses this on x86 via the arch-specific
kvm_arch_mmu_notifier_invalidate_range() definition.

>> > Yeah, I think I would call it invalidate_arch_secondary_tlb() and
>> > document it as being an arch specific set of invalidations that match
>> > the architected TLB maintenance requirements. And maybe we can check it
>> > more carefully to make it be called in less places. Like I'm not sure
>> > it is right to call it from invalidate_range_end under this new
>> > definition..
>>
>> I will look at this in more depth, but this comment reminded me there is
>> already an issue with calling .invalidate_range() from
>> invalidate_range_end(). We have seen slow downs when unmapping unused
>> ranges because unmap_vmas() will call .invalidate_range() via
>> .invalidate_range_end() flooding the SMMU with invalidates even though
>> zap_pte_range() skipped it because the PTEs were pte_none.
>
> Yes, if the new API is specifically about synchronizing an architected
> TLB then really the call to the op should be done near the
> architecture's TLB flush points, and not higher in the MM.
>
> ie any flush of the CPU tlb should mirror 1:1 to a flush of the IOMMU
> TLB, no broadening or narrowing.
>
> It is a very clean definition and we can leap directly to it once we
> get kvm out of the way.

Yes, no argument there.

> Jason
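
To make the difference between the two call sites concrete, here is a toy
model in C. It is only a sketch and not kernel code: struct toy_mm and the
toy_*() functions are hypothetical names invented for illustration, and the
boolean array merely stands in for a range of PTEs (false playing the role
of pte_none). It counts how many secondary TLB invalidates are issued when
the invalidate is tied to the end of every unmapped range versus when it is
issued only at the point where a CPU TLB flush actually happens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are real kernel symbols. */
struct toy_mm {
    bool present[8];         /* stand-in for PTEs; false models pte_none */
    unsigned cpu_flushes;    /* CPU TLB flushes issued */
    unsigned iommu_flushes;  /* secondary (IOMMU) TLB invalidates issued */
};

/* Clear present entries in [start, end); return true if any were cleared. */
bool toy_zap_range(struct toy_mm *mm, size_t start, size_t end)
{
    bool cleared = false;

    for (size_t i = start; i < end; i++) {
        if (mm->present[i]) {
            mm->present[i] = false;
            cleared = true;
        }
    }
    return cleared;
}

/* Range-end style: the secondary invalidate fires for every unmapped range. */
void toy_unmap_range_end(struct toy_mm *mm, size_t start, size_t end)
{
    if (toy_zap_range(mm, start, end))
        mm->cpu_flushes++;
    mm->iommu_flushes++;     /* unconditional, even if every entry was empty */
}

/* Mirrored style: the secondary invalidate is issued at the CPU flush point. */
void toy_unmap_mirrored(struct toy_mm *mm, size_t start, size_t end)
{
    if (toy_zap_range(mm, start, end)) {
        mm->cpu_flushes++;
        mm->iommu_flushes++; /* 1:1 with the CPU TLB flush */
    }
}
```

In this model, unmapping one populated range and one unused range costs two
secondary invalidates in the range-end style but only one in the mirrored
style, while the number of CPU TLB flushes is the same in both.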