Re: [PATCH v3] mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 23 Jan 2025, Hugh Dickins wrote:
> On Thu, 23 Jan 2025, Roman Gushchin wrote:
> 
> > Commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas")
> > added a forced tlbflush to tlb_vma_end(), which is required to avoid a
> > race between munmap() and unmap_mapping_range(). However it added some
> > overhead to other paths where tlb_vma_end() is used, but vmas are not
> > removed, e.g. madvise(MADV_DONTNEED).
> > 
> > Fix this by moving the tlb flush out of tlb_end_vma() into
> > free_pgtables(), somewhat similar to the stable version of the
> > original commit: e.g. stable commit 895428ee124a ("mm: Force TLB flush
> > for PFNMAP mappings before unlink_file_vma()").
> > 
> > Note, that if tlb->fullmm is set, no flush is required, as the whole
> > mm is about to be destroyed.
> > 
> > ---
> 
> As Liam said, thanks.
> 
> > 
> > v3:
> >   - added initialization of vma_pfn in __tlb_reset_range() (by Hugh D.)
> > 
> > v2:
> >   - moved vma_pfn flag handling into tlb.h (by Peter Z.)
> >   - added comments (by Peter Z.)
> >   - fixed the vma_pfn flag setting (by Hugh D.)
> > 
> > Suggested-by: Jann Horn <jannh@xxxxxxxxxx>
> > Signed-off-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Will Deacon <will@xxxxxxxxxx>
> > Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Nick Piggin <npiggin@xxxxxxxxx>
> > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > Cc: linux-arch@xxxxxxxxxxxxxxx
> > Cc: linux-mm@xxxxxxxxx
> > ---
> >  include/asm-generic/tlb.h | 49 ++++++++++++++++++++++++++++-----------
> >  mm/memory.c               |  7 ++++++
> >  2 files changed, 42 insertions(+), 14 deletions(-)
> 

I had quite a wobble on Friday, couldn't be sure of anything at all.
But I've now spent longer, quietly thinking about this (v3) patch,
and the various races; and with Jann's help, I do feel much more
confident about it all today.

> The code looks right to me now, but not the comments (I usually
> prefer no comment to a wrong or difficult to get right comment).

Yes, the code does look right to me now.  And although I can still quibble
about the comments, I'd better not waste your time with that.  Let me say

Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>

while recognizing that this may not be the patch which goes into
the tree, since Peter has other ideas on the naming and wording.

> 
> Except when I try to write a simple enough correct comment,
> I find the code has to be changed, and that then suggests
> further changes... Sigh.
> 
> (We could also go down a path of saying that all of the vma_pfn stuff
> would be better under #fidef CONFIG_MMU_GATHER_MERGE_VMAS; but I
> think we shall only confuse ourselves that way - it shouldn't be
> enough to matter, so long as it does not add any extra TLB flushes.)
> 
> > 
> > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> > index 709830274b75..cdc95b69b91d 100644
> > --- a/include/asm-generic/tlb.h
> > +++ b/include/asm-generic/tlb.h
> > @@ -380,8 +380,16 @@ static inline void __tlb_reset_range(struct mmu_gather *tlb)
> >  	tlb->cleared_pmds = 0;
> >  	tlb->cleared_puds = 0;
> >  	tlb->cleared_p4ds = 0;
> > +
> > +	/*
> > +	 * vma_pfn can only be set in tlb_start_vma(), so let's
> > +	 * initialize it here. Also a tlb flush issued by
> > +	 * tlb_flush_mmu_pfnmap() will cancel the vma_pfn state,
> > +	 * so that unnecessary subsequent flushes are avoided.
> 
> No, that misses the point (or misses half of the point): the
> tlb_flush_mmu_pfnmap() itself won't need to flush if for other reasons
> we've done a TLB flush since the last VM_PFNMAP|VM_MIXEDMAP vma.
> 
> What I want to write is:
> 	 * vma_pfn can only be set in tlb_start_vma(), so initialize it here.
> 	 * And then any call to tlb_flush_mmu_tlbonly() will reset it again,
> 	 * so that unnecessary subsequent flushes are avoided.
> 
> But once I look at tlb_flush_mmu_tlbonly(), I'm reminded that actually
> it does nothing, if none of cleared_ptes etc. is set: so would not reset
> vma_pfn in that case; which is okay-ish, but makes writing the comment hard.
> 
> So maybe tlb_flush_mmu_tlbonly() should do an explicit "tlb->vma_pfn = 0"
> before returning early; but that then raises the question of whether it
> would not be better just to initialize vma_pfn to 0 in __tlb_gather_mmu(),
> not touch it in __tlb_reset_range(), but reset it to 0 at the start of
> tlb_flush_mmu_tlbonly().
> 
> But it also makes me realize that tlb_flush_mmu_tlbonly() avoiding
> __tlb_reset_range() when nothing was cleared, is not all that good:
> because if flushing a small range is better than flushing a large range,
> then it would be good to reset the range even when nothing was cleared
> (though it looks stupid to reset all the fields just found 0 already).

My paragraph (about the existing code, independent of your patch) looks
nonsense to me now: if there was nothing to be cleared, then the range
would not have been updated, so would not benefit from being reset.

It's still true that there would sometimes be an optimization in setting
"tlb->vma_pfn = 0" in every tlb_flush_mmu_tlbonly(); but it's merely an
optimization, for an unusual case, which you may find demands yet more
thought than it deserves; my guess is that you will prefer not to add
that change, and that's fine by me.

So, if you did respin and change the comment (but I don't insist), maybe:
	 * vma_pfn can only be set in tlb_start_vma(), so initialize it here.
	 * And then it will be reset again after any call to tlb_flush(),
	 * so that unnecessary subsequent flushes are avoided.

Hugh




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux