Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%

Byungchul Park <byungchul@xxxxxx> · Wed, 29 May 2024 14:00:46 +0900

On Tue, May 28, 2024 at 08:14:43AM -0700, Dave Hansen wrote:
> On 5/26/24 20:10, Huang, Ying wrote:
> >> Thank you for the pointing out.  I will fix it too by introducing a new
> >> flag in inode or something to make LUF aware if updating the file has
> >> been tried so that LUF can give up and flush right away in the case.
> >>
> >> Plus, I will add another give-up at code changing the permission of vma
> >> to writable.
> > I guess that you need a framework similar as
> > "flush_tlb_batched_pending()" to deal with interaction with other TLB
> > related operations.
> 
> Where "other TLB related operations" includes both things that
> traditionally invalidate TLBs (like going Present 1=>0) and things like
> fault-in that go Present 0=>1 that can result in TLB population.
> 
> It's actually a really crummy problem to solve.  We don't have _any_
> machinery to say, "Hey, you know that PTE you wanted to install?  There
> was something there before you and we haven't flushed it yet.  Can you
> be a doll and do a flush before _populating_ that PTE?"

All the code updating PTEs already performs the needed TLB flush in a
safe way where it's inevitable, e.g. munmap.  LUF, which controls when
to flush at a higher level than arch code, just leaves stale read-only
TLB entries that are currently supposed to be in use.  Could you give a
scenario that you are concerned about?

	Byungchul

> To solve it generically, I suspect you'll need some kind of special
> non-present PTE to say:
> 
> 	There _was_ a PTE here that wasn't flushed.
> 
> Sure, you can add gunk to the VMA to track when this happens.  But
> that'll penalize anyone populating a PTE anywhere in the VMA at least
> once.  If there were other threads faulting in pages to the same VMA,
> they'll just end up doing the flush that LUF tried to avoid in the first
> place.
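
For illustration, the "special non-present PTE" idea quoted above could
be modeled in userspace roughly as follows.  This is only a sketch under
stated assumptions: the toy_* names, the bit positions and the flush
counter are all hypothetical, and none of this is kernel or arch code.
It just shows a populate path checking for a "was here, not flushed
yet" marker and flushing before installing the new entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model only: the bit layout and all names below are hypothetical
 * and do not match any real architecture's PTE format.
 */
typedef uint64_t toy_pte_t;

#define TOY_PTE_PRESENT		(1ULL << 0)
#define TOY_PTE_UNFLUSHED	(1ULL << 1)	/* meaningful only if !PRESENT */

/* Stand-in for a real TLB flush; only counts how often it was needed. */
static unsigned long toy_flush_count;

static void toy_flush_tlb(void)
{
	toy_flush_count++;
}

static bool toy_pte_present(toy_pte_t pte)
{
	return pte & TOY_PTE_PRESENT;
}

/* True for a non-present entry that still has a flush owed to it. */
static bool toy_pte_unflushed(toy_pte_t pte)
{
	return !toy_pte_present(pte) && (pte & TOY_PTE_UNFLUSHED);
}

/* Present 1=>0 with the flush deferred: leave a marker behind. */
static void toy_unmap_deferred(toy_pte_t *ptep)
{
	if (toy_pte_present(*ptep))
		*ptep = TOY_PTE_UNFLUSHED;
}

/* Present 0=>1: flush first if the slot carries the marker. */
static void toy_populate(toy_pte_t *ptep, toy_pte_t new_pte)
{
	assert(toy_pte_present(new_pte));

	if (toy_pte_unflushed(*ptep))
		toy_flush_tlb();

	*ptep = new_pte;
}
```

In this model only the populate that hits a marked slot pays for a
flush; populating a slot that was never deferred costs nothing extra,
which is the contrast with per-VMA tracking drawn in the quoted text.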