On Thu, May 20, 2021 at 5:03 PM Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> wrote: > > On 5/21/21 8:10 AM, Linus Torvalds wrote: > > > > So mremap does need to flush the TLB before releasing the page table > > lock, because that's the lifetime boundary for the page that got > > moved. > > How will we avoid that happening with > c49dd340180260c6239e453263a9a244da9a7c85 / > 2c91bd4a4e2e530582d6fd643ea7b86b27907151 . The commit improves mremap > performance by moving level3/level2 page table entries. When doing so we > are not holding level 4 ptl lock (pte_lock()). But rather we are holding > pmd_lock or pud_lock(). So if we move pages around without holding the > pte lock, won't the above issue happen even if we do a tlb flush with > holding pmd lock/pud lock? Hmm. Interesting. Your patch (to flush the TLB after clearing the old location, and before inserting it into the new one) looks like an "obvious" fix. But I'm putting that "obvious" in quotes, because I'm now wondering if it actually fixes anything. Lookie here: - CPU1 does a mremap of a pmd or pud. It clears the old pmd/pud, flushes the old TLB range, and then inserts the pmd/pud at the new location. - CPU2 does a page shrinker, which calls try_to_unmap, which calls try_to_unmap_one. These are entirely asynchronous, because they have no shared lock. The mremap uses the pmd lock, the try_to_unmap_one() does the rmap walk, which does the pte lock. Now, imagine that the following ordering happens with the two operations above, and a CPU3 that does accesses: - CPU2 follows (and sees) the old page tables in the old location and the took the pte lock - the mremap on CPU1 starts - cleared the old pmd, flushed the tlb, *and* inserts in the new place. - a user thread on CPU3 accesses the new location and fills the TLB of the *new* address - only now does CPU2 get to the "pte_get_and_clear()" to remove one page - CPU2 does a TLB flush and frees the page End result: - both CPU1 _and_ CPU2 have flushed the TLB. - but both flushed the *OLD* address - the page is freed - CPU3 still has the stale TLB entry pointing to the page that is now free and might be reused for something else Am I missing something? Linus