On Tue, Nov 27, 2018 at 12:23:32PM -0800, Hugh Dickins wrote: > On Tue, 27 Nov 2018, Kirill A. Shutemov wrote: > > On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote: > > > Several cleanups in collapse_shmem(): most of which probably do not > > > really matter, beyond doing things in a more familiar and reassuring > > > order. Simplify the failure gotos in the main loop, and on success > > > update stats while interrupts still disabled from the last iteration. > > > > > > Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") > > > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx> > > > Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> > > > Cc: stable@xxxxxxxxxxxxxxx # 4.8+ > > > --- > > > mm/khugepaged.c | 72 ++++++++++++++++++++++--------------------------- > > > 1 file changed, 32 insertions(+), 40 deletions(-) > > > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > > index 1c402d33547e..9d4e9ff1af95 100644 > > > --- a/mm/khugepaged.c > > > +++ b/mm/khugepaged.c > > > @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm, > > > goto out; > > > } > > > > > > + __SetPageLocked(new_page); > > > + __SetPageSwapBacked(new_page); > > > new_page->index = start; > > > new_page->mapping = mapping; > > > - __SetPageSwapBacked(new_page); > > > - __SetPageLocked(new_page); > > > BUG_ON(!page_ref_freeze(new_page, 1)); > > > > > > /* > > > @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm, > > > if (index == start) { > > > if (!xas_next_entry(&xas, end - 1)) { > > > result = SCAN_TRUNCATED; > > > - break; > > > + goto xa_locked; > > > } > > > xas_set(&xas, index); > > > } > > > if (!shmem_charge(mapping->host, 1)) { > > > result = SCAN_FAIL; > > > - break; > > > + goto xa_locked; > > > } > > > xas_store(&xas, new_page + (index % HPAGE_PMD_NR)); > > > nr_none++; > > > @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm, > > > result = SCAN_FAIL; > > > goto xa_unlocked; > > > } > > > - xas_lock_irq(&xas); > > > - xas_set(&xas, index); > > > } else if (trylock_page(page)) { > > > get_page(page); > > > + xas_unlock_irq(&xas); > > > } else { > > > result = SCAN_PAGE_LOCK; > > > - break; > > > + goto xa_locked; > > > } > > > > > > /* > > > > I'm puzzled by locking change here. > > The locking change here is to not re-get xas_lock_irq (in shmem_getpage > case) just before we drop it anyway: you point out that it used to cover > /* > * The page must be locked, so we can drop the i_pages lock > * without racing with truncate. > */ > VM_BUG_ON_PAGE(!PageLocked(page), page); > VM_BUG_ON_PAGE(!PageUptodate(page), page); > VM_BUG_ON_PAGE(PageTransCompound(page), page); > if (page_mapping(page) != mapping) { > but now does not. > > But the comment you wrote there originally (ah, git blame shows > that Matthew has made it say i_pages lock instead of tree_lock), > "The page must be locked, so we can drop...", was correct all along, > I'm just following what it says. > > It would wrong if the trylock_page came after the xas_unlock_irq, > but it comes before (as before): holding i_pages lock across the > lookup makes sure we look up the right page (no RCU racing) and > trylock_page makes sure that it cannot be truncated or hole-punched > or migrated or whatever from that point on - so can drop i_pages lock. You are right. I confused myself. > Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping), > couldn't we? Not that I propose to make such a change at this stage. Yeah it should be safe. We may put WARN there. > > Isn't the change responsible for the bug you are fixing in 09/10? > > In which I relaxed the VM_BUG_ON_PAGE(PageTransCompound(page), page) > that appears in the sequence above. > > Well, what I was thinking of in 9/10 was a THP being inserted at some > stage between selecting this range for collapse and reaching the last > (usually first) xas_lock_irq(&xas) in the "This will be less messy..." > loop above: I don't see any locking against that possibility. (And it > has to be that initial xas_lock_irq(&xas), because once the PageLocked > head of new_page is inserted in the i_pages tree, there is no more > chance for truncation and a competing THP to be inserted there.) > > So 9/10 would be required anyway; but you're thinking that the page > we looked up under i_pages lock and got trylock_page on, could then > become Compound once i_pages lock is dropped? I don't think so: pages > don't become Compound after they've left the page allocator, do they? > And if we ever manage to change that, I'm pretty sure it would be with > page locks held and page refcounts frozen. > > IIRC, my intend for the locking scheme was to protect against > > truncate-repopulate race. > > > > What do I miss? > > The stage in between selecting the range for collapse, and getting > the initial i_pages lock? Pages not becoming Compound underneath > you, with or without page lock, with or without i_pages lock? Page > lock being sufficient protection against truncation and migration? Agreed on all fronts. Sorry for the noise. > > The rest of the patch *looks* okay, but I found it hard to follow. > > Splitting it up would make it easier. > > It needs some time, I admit: thanks a lot for persisting with it. > And thanks (to you and to Matthew) for the speedy Acks elsewhere. > > Hugh -- Kirill A. Shutemov