On Fri, 2019-03-29 at 12:59 -0700, Matthew Wilcox wrote: > I don't understand how we get to this situation. We SetPageSwapCache() > in add_to_swap_cache() right before we store the page in i_pages. > We ClearPageSwapCache() in __delete_from_swap_cache() right after > removing the page from the array. So how do we find a page in a swap > address space that has PageSwapCache cleared? > > Indeed, we have a check which should trigger ... > > VM_BUG_ON_PAGE(!PageSwapCache(page), page); > > in __delete_from_swap_cache(). > > Oh ... is it a race? > > * Its ok to check for PageSwapCache without the page lock > * here because we are going to recheck again inside > * try_to_free_swap() _with_ the lock. > > so CPU A does: > > page = find_get_page(swap_address_space(entry), offset) > page = find_subpage(page, offset); > trylock_page(page); > > while CPU B does: > > xa_lock_irq(&address_space->i_pages); > __delete_from_swap_cache(page, entry); > xas_store(&xas, NULL); > ClearPageSwapCache(page); > xa_unlock_irq(&address_space->i_pages); > > and if the ClearPageSwapCache happens between the xas_load() and the > find_subpage(), we're stuffed. CPU A has a reference to the page, but > not a lock, and find_get_page is running under RCU. > > I suppose we could fix this by taking the i_pages xa_lock around the > call to find_get_pages(). If indeed, that's what this problem is. > Want to try this patch? Confirmed. Well spotted! > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 2b8d9c3fbb47..ed8e42be88b5 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -127,10 +127,14 @@ static int __try_to_reclaim_swap(struct swap_info_struct > *si, > unsigned long offset, unsigned long flags) > { > swp_entry_t entry = swp_entry(si->type, offset); > + struct address_space *mapping = swap_address_space(entry); > + unsigned long irq_flags; > struct page *page; > int ret = 0; > > - page = find_get_page(swap_address_space(entry), offset); > + xa_lock_irqsave(&mapping->i_pages, irq_flags); > + page = find_get_page(mapping, offset); > + xa_unlock_irqrestore(&mapping->i_pages, irq_flags); > if (!page) > return 0; > /*