The isolate path of non-lru movable pages: isolate_migratepages_block isolate_movable_page trylock_page // if PageIsolated, goto out_no_isolated a_ops->isolate_page __SetPageIsolated unlock_page list_add(&page->lru, &cc->migratepages) The migration path: unmap_and_move __unmap_and_move lock_page move_to_new_page a_ops->migratepage __ClearPageIsolated unlock_page /* here, the page could be isolated again by another thread, and added into another cc->migratepages, since PG_Isolated has been cleared, and not protected by page_lock */ list_del(&page->lru) Suppose thread A isolates three pages in the order p1, p2, p3, A's cc->migratepages will be like head_A - p3 - p2 - p1 After p2 is migrated (but before list_del), it is isolated by another thread B. Then list_del will delete p2 from the cc->migratepages of B (instead of A). When A continues to migrate and delete p1, it will find: p1->prev == p2 p2->next == LIST_POISON1. So we will end up with a bug like "list_del corruption. prev->next should be ffffffbf0a1eb8e0, but was dead000000000100" (see __list_del_entry_valid). ________________________________________ From: Michal Hocko [mhocko@xxxxxxxxxx] Sent: Tuesday, September 03, 2019 21:17 To: sunqiuyang Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages On Tue 03-09-19 16:27:46, sunqiuyang wrote: > From: Qiuyang Sun <sunqiuyang@xxxxxxxxxx> > > Currently, after a page is migrated, it > 1) has its PG_isolated flag cleared in move_to_new_page(), and > 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move(). > However, between steps 1) and 2), the page could be isolated by another > thread in isolate_movable_page(), and added to another LRU list, leading > to list_del corruption later. Care to explain the race? Both paths use page_lock AFAICS > > This patch fixes the bug by moving list_del into the critical section > protected by lock_page(), so that a page will not be isolated again before > it has been deleted from its LRU list. > > Signed-off-by: Qiuyang Sun <sunqiuyang@xxxxxxxxxx> > --- > mm/migrate.c | 11 +++-------- > 1 file changed, 3 insertions(+), 8 deletions(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index a42858d..c58a606 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage, > /* Drop an anon_vma reference if we took one */ > if (anon_vma) > put_anon_vma(anon_vma); > + if (rc != -EAGAIN) > + list_del(&page->lru); > unlock_page(page); > out: > /* > @@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, > put_new_page(newpage, private); > else > put_page(newpage); > + list_del(&page->lru); > goto out; > } > > @@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, > out: > if (rc != -EAGAIN) { > /* > - * A page that has been migrated has all references > - * removed and will be freed. A page that has not been > - * migrated will have kepts its references and be > - * restored. > - */ > - list_del(&page->lru); > - > - /* > * Compaction can migrate also non-LRU pages which are > * not accounted to NR_ISOLATED_*. They can be recognized > * as __PageMovable > -- > 1.8.3.1 -- Michal Hocko SUSE Labs