RE: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The isolate path of non-lru movable pages:

isolate_migratepages_block
	isolate_movable_page
		trylock_page
		// if PageIsolated, goto out_no_isolated
		a_ops->isolate_page
		__SetPageIsolated
		unlock_page
	list_add(&page->lru, &cc->migratepages)

The migration path:

unmap_and_move
	__unmap_and_move
		lock_page
		move_to_new_page
			a_ops->migratepage
			__ClearPageIsolated
		unlock_page
	/* here, the page could be isolated again by another thread, and added into another cc->migratepages,
	since PG_Isolated has been cleared, and not protected by page_lock */
	list_del(&page->lru)

Suppose thread A isolates three pages in the order p1, p2, p3, A's cc->migratepages will be like
	head_A - p3 - p2 - p1
After p2 is migrated (but before list_del), it is isolated by another thread B. Then list_del will delete p2
from the cc->migratepages of B (instead of A). When A continues to migrate and delete p1, it will find:
	p1->prev == p2
	p2->next == LIST_POISON1. 

So we will end up with a bug like
"list_del corruption. prev->next should be ffffffbf0a1eb8e0, but was dead000000000100"
(see __list_del_entry_valid).


________________________________________
From: Michal Hocko [mhocko@xxxxxxxxxx]
Sent: Tuesday, September 03, 2019 21:17
To: sunqiuyang
Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

On Tue 03-09-19 16:27:46, sunqiuyang wrote:
> From: Qiuyang Sun <sunqiuyang@xxxxxxxxxx>
>
> Currently, after a page is migrated, it
> 1) has its PG_isolated flag cleared in move_to_new_page(), and
> 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> However, between steps 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.

Care to explain the race? Both paths use page_lock AFAICS
>
> This patch fixes the bug by moving list_del into the critical section
> protected by lock_page(), so that a page will not be isolated again before
> it has been deleted from its LRU list.
>
> Signed-off-by: Qiuyang Sun <sunqiuyang@xxxxxxxxxx>
> ---
>  mm/migrate.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a42858d..c58a606 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>       /* Drop an anon_vma reference if we took one */
>       if (anon_vma)
>               put_anon_vma(anon_vma);
> +     if (rc != -EAGAIN)
> +             list_del(&page->lru);
>       unlock_page(page);
>  out:
>       /*
> @@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>                       put_new_page(newpage, private);
>               else
>                       put_page(newpage);
> +             list_del(&page->lru);
>               goto out;
>       }
>
> @@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  out:
>       if (rc != -EAGAIN) {
>               /*
> -              * A page that has been migrated has all references
> -              * removed and will be freed. A page that has not been
> -              * migrated will have kepts its references and be
> -              * restored.
> -              */
> -             list_del(&page->lru);
> -
> -             /*
>                * Compaction can migrate also non-LRU pages which are
>                * not accounted to NR_ISOLATED_*. They can be recognized
>                * as __PageMovable
> --
> 1.8.3.1

--
Michal Hocko
SUSE Labs





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux