Re: hunting an IO hang

On Mon, Jan 17, 2011 at 11:47:46PM +0900, Minchan Kim wrote:
> On Mon, Jan 17, 2011 at 03:26:15PM +0100, Andrea Arcangeli wrote:
> > On Mon, Jan 17, 2011 at 09:10:15AM -0500, Chris Mason wrote:
> > > Excerpts from Andrea Arcangeli's message of 2011-01-17 00:11:35 -0500:
> > > 
> > > [ crashes under load ]
> > > 
> > > > 
> > > > NOTE: with the last changes compaction is used for all order > 0 and
> > > > even from kswapd, so you will now be able to trigger bugs in
> > > > compaction or migration even with THP off. However I'm surprised that
> > > > you have issues with compaction...
> > > 
> > > I know I mentioned this in another email, but it is kind of buried in
> > > other context.  I reproduced my crash with CONFIG_COMPACTION and
> > > CONFIG_MIGRATION off.
> > 
> > Ok, then it was an accident the page->lru got corrupted during
> > migration and it has nothing to do with migration/compaction/thp. This
> > makes sense because we should have noticed long ago if something
> > wasn't stable there.
> > 
> > I reworked the fix for the two memleaks I found while reviewing
> > migration code for this bug (unrelated) introduced by the commit
> > cf608ac19c95804dc2df43b1f4f9e068aa9034ab. It was enough to move the
> > goto to fix this without having to add a new function (it's
> > functionally identical to the one I sent before). It also wouldn't
> > leak memory if it was compaction invoking migrate_pages (only other
> > callers checking the retval of migrate_pages instead of list_empty,
> > could leak memory). As said before, this couldn't explain your
> > problem, and this is only a code review fix, I never triggered this.
> > 
> > This is still only for review for Minchan, not meant for inclusion
> > yet.
> > 
> > ===
> > Subject: when migrate_pages returns 0, all pages must have been released
> > 
> > From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > 
> > In some cases migrate_pages could return zero while still leaving a
> > few pages in the pagelist (and some caller wouldn't notice it has to
> > call putback_lru_pages after commit
> > cf608ac19c95804dc2df43b1f4f9e068aa9034ab).
> > 
> > Add one missing putback_lru_pages not added by commit
> > cf608ac19c95804dc2df43b1f4f9e068aa9034ab.
> 
> It would be better to have another patch.
> 
> > 
> > Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>

And don't we need this patch as well?

From c14ec902f746da5c56f8b2e9446a7164a8831f6d Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan.kim@xxxxxxxxx>
Date: Tue, 18 Jan 2011 00:00:24 +0900
Subject: [PATCH] migration: Fix page corruption

If migrate_huge_pages fails, it calls put_page itself to drop the
page reference, and the caller of migrate_huge_pages also calls
putback_lru_pages. The reference is then dropped twice (a double
free), which can corrupt the page for whoever still holds it.

In addition, cleaning up the pages in the caller is consistent with
the behavior of migrate_pages.

Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
---
 mm/memory-failure.c |    4 +++-
 mm/migrate.c        |    4 ----
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 75398b0..c0752e1 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1273,6 +1273,7 @@ static int get_any_page(struct page *p, unsigned long pfn, int flags)
 static int soft_offline_huge_page(struct page *page, int flags)
 {
 	int ret;
+	struct page *page1, *page2;
 	unsigned long pfn = page_to_pfn(page);
 	struct page *hpage = compound_head(page);
 	LIST_HEAD(pagelist);
@@ -1295,7 +1296,8 @@ static int soft_offline_huge_page(struct page *page, int flags)
 	ret = migrate_huge_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL, 0,
 				true);
 	if (ret) {
-		putback_lru_pages(&pagelist);
+		list_for_each_entry_safe(page1, page2, &pagelist, lru)
+			put_page(page1);
 		pr_debug("soft offline: %#lx: migration failed %d, type %lx\n",
 			 pfn, ret, page->flags);
 		if (ret > 0)
diff --git a/mm/migrate.c b/mm/migrate.c
index 7d34237..3a6d4fd 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -980,10 +980,6 @@ int migrate_huge_pages(struct list_head *from,
 	}
 	rc = 0;
 out:
-
-	list_for_each_entry_safe(page, page2, from, lru)
-		put_page(page);
-
 	if (rc)
 		return rc;
 
-- 
1.7.0.4

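To make the double put described in the changelog concrete, here is a minimal userspace sketch. It is not kernel code: every toy_* name is made up for illustration, the page is reduced to a bare reference count, and the migration is assumed to always fail. It only models who drops the reference on the failure path, before and after the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page: only a reference count (hypothetical). */
struct toy_page {
	int refcount;
};

#define TOY_LIST_MAX 8

/* Toy stand-in for the pagelist handed to the migration function. */
struct toy_pagelist {
	struct toy_page *pages[TOY_LIST_MAX];
	size_t nr;
};

static void toy_list_add(struct toy_pagelist *list, struct toy_page *page)
{
	assert(list->nr < TOY_LIST_MAX);
	list->pages[list->nr++] = page;
}

/* Drop one reference from every page still on the list. */
static void toy_put_all(struct toy_pagelist *list)
{
	for (size_t i = 0; i < list->nr; i++)
		list->pages[i]->refcount--;
}

/*
 * Model of a migration that always fails. With callee_cleans != 0 it
 * mimics the pre-patch behavior and drops the references itself before
 * returning the error.
 */
static int toy_migrate_fail(struct toy_pagelist *list, int callee_cleans)
{
	if (callee_cleans)
		toy_put_all(list);
	return -1;
}

/*
 * Model of the caller: take one reference, try to migrate, clean up on
 * error. Returns the final refcount: 0 means balanced, a negative value
 * means the reference was dropped twice.
 */
static int toy_soft_offline(int callee_cleans)
{
	struct toy_page page = { .refcount = 0 };
	struct toy_pagelist list = { .nr = 0 };

	page.refcount++;		/* reference taken before migration */
	toy_list_add(&list, &page);

	if (toy_migrate_fail(&list, callee_cleans))
		toy_put_all(&list);	/* caller-side cleanup on failure */

	return page.refcount;
}
```

Calling toy_soft_offline(1) models the old behavior, where both sides clean up and the count underflows. Calling toy_soft_offline(0) models the patched behavior, where only the caller cleans up and the count stays balanced.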

> 
> Thanks, Andrea.
> 
> -- 
> Kind regards,
> Minchan Kim

-- 
Kind regards,
Minchan Kim
