Re: hunting an IO hang

On Mon, Jan 17, 2011 at 09:10:15AM -0500, Chris Mason wrote:
> Excerpts from Andrea Arcangeli's message of 2011-01-17 00:11:35 -0500:
> 
> [ crashes under load ]
> 
> > 
> > NOTE: with the last changes compaction is used for all order > 0 and
> > even from kswapd, so you will now be able to trigger bugs in
> > compaction or migration even with THP off. However I'm surprised that
> > you have issues with compaction...
> 
> I know I mentioned this in another email, but it is kind of buried in
> other context.  I reproduced my crash with CONFIG_COMPACTION and
> CONFIG_MIGRATION off.

Ok, then it was a coincidence that page->lru got corrupted during
migration, and the crash has nothing to do with
migration/compaction/THP. That makes sense, because we would have
noticed long ago if something were unstable there.

While reviewing the migration code for this bug I found two
(unrelated) memleaks introduced by commit
cf608ac19c95804dc2df43b1f4f9e068aa9034ab, and I have now reworked the
fix for them. Moving the goto label is enough to fix this without
adding a new function (the result is functionally identical to the
patch I sent before). The bug would not leak memory when compaction
is the one invoking migrate_pages; only the other callers, which
check the retval of migrate_pages instead of list_empty, could leak.
As said before, this can't explain your problem. It is purely a code
review fix; I never triggered it.
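
To make the caller contract concrete, here is a minimal userspace
sketch in C. It is a toy model, not the kernel code: every name in it
(toy_page, toy_migrate_pages, toy_putback_lru_pages, toy_caller_leaks)
and the `fixed` flag are hypothetical and exist only for illustration.
It assumes a caller that trusts the return value alone, and shows that
such a caller loses a page whenever the migrate routine returns 0 with
something still on the list.

```c
#include <stddef.h>

/* Toy userspace model; all names are hypothetical, not kernel API. */
enum toy_state { TOY_MOVABLE, TOY_BUSY, TOY_ALREADY_FREED };

struct toy_page {
	enum toy_state state;
	int released;		/* 1 once the page was handed back */
	struct toy_page *next;	/* link in the caller's isolation list */
};

/* Push a page onto the head of the caller's isolation list. */
static void toy_isolate(struct toy_page **list, struct toy_page *p,
			enum toy_state state)
{
	p->state = state;
	p->released = 0;
	p->next = *list;
	*list = p;
}

/*
 * Returns the number of pages that failed to migrate; those stay on
 * *list. With fixed == 0, TOY_ALREADY_FREED pages are not counted as
 * failures but are still left on the list (the modelled bug).
 */
static int toy_migrate_pages(struct toy_page **list, int fixed)
{
	struct toy_page **link = list;
	int failed = 0;

	while (*link) {
		struct toy_page *p = *link;

		if (p->state == TOY_BUSY) {
			failed++;
			link = &p->next;	/* caller must put it back */
		} else if (p->state == TOY_ALREADY_FREED && !fixed) {
			link = &p->next;	/* buggy path: unlink skipped */
		} else {
			*link = p->next;	/* unlink and release */
			p->next = NULL;
			p->released = 1;
		}
	}
	return failed;
}

/* Hand back every page still on the list; returns how many. */
static int toy_putback_lru_pages(struct toy_page **list)
{
	int n = 0;

	while (*list) {
		struct toy_page *p = *list;

		*list = p->next;
		p->next = NULL;
		p->released = 1;
		n++;
	}
	return n;
}

/* A caller that checks only the return value; returns pages leaked. */
static int toy_caller_leaks(int fixed)
{
	struct toy_page pages[3];
	struct toy_page *list = NULL;
	int leaked = 0;

	toy_isolate(&list, &pages[0], TOY_MOVABLE);
	toy_isolate(&list, &pages[1], TOY_ALREADY_FREED);
	toy_isolate(&list, &pages[2], TOY_MOVABLE);

	if (toy_migrate_pages(&list, fixed))
		toy_putback_lru_pages(&list);

	for (struct toy_page *p = list; p; p = p->next)
		leaked++;
	return leaked;
}
```

In this model toy_caller_leaks(0) leaves one page stranded on the
list, while toy_caller_leaks(1) leaves none. A caller that tested
whether the list is empty instead of the return value would put the
page back in both cases, which is why only retval-checking callers
are exposed.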

This is still only for review for Minchan, not meant for inclusion
yet.

===
Subject: when migrate_pages returns 0, all pages must have been released

From: Andrea Arcangeli <aarcange@xxxxxxxxxx>

In some cases migrate_pages could return zero while still leaving a
few pages in the pagelist (and some callers wouldn't notice that they
have to call putback_lru_pages after commit
cf608ac19c95804dc2df43b1f4f9e068aa9034ab).

Also add the one putback_lru_pages call that commit
cf608ac19c95804dc2df43b1f4f9e068aa9034ab missed.

Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
---

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 548fbd7..75398b0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1419,6 +1419,7 @@ int soft_offline_page(struct page *page, int flags)
 		ret = migrate_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL,
 								0, true);
 		if (ret) {
+			putback_lru_pages(&pagelist);
 			pr_info("soft offline: %#lx: migration failed %d, type %lx\n",
 				pfn, ret, page->flags);
 			if (ret > 0)
diff --git a/mm/migrate.c b/mm/migrate.c
index 46fe8cc..7d34237 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -772,6 +772,7 @@ uncharge:
 unlock:
 	unlock_page(page);
 
+move_newpage:
 	if (rc != -EAGAIN) {
  		/*
  		 * A page that has been migrated has all references
@@ -785,8 +786,6 @@ unlock:
 		putback_lru_page(page);
 	}
 
-move_newpage:
-
 	/*
 	 * Move the new page to the LRU. If migration was not successful
 	 * then this will free the page.
