Re: [PATCH] mm: compaction: Check pfn_valid when entering a new MAX_ORDER_NR_PAGES block during isolation for migration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 31, 2012 at 12:40:26PM -0800, Andrew Morton wrote:
> On Tue, 31 Jan 2012 16:35:28 +0000
> Mel Gorman <mgorman@xxxxxxx> wrote:
> 
> > When isolating for migration, migration starts at the start of a zone
> > which is not necessarily pageblock aligned. Further, it stops isolating
> > when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
> > not aligned.
> > 
> > The problem is that pfn_valid is only called on the first PFN being
> > checked. Lets say we have a case like this
> > 
> > H = MAX_ORDER_NR_PAGES boundary
> > | = pageblock boundary
> > m = cc->migrate_pfn
> > f = cc->free_pfn
> > o = memory hole
> > 
> > H------|------H------|----m-Hoooooo|ooooooH-f----|------H
> > 
> > The migrate_pfn is just below a memory hole and the free scanner is
> > beyond the hole. When isolate_migratepages started, it scans from
> > migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory
> > hole. It checks pfn_valid() on the first PFN but then scans into the
> > hole where there are not necessarily valid struct pages.
> > 
> > This patch ensures that isolate_migratepages calls pfn_valid when
> > necessary.
> > 
> > Reported-and-tested-by: Herbert van den Bergh <herbert.van.den.bergh@xxxxxxxxxx>
> > Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> > Acked-by: Michal Nazarewicz <mina86@xxxxxxxxxx>
> 
> The changelog forgot to describe the user-visible effects of the bug.
> 

My bad. This changelog originally had a bugzilla number which I cut
out when reposting. Here is the same patch with a better changelog

=== CUT HERE ===
mm: compaction: Check pfn_valid when entering a new MAX_ORDER_NR_PAGES block during isolation for migration

When isolating for migration, migration starts at the start of a
zone which is not necessarily pageblock aligned. Further, it stops
isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn
is generally not aligned. This allows isolate_migratepages() to call
pfn_to_page() on an invalid PFN which can result in a crash. This
was originally reported against a 3.0-based kernel with the following
trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000 
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002 
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431 
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202 

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippit from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned. Lets say we have a
case like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is
beyond the hole. When isolate_migratepages started, it scans from
migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory
hole. It checks pfn_valid() on the first PFN but then scans into the
hole where there are not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-and-tested-by: Herbert van den Bergh <herbert.van.den.bergh@xxxxxxxxxx>
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
Acked-by: Michal Nazarewicz <mina86@xxxxxxxxxx>
Cc: stable <stable@xxxxxxxxxxxxxxx>
---
 mm/compaction.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 899d956..edc1e26 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -313,6 +313,19 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 		} else if (!locked)
 			spin_lock_irq(&zone->lru_lock);
 
+		/*
+		 * migrate_pfn does not necessarily start aligned to a
+		 * pageblock. Ensure that pfn_valid is called when moving
+		 * into a new MAX_ORDER_NR_PAGES range in case of large
+		 * memory holes within the zone
+		 */
+		if ((low_pfn & (MAX_ORDER_NR_PAGES - 1)) == 0) {
+			if (!pfn_valid(low_pfn)) {
+				low_pfn += MAX_ORDER_NR_PAGES - 1;
+				continue;
+			}
+		}
+
 		if (!pfn_valid_within(low_pfn))
 			continue;
 		nr_scanned++;

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]