Patch "mm: fix classzone_idx underflow in shrink_zones()" has been added to the 4.4-stable tree

<gregkh@xxxxxxxxxxxxxxxxxxx> · Fri, 07 Jul 2017 11:10:27 +0200

This is a note to let you know that I've just added the patch titled

    mm: fix classzone_idx underflow in shrink_zones()

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     mm-fix-classzone_idx-underflow-in-shrink_zones.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From vbabka@xxxxxxx  Fri Jul  7 11:06:31 2017
From: Vlastimil Babka <vbabka@xxxxxxx>
Date: Tue, 4 Jul 2017 10:45:43 +0200
Subject: mm: fix classzone_idx underflow in shrink_zones()
To: stable <stable@xxxxxxxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>, Minchan Kim <minchan@xxxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxxxx>, linux-mm <linux-mm@xxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Message-ID: <cf25f1a5-5276-90ea-1eac-f2a2aceffaef@xxxxxxx>

From: Vlastimil Babka <vbabka@xxxxxxx>

[Not upstream as that would take 34+ patches]

We've got reported a BUG in do_try_to_free_pages():

BUG: unable to handle kernel paging request at ffff8ffffff28990
IP: [<ffffffff8119abe0>] do_try_to_free_pages+0x140/0x490
PGD 0
Oops: 0000 [#1] SMP
megaraid_sas sg scsi_mod efivarfs autofs4
Supported: No, Unsupported modules are loaded
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
task: ffff88ffd0d4c540 ti: ffff88ffd0e48000 task.ti: ffff88ffd0e48000
RIP: 0010:[<ffffffff8119abe0>]  [<ffffffff8119abe0>] do_try_to_free_pages+0x140/0x490
RSP: 0018:ffff88ffd0e4ba60  EFLAGS: 00010206
RAX: 000006fffffff900 RBX: 00000000ffffffff RCX: ffff88fffff29000
RDX: 000000ffffffff00 RSI: 0000000000000003 RDI: 00000000024200c8
RBP: 0000000001320122 R08: 0000000000000000 R09: ffff88ffd0e4bbac
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88ffd0e4bae0
R13: 0000000000000e00 R14: ffff88fffff2a500 R15: ffff88fffff2b300
FS:  0000000000000000(0000) GS:ffff88ffe6440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8ffffff28990 CR3: 0000000001c0a000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 00000002db570a80 024200c80000001e ffff88fffff2b300 0000000000000000
 ffff88fffffd5700 ffff88ffd0d4c540 ffff88ffd0d4c540 ffffffff0000000c
 0000000000000000 0000000000000040 00000000024200c8 ffff88ffd0e4bae0
Call Trace:
 [<ffffffff8119afea>] try_to_free_pages+0xba/0x170
 [<ffffffff8118cf2f>] __alloc_pages_nodemask+0x53f/0xb20
 [<ffffffff811d39ff>] alloc_pages_current+0x7f/0x100
 [<ffffffff811e2232>] migrate_pages+0x202/0x710
 [<ffffffff815dadaa>] __offline_pages.constprop.23+0x4ba/0x790
 [<ffffffff81463263>] memory_subsys_offline+0x43/0x70
 [<ffffffff8144cbed>] device_offline+0x7d/0xa0
 [<ffffffff81392fa2>] acpi_bus_offline+0xa5/0xef
 [<ffffffff81394a77>] acpi_device_hotplug+0x21b/0x41f
 [<ffffffff8138dab7>] acpi_hotplug_work_fn+0x1a/0x23
 [<ffffffff81093cee>] process_one_work+0x14e/0x410
 [<ffffffff81094546>] worker_thread+0x116/0x490
 [<ffffffff810999ed>] kthread+0xbd/0xe0
 [<ffffffff815e4e7f>] ret_from_fork+0x3f/0x70

This translates to the loop in shrink_zone():

classzone_idx = requested_highidx;
while (!populated_zone(zone->zone_pgdat->node_zones +
					classzone_idx))
	classzone_idx--;

where no zone is populated, so classzone_idx becomes -1 (in RBX).

Added debugging output reveals that we enter the function with
sc->gfp_mask == GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE
requested_highidx = gfp_zone(sc->gfp_mask) == 2 (ZONE_NORMAL)

Inside the for loop, however:
gfp_zone(sc->gfp_mask) == 3 (ZONE_MOVABLE)

This means we have gone through this branch:

if (buffer_heads_over_limit)
    sc->gfp_mask |= __GFP_HIGHMEM;

This changes the gfp_zone() result, but requested_highidx remains unchanged.
On nodes where the only populated zone is movable, the inner while loop will
check only lower zones, which are not populated, and underflow classzone_idx.

To sum up, the bug occurs in configurations with ZONE_MOVABLE (such as when
booted with the movable_node parameter) and only in situations when
buffer_heads_over_limit is true, and there's an allocation with __GFP_MOVABLE
and without __GFP_HIGHMEM performing direct reclaim.

This patch makes sure that classzone_idx starts with the correct zone.

Mainline has been affected in versions 4.6 and 4.7, but the culprit commit has
been also included in stable trees.
In mainline, this has been fixed accidentally as part of 34-patch series (plus
follow-up fixes) "Move LRU page reclaim from zones to nodes", which makes the
mainline commit unsuitable for stable backport, unfortunately.

Fixes: 7bf52fb891b6 ("mm: vmscan: reclaim highmem zone if buffer_heads is over limit")
Obsoleted-by: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Debugged-by: Michal Hocko <mhocko@xxxxxxx>
Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2529,7 +2529,7 @@ static bool shrink_zones(struct zonelist
 		if (!populated_zone(zone))
 			continue;
 
-		classzone_idx = requested_highidx;
+		classzone_idx = gfp_zone(sc->gfp_mask);
 		while (!populated_zone(zone->zone_pgdat->node_zones +
 							classzone_idx))
 			classzone_idx--;


Patches currently in stable-queue which might be from vbabka@xxxxxxx are

queue-4.4/mm-fix-classzone_idx-underflow-in-shrink_zones.patch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>