Hi Linus,

I attached a temporary fix, which I could not test, as I was unable to
reproduce the problem, but it should fix the issue.

Reverting "f7f99100d8d9 mm: stop zeroing memory during allocation in
vmemmap" would introduce a significant boot performance regression, as
we would zero the whole memmap twice during boot.

Later, I will introduce a more detailed fix that will get rid of
zero_resv_unavail() entirely, and instead will zero skipped struct
pages in memmap_init_zone(), where it should be done.

Thank you,
Pavel

On Fri, Jul 13, 2018 at 11:25 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Jul 13, 2018 at 8:04 PM Pavel Tatashin
> <pasha.tatashin@xxxxxxxxxx> wrote:
> >
> > > You can't just memset() the 'struct page' to zero after it's been set up.
> >
> > That should not be happening, unless there is a bug.
>
> Well, it does seem to happen. My memory stress-tester has been running
> for about half an hour now with the revert I posted - it used to
> trigger the problem in maybe ~5 minutes before.
>
> So I do think that revert fixes it for me. No guarantees, but since I
> figured out how to trigger it, it's been fairly reliable.
>
> > We want to zero those struct pages so we do not have uninitialized
> > data accessed by various parts of the code that rounds down large
> > pages and access the first page in section without verifying that the
> > page is valid. The example of this is described in commit that
> > introduced zero_resv_unavail()
>
> I'm attaching the relevant (?) parts of dmesg, which has the node
> ranges, maybe you can see what the problem with the code is.
>
> (NOTE! This dmesg is with that "mem=6G" command line option, which causes that
>
>     e820: remove [mem 0x180000000-0xfffffffffffffffe] usable
>
> line - that's just because it's my stress-test boot. It happens with
> or without it, but without the "mem=6G" it took days to trigger).
>
> I'm more than willing to test patches (either for added information or
> for testing fixes), although I think I'm getting off the computer for
> today.
>
>                 Linus
From 95259841ef79cc17c734a994affa3714479753e3 Mon Sep 17 00:00:00 2001
From: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
Date: Sat, 14 Jul 2018 09:15:07 -0400
Subject: [PATCH] mm: zero unavailable pages before memmap init

We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.

Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:

  https://www.spinics.net/lists/linux-mm/msg156764.html

Linus, also saw this bug on his machine, and confirmed that pulling
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.

The problem is that we incorrectly zero some struct pages after they
were setup.

The fix is to zero unavailable struct pages prior to initializing of
struct pages.

A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().

Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..5d800d61ddb7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6847,6 +6847,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
 	setup_nr_node_ids();
+	zero_resv_unavail();
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
 		free_area_init_node(nid, NULL,
@@ -6857,7 +6858,6 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 			node_set_state(nid, N_MEMORY);
 		check_for_memory(pgdat, nid);
 	}
-	zero_resv_unavail();
 }
 
 static int __init cmdline_parse_core(char *p, unsigned long *core,
@@ -7033,9 +7033,9 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 
 void __init free_area_init(unsigned long *zones_size)
 {
+	zero_resv_unavail();
 	free_area_init_node(0, zones_size,
 			__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
-	zero_resv_unavail();
 }
 
 static int page_alloc_cpu_dead(unsigned int cpu)
-- 
2.18.0