Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

Hello,

I hope it's still possible to revive this thread. Please find my comments below.

On 12.09.2018 16:40, Pasha Tatashin wrote:


On 9/12/18 10:27 AM, Gerald Schaefer wrote:
On Wed, 12 Sep 2018 15:39:33 +0200
Michal Hocko <mhocko@xxxxxxxxxx> wrote:

On Wed 12-09-18 15:03:56, Gerald Schaefer wrote:
[...]
BTW, those sysfs attributes are world-readable, so anyone can trigger
the panic by simply reading them, or just run lsmem (also available for
x86 since util-linux 2.32). OK, you need a special not-memory-block-aligned
mem= parameter and DEBUG_VM for poison check, but w/o DEBUG_VM you would
still access uninitialized struct pages. This sounds very wrong, and I
think it really should be fixed.

Ohh, absolutely. Nobody is questioning that. The thing is that the
code has likely always been broken. We just haven't noticed because
those uninitialized parts were zeroed previously. Now that the implicit
zeroing is gone it is just visible.

All that I am arguing is that there are many places which assume
pageblocks to be fully initialized and plugging one place that blows up
at the time is just whack a mole. We need to address this much earlier.
E.g. by allowing only full pageblocks when adding a memory range.

Just to make sure we are talking about the same thing: when you say
"pageblocks", do you mean the MAX_ORDER_NR_PAGES / pageblock_nr_pages
unit of pages, or do you mean the memory (hotplug) block unit?

From the earlier discussion, it was about pageblock_nr_pages, not about
memory_block_size_bytes.


I do not see any issue here with MAX_ORDER_NR_PAGES / pageblock_nr_pages
pageblocks, and if there was such an issue, of course you are right that
this would affect many places. If there was such an issue, I would also
assume that we would see the new page poison warning in many other places.

The bug that Mikhail's patch would fix only affects code that operates
on / iterates through memory (hotplug) blocks, and that does not happen
in many places, only in the two functions that his patch fixes.

Just to be clear: memory is pageblock_nr_pages aligned, yet the
memory blocks are larger and the panic is still triggered?

I ask, because 3075M is not 128M aligned.


When you say "address this much earlier", do you mean changing the way
that free_area_init_core()/memmap_init() initialize struct pages, i.e.
have them not use zone->spanned_pages as limit, but rather align that
up to the memory block (not pageblock) boundary?


This was my initial proposal, to fix memmap_init() and initialize struct
pages beyond the "end", and before the "start" to cover the whole
section. But, I think Michal suggested (and he might correct me) to
simply ignore unaligned memory to section memory much earlier: so
anything that does not align to sparse order is not added at all to the
system.


I tried both approaches but each of them has issues.

First I tried to ignore unaligned memory early by adjusting the memory_end value. However, the kernel mem= parameter parsing and the memory_end calculation take place in architecture code, and adjusting the value afterwards in common code might be too late in my view. Also, with this approach we might lose up to an entire section of memory (256 MB on s390) just because of unfortunate alignment.
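
As a rough illustration of how much memory such trimming could cost, here is a small sketch of the arithmetic. It is not kernel code: the helper names are made up for this example, the 256 MB section size is the s390 value mentioned above, and 3075M is the mem= value discussed earlier in the thread.

```python
MB = 1 << 20
SECTION_SIZE = 256 * MB  # section size on s390, as mentioned above


def align_down(addr, alignment):
    """Round addr down to a multiple of alignment (a power of two)."""
    return addr & ~(alignment - 1)


def trim_to_section(memory_end):
    """Hypothetical helper: return (trimmed end, bytes lost) after
    aligning memory_end down to the section boundary."""
    trimmed = align_down(memory_end, SECTION_SIZE)
    return trimmed, memory_end - trimmed


for mem_mb in (3075, 3327):
    end, lost = trim_to_section(mem_mb * MB)
    print(f"mem={mem_mb}M -> usable end {end // MB}M, lost {lost // MB}M")
```

With mem=3075M only 3 MB would be dropped, but an end just below the next section boundary (3327M in this sketch) would lose 255 MB.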

Another approach was "to fix memmap_init() and initialize struct pages beyond the end". Since struct pages are allocated section-wise, we can try to round the size parameter passed to memmap_init() up to the section boundary, thus forcing the mapping initialization for the entire section. But this leads to another VM_BUG_ON panic: the zone_spans_pfn() sanity check is triggered from set_pageblock_migratetype() for the first page of each pageblock beyond the zone end.
    page dumped because: VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn))
    Call Trace:
    ([<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140)
     [<00000000003014aa>] set_pageblock_migratetype+0x5a/0x70
     [<0000000000bef706>] memmap_init_zone+0x25e/0x2e0
     [<00000000010fc3d8>] free_area_init_node+0x530/0x558
     [<00000000010fcf02>] free_area_init_nodes+0x81a/0x8f0
     [<00000000010e7fdc>] paging_init+0x124/0x130
     [<00000000010e4dfa>] setup_arch+0xbf2/0xcc8
     [<00000000010de9e6>] start_kernel+0x7e/0x588
     [<000000000010007c>] startup_continue+0x7c/0x300
    INFO: lockdep is turned off.
    Last Breaking-Event-Address:
     [<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140
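
To show why rounding the initialized range up to the section boundary trips this check, here is a simplified model. It rests on several assumptions that are not taken from the kernel source: a single zone starting at pfn 0, 4 KB pages, and a pageblock of 256 pages. The function first_unspanned_pageblock() is hypothetical; zone_spans_pfn() only mimics the range test that the kernel check of the same name performs.

```python
PAGE_SHIFT = 12                            # assumption: 4 KB pages
PAGES_PER_MB = (1 << 20) >> PAGE_SHIFT     # 256 pages per MB
SECTION_NR_PAGES = 256 * PAGES_PER_MB      # 256 MB section, as on s390
PAGEBLOCK_NR_PAGES = 256                   # assumption: 1 MB pageblocks


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def zone_spans_pfn(zone_start_pfn, spanned_pages, pfn):
    """Model of the range test: is pfn inside [start, start + spanned)?"""
    return zone_start_pfn <= pfn < zone_start_pfn + spanned_pages


def first_unspanned_pageblock(zone_start_pfn, spanned_pages):
    """Hypothetical: walk pageblocks up to the section-aligned end and
    return the first pageblock start pfn that the zone does not span."""
    init_end = align_up(zone_start_pfn + spanned_pages, SECTION_NR_PAGES)
    first = align_up(zone_start_pfn, PAGEBLOCK_NR_PAGES)
    for pfn in range(first, init_end, PAGEBLOCK_NR_PAGES):
        if not zone_spans_pfn(zone_start_pfn, spanned_pages, pfn):
            return pfn
    return None


# mem=3075M: the zone ends inside a section, so pageblocks past it fail.
print(first_unspanned_pageblock(0, 3075 * PAGES_PER_MB))
# mem=3072M: the zone ends on a section boundary, nothing fails.
print(first_unspanned_pageblock(0, 3072 * PAGES_PER_MB))
```

In this model the zone's spanned range is left unchanged while the initialization loop runs past it, so every pageblock between the zone end and the section end fails the range test.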
We might skip this check for the struct pages beyond the "end", but I'm not sure about the further implications. Why don't we stay, for now, with my original proposal of fixing the specific functions used by the memory hotplug sysfs handlers? Please tell me what you think.

Thanks,
Mikhail Zaslonko



