On 9/12/18 10:27 AM, Gerald Schaefer wrote: > On Wed, 12 Sep 2018 15:39:33 +0200 > Michal Hocko <mhocko@xxxxxxxxxx> wrote: > >> On Wed 12-09-18 15:03:56, Gerald Schaefer wrote: >> [...] >>> BTW, those sysfs attributes are world-readable, so anyone can trigger >>> the panic by simply reading them, or just run lsmem (also available for >>> x86 since util-linux 2.32). OK, you need a special not-memory-block-aligned >>> mem= parameter and DEBUG_VM for poison check, but w/o DEBUG_VM you would >>> still access uninitialized struct pages. This sounds very wrong, and I >>> think it really should be fixed. >> >> Ohh, absolutely. Nobody is questioning that. The thing is that the >> code has been likely always broken. We just haven't noticed because >> those unitialized parts where zeroed previously. Now that the implicit >> zeroying is gone it is just visible. >> >> All that I am arguing is that there are many places which assume >> pageblocks to be fully initialized and plugging one place that blows up >> at the time is just whack a mole. We need to address this much earlier. >> E.g. by allowing only full pageblocks when adding a memory range. > > Just to make sure we are talking about the same thing: when you say > "pageblocks", do you mean the MAX_ORDER_NR_PAGES / pageblock_nr_pages > unit of pages, or do you mean the memory (hotplug) block unit? >From early discussion, it was about pageblock_nr_pages not about memory_block_size_bytes > > I do not see any issue here with MAX_ORDER_NR_PAGES / pageblock_nr_pages > pageblocks, and if there was such an issue, of course you are right that > this would affect many places. If there was such an issue, I would also > assume that we would see the new page poison warning in many other places. > > The bug that Mikhails patch would fix only affects code that operates > on / iterates through memory (hotplug) blocks, and that does not happen > in many places, only in the two functions that his patch fixes. Just to be clear, so memory is pageblock_nr_pages aligned, yet memory_block are larger and panic is still triggered? I ask, because 3075M is not 128M aligned. > > When you say "address this much earlier", do you mean changing the way > that free_area_init_core()/memmap_init() initialize struct pages, i.e. > have them not use zone->spanned_pages as limit, but rather align that > up to the memory block (not pageblock) boundary? > This was my initial proposal, to fix memmap_init() and initialize struct pages beyond the "end", and before the "start" to cover the whole section. But, I think Michal suggested (and he might correct me) to simply ignore unaligned memory to section memory much earlier: so anything that does not align to sparse order is not added at all to the system. I think Michal's proposal would simplify and strengthen memory mapping overall. Pavel