On Mon 28-01-19 09:50:54, Andrew Morton wrote:
> On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > have pushed back on those fixes because I believed that it is much
> > better to plug the problem at the initialization time rather than play
> > whack-a-mole all over the hotplug code and find all the places which
> > expect the full memory section to be initialized. We have ended up with
> > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > memory section") merged, and it caused a regression [2][3]. The reason
> > is that there might be memory layouts where two NUMA nodes share the
> > same memory section, so the merged fix is simply incorrect.
> >
> > In order to plug this hole we really have to be zone range aware in
> > those handlers. I have split up the original patch into two. One is
> > unchanged (patch 2) and I took a different approach for `removable'
> > crash. It would be great if Mikhail could test it still works for his
> > memory layout.
> >
> > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@xxxxxxxxxxxxx
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > [3] http://lkml.kernel.org/r/20190125163938.GA20411@xxxxxxxxxxxxxx
>
> Any thoughts on which kernel version(s) need these patches?

My remark in 2830bf6f05fb still holds
: This has always been a problem AFAIU. It just went unnoticed because we
: have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
: zeroing memory during allocation in vmemmap") and so the above test
: would simply skip these ranges as belonging to zone 0 or provided a
: garbage.
:
: So I guess we do care for post f7f99100d8d9 kernels mostly and
: therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
: allocation in vmemmap")

But, please let's wait for the patch 1 to be confirmed to fix the issue.
-- 
Michal Hocko
SUSE Labs