On 30.07.2018 14:05, Michal Hocko wrote:
> On Mon 30-07-18 13:53:06, David Hildenbrand wrote:
>> On 30.07.2018 13:30, Michal Hocko wrote:
>>> On Fri 27-07-18 18:54:54, David Hildenbrand wrote:
>>>> Right now, struct pages are initialized when memory is onlined, not
>>>> when it is added (since commit d0dc12e86b31 ("mm/memory_hotplug: optimize
>>>> memory hotplug")).
>>>>
>>>> remove_memory() will call arch_remove_memory(). Here, we usually access
>>>> the struct page to get the zone of the pages.
>>>>
>>>> So effectively, we access stale struct pages in case we remove memory that
>>>> was never onlined. So let's simply initialize them earlier, when the
>>>> memory is added. We only have to take care of updating the zone once we
>>>> know it. We can use a dummy zone for that purpose.
>>>
>>> I have considered something like this when I was reworking memory
>>> hotplug to not associate struct pages with zone before onlining and I
>>> considered this to be rather fragile. I would really not like to get
>>> back to that again if possible.
>>>
>>>> So effectively, all pages will already be initialized and set to
>>>> reserved after memory was added but before it was onlined (and even the
>>>> memblock is added). We only initialize pages once, to not degrade
>>>> performance.
>>>
>>> To be honest, I would rather see d0dc12e86b31 reverted. It is late in
>>> the release cycle and if the patch is buggy then it should be reverted
>>> rather than worked around. I found the optimization not really
>>> convincing back then and this is still the case TBH.
>>>
>>
>> If I am not wrong, that's already broken in 4.17, no? What about that?
>
> Ohh, I thought this was merged in 4.18.
> $ git describe --contains d0dc12e86b31 --match="v*"
> v4.17-rc1~99^2~44
>
> proves me wrong. This means that the fix is not so urgent as I thought.
> If you can figure out a reasonable fix then it should be preferable to
> the revert.
>
> Fake zone sounds too hackish to me though.
>

If I am not wrong, that's the same as what we had before d0dc12e86b31,
except that it is now explicit and a single value for all kernel configs
("ZONE_NORMAL").

Before d0dc12e86b31, struct pages were initialized to 0. So the zone was
(depending on the config) ZONE_DMA, ZONE_DMA32 or ZONE_NORMAL. Now the
value is random and might not even be a valid zone.

-- 

Thanks,

David / dhildenb
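
To illustrate the zeroed vs. stale struct page point, here is a rough
user-space sketch. It is not kernel code: the kernel does keep the zone
index in page->flags and reads it back via page_zonenum(), but the
model_* names, the MODEL_ZONE_* switches (standing in for
CONFIG_ZONE_DMA / CONFIG_ZONE_DMA32), the 3-bit field and its position
are all assumptions made up for the example.

```c
#include <stdbool.h>

/*
 * Simplified model of the zone enum. Which entry ends up with value 0
 * depends on the config, just like with the real enum zone_type.
 */
enum zone_type {
#ifdef MODEL_ZONE_DMA		/* hypothetical stand-in for CONFIG_ZONE_DMA */
	ZONE_DMA,
#endif
#ifdef MODEL_ZONE_DMA32		/* hypothetical stand-in for CONFIG_ZONE_DMA32 */
	ZONE_DMA32,
#endif
	ZONE_NORMAL,
	ZONE_MOVABLE,
	MAX_NR_ZONES
};

/* Assumed layout: zone index lives in the top 3 bits of flags. */
#define MODEL_ZONES_WIDTH	3
#define MODEL_ZONES_SHIFT	(sizeof(unsigned long) * 8 - MODEL_ZONES_WIDTH)
#define MODEL_ZONES_MASK	((1UL << MODEL_ZONES_WIDTH) - 1)

struct model_page {
	unsigned long flags;
};

/* Decode the zone index stored in the flags word. */
static unsigned int model_page_zonenum(const struct model_page *page)
{
	return (page->flags >> MODEL_ZONES_SHIFT) & MODEL_ZONES_MASK;
}

/* A decoded index is only meaningful if it names an existing zone. */
static bool model_zone_is_valid(const struct model_page *page)
{
	return model_page_zonenum(page) < MAX_NR_ZONES;
}

/* Explicitly store a zone index, e.g. a fixed one when memory is added. */
static void model_set_page_zone(struct model_page *page, enum zone_type zone)
{
	page->flags &= ~(MODEL_ZONES_MASK << MODEL_ZONES_SHIFT);
	page->flags |= ((unsigned long)zone & MODEL_ZONES_MASK) << MODEL_ZONES_SHIFT;
}
```

A zeroed model_page always decodes to index 0, i.e. whatever the first
enum entry is for the given config. A page with leftover flags can
decode to an index >= MAX_NR_ZONES. Setting the zone explicitly makes
the result independent of both.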