On 29.08.19 10:36, Michal Hocko wrote:
> On Mon 26-08-19 12:10:06, David Hildenbrand wrote:
>> Working on virtio-mem, I was able to trigger a kernel BUG (with debug
>> options enabled) when removing memory that was never onlined. I was able
>> to reproduce with DIMMs. As far as I can see the same can also happen
>> without debug configs enabled, if we're unlucky and the uninitialized
>> memmap contains selected garbage.
>
> Could you be more specific please?
>

The patches have more detail; also see the new series.

When shrinking zones, we look at all spanned pages so that we can shrink
as far as possible and skip over holes. While doing that, we might hit
uninitialized memmaps: pfn_valid() is true, but the memmap is not
initialized, so doing a pfn_to_nid() or page_zone() will end badly.

For !ZONE_DEVICE we can check SECTION_IS_ONLINE. That is a guarantee that
the whole memmap of the section is valid.

For ZONE_DEVICE we don't have anything similar. As we allow subsection
hotplug there (and therefore only subsections of the memmap are
initialized), we might even step on uninitialized memmaps without being
able to detect this.

--
Thanks,

David / dhildenb
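To illustrate the !ZONE_DEVICE case, here is a minimal user-space sketch,
not kernel code. Everything prefixed with toy_ or TOY_ is hypothetical and
only models the idea: a pfn can be "valid" (its section is present) while
the memmap behind it was never initialized, so the zone-shrinking walk
should only trust the memmap of online sections. The tiny section size and
the single zone_id field are assumptions made for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy geometry: 4 sections of 8 pages each (real sections are far larger). */
#define TOY_PAGES_PER_SECTION 8UL
#define TOY_NR_SECTIONS       4UL
#define TOY_NR_PFNS           (TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS)

/* Stand-in for struct page: only meaningful once the section was onlined. */
struct toy_page {
	int zone_id;
};

static bool toy_section_present[TOY_NR_SECTIONS]; /* memmap allocated */
static bool toy_section_online[TOY_NR_SECTIONS];  /* memmap initialized */
static struct toy_page toy_memmap[TOY_NR_PFNS];

/* Models pfn_valid(): true as soon as the section exists. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PFNS &&
	       toy_section_present[pfn / TOY_PAGES_PER_SECTION];
}

/* Models a SECTION_IS_ONLINE check: the whole section's memmap is valid. */
static bool toy_pfn_online(unsigned long pfn)
{
	return pfn < TOY_NR_PFNS &&
	       toy_section_online[pfn / TOY_PAGES_PER_SECTION];
}

/*
 * Find the lowest pfn in [start, end) that belongs to zone_id, or return
 * end if there is none. With check_online == false the walk trusts any
 * valid pfn and may read an uninitialized memmap entry.
 */
static unsigned long toy_find_zone_start(int zone_id, unsigned long start,
					 unsigned long end, bool check_online)
{
	unsigned long pfn;

	assert(start <= end);
	for (pfn = start; pfn < end; pfn++) {
		if (!toy_pfn_valid(pfn))
			continue;	/* hole */
		if (check_online && !toy_pfn_online(pfn))
			continue;	/* memmap may be garbage, don't touch it */
		if (toy_memmap[pfn].zone_id == zone_id)
			return pfn;
	}
	return end;
}

/*
 * Section 0: already removed (hole).
 * Section 1: added but never onlined; its "garbage" happens to say zone 1.
 * Section 2: online, really in zone 1.
 * Section 3: hole.
 */
static void toy_setup_example(void)
{
	unsigned long pfn;

	for (pfn = 0; pfn < TOY_NR_PFNS; pfn++) {
		unsigned long sec = pfn / TOY_PAGES_PER_SECTION;

		toy_section_present[sec] = (sec == 1 || sec == 2);
		toy_section_online[sec] = (sec == 2);
		toy_memmap[pfn].zone_id = 1;
	}
}
```

After toy_setup_example(), toy_find_zone_start(1, 8, 32, true) returns 16
(the first pfn of the online section), while the same call with
check_online == false returns 8, i.e. it believes the garbage in the
never-onlined section. The sketch has no equivalent for ZONE_DEVICE, where
no such per-section guarantee exists.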