On Tue, 7 May 2019 13:18:06 -0400
Sasha Levin <sashal@xxxxxxxxxx> wrote:

> On Tue, May 07, 2019 at 10:15:19AM -0700, Linus Torvalds wrote:
> >On Tue, May 7, 2019 at 10:02 AM Sasha Levin <sashal@xxxxxxxxxx> wrote:
> >>
> >> I got it wrong then. I'll fix it up and get efad4e475c31 in instead.
> >
> >Careful. That one had a bug too, and we have 891cb2a72d82 ("mm,
> >memory_hotplug: fix off-by-one in is_pageblock_removable").
> >
> >All of these were *horribly* and subtly buggy, and might be
> >intertwined with other issues. And only trigger on a few specific
> >machines where the memory map layout is just right to trigger some
> >special case or other, and you have just the right config.
> >
> >It might be best to verify with Michal Hocko. Michal?
>
> Michal, is there a testcase I can plug into kselftests to make sure we
> got this right (and don't regress)? We care a lot about memory hotplug
> working right.

We hit the panics on s390 with special z/VM memory layout, but they both
can be triggered simply by using mem= kernel parameter (and
CONFIG_DEBUG_VM_PGFLAGS=y).

With "mem=3075M" (and w/o the commits efad4e475c31 + 24feb47c5fa5), it
can be triggered by reading from
/sys/devices/system/memory/memory<x>/valid_zones, or from
/sys/devices/system/memory/memory<x>/removable, with <x> being the last
memory block. This is with 256MB section size and memory block size.

On LPAR, with 256MB section size and 1GB memory block size, for some
reason the "removable" issue doesn't trigger, only the "valid_zones"
issue.

Using lsmem will also trigger it, as it reads both the valid_zones and
the removable attribute for all memory blocks. So, a test with
not-section-aligned mem= parameter and using lsmem could be an option.

Regards,
Gerald
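The check suggested above could be sketched roughly as follows. This is only a minimal illustration, not an actual kselftest: it reads the valid_zones and removable attributes of every memory block, much as lsmem does, and it assumes the kernel was booted with a mem= value that is not section-aligned (for example mem=3075M with a 256MB section size). The helper names is_section_aligned and read_memory_block_attrs are hypothetical and only serve the example.

```python
import os
import re


def is_section_aligned(mem_mib, section_mib=256):
    """Return True if a mem= value (in MiB) is a multiple of the section size.

    The 256 MiB default is the section size mentioned in this thread;
    other configurations may use a different value.
    """
    return mem_mib % section_mib == 0


def read_memory_block_attrs(sysfs_root="/sys/devices/system/memory",
                            attrs=("valid_zones", "removable")):
    """Read the given attributes of every memory<N> directory under sysfs_root.

    Returns a dict mapping the block number to a dict of attribute values.
    An attribute that is missing or unreadable is recorded as None.
    On an affected kernel the read itself is what triggers the panic,
    so nothing beyond reading is attempted here.
    """
    results = {}
    if not os.path.isdir(sysfs_root):
        return results
    for name in os.listdir(sysfs_root):
        match = re.fullmatch(r"memory(\d+)", name)
        if match is None:
            continue  # skip entries that are not memory block directories
        values = {}
        for attr in attrs:
            path = os.path.join(sysfs_root, name, attr)
            try:
                with open(path) as f:
                    values[attr] = f.read().strip()
            except OSError:
                values[attr] = None
        results[int(match.group(1))] = values
    return results


if __name__ == "__main__":
    # Walk the blocks in numeric order, ending with the last block,
    # which is the one reported to trigger the problem.
    blocks = read_memory_block_attrs()
    for number in sorted(blocks):
        print(number, blocks[number])
```

With mem=3075M and 256MB sections, is_section_aligned(3075) is False, since 12 * 256 = 3072 leaves 3MB over. On a kernel with the fixes, the loop should simply complete and print one line per memory block.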