On Tue 09-06-20 18:54:51, Daniel Jordan wrote:
[...]
> @@ -1390,6 +1391,15 @@ static unsigned long probe_memory_block_size(void)
>  		goto done;
>  	}
>  
> +	/*
> +	 * Use max block size to minimize overhead on bare metal, where
> +	 * alignment for memory hotplug isn't a concern.

This really begs for a clarification of why this is not a concern. Bare
metal can see physical memory hotadd as well. I just suspect that you do
not consider that to be very common, so it is not a big deal? I would
tend to agree, but we are still just going to wait until the first user
stumbles over this.

Btw. the memblock interface just doesn't scale and it is a terrible
interface for large machines and for memory hotplug in general (just
look at ppc and their insanely small memblocks).

Most use cases I have seen simply want to either offline some portion of
memory, without a strong requirement on the physical memory range as
long as it is from a particular node, or simply offline and remove the
full node.

I believe that we should think about a future interface rather than
trying to duct-tape the block size any time it causes problems. I would
even be tempted to simply add a kernel command line option

	memory_hotplug=disable,legacy,new_shiny

where disable would simply drop all the sysfs crud and speed up boot for
most users, who simply do not care about memory hotplug. new_shiny would
ideally provide an interface that would either export logically
hotpluggable memory ranges (e.g. DIMMs) or a query/action interface
which accepts physical ranges as input. Having gazillions of sysfs files
is simply unsustainable.
-- 
Michal Hocko
SUSE Labs