On 31.07.19 14:43, Michal Hocko wrote:
> On Wed 31-07-19 14:22:13, David Hildenbrand wrote:
>> Each memory block spans the same amount of sections/pages/bytes. The size
>> is determined before the first memory block is created. No need to store
>> what we can easily calculate - and the calculations even look simpler now.
>
> While this cleanup helps a bit, I am not sure this is really worth
> bothering. I guess we can agree when I say that the memblock interface
> is suboptimal (to put it mildly). Shouldn't we strive for making it
> a real hotplug API in the future? What do I mean by that? Why should
> be any memblock fixed in size? Shouldn't we have use hotplugable units
> instead (aka pfn range that userspace can work with sensibly)? Do we
> know of any existing userspace that would depend on the current single
> section res. 2GB sized memblocks?

Short story: It is already ABI (e.g.,
/sys/devices/system/memory/block_size_bytes) and has been around since
2005 (!), ever since we have had memory block devices.

I suspect that it is mainly used manually. But I might be wrong.

Long story: How would you want to number memory blocks? At least no
longer by phys index. For now, memory blocks are ordered and numbered by
their block id.

Admins might want to online parts of a DIMM MOVABLE/NORMAL, to more
reliably use huge pages but still have enough space for kernel memory
(e.g., page tables). They might like that a DIMM is actually a set of
memory blocks instead of one big chunk.

IOW: You can consider it a restriction to be able to add e.g., DIMMs
only as one bigger chunk.

>
> All that being said, I do not oppose to the patch but can we start
> thinking about the underlying memblock limitations rather than micro
> cleanups?

I am in favor of cleaning up what we have right now, rather than
expecting it to eventually change at some point in the future. (btw, I
highly doubt it will change)

-- 

Thanks,

David / dhildenb
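
To illustrate the "no need to store what we can easily calculate" point
from userspace, the block arithmetic can be sketched in a few lines of
Python. This is only a sketch under stated assumptions, not kernel code:
the helper names are hypothetical, it assumes block_size_bytes is
exported as a hexadecimal string without a 0x prefix, and it assumes a
block id is simply the physical address divided by the block size.

```python
from pathlib import Path

# sysfs directory of the memory block devices mentioned above
SYSFS_MEMORY = Path("/sys/devices/system/memory")


def parse_block_size(text: str) -> int:
    """Parse block_size_bytes (assumed: hex string without a 0x prefix)."""
    return int(text.strip(), 16)


def block_id(phys_addr: int, block_size: int) -> int:
    """Block id covering phys_addr (assumed: id = address // block size)."""
    return phys_addr // block_size


def block_range(blk: int, block_size: int) -> tuple[int, int]:
    """Inclusive physical address range spanned by block id blk."""
    start = blk * block_size
    return start, start + block_size - 1


def blocks_for_range(start: int, size: int, block_size: int) -> list[int]:
    """All block ids touched by [start, start + size), e.g. one DIMM."""
    first = start // block_size
    last = (start + size - 1) // block_size
    return list(range(first, last + 1))


if __name__ == "__main__":
    path = SYSFS_MEMORY / "block_size_bytes"
    if path.exists():  # only present on Linux with memory block devices
        size = parse_block_size(path.read_text())
        print(f"memory block size: {size} bytes")
        print(f"block id of 4 GiB: {block_id(4 << 30, size)}")
```

With 128 MiB blocks, a hypothetical 2 GiB DIMM plugged at 4 GiB would
span 16 consecutive blocks, each of which could be onlined separately
(e.g., some MOVABLE, some NORMAL).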