On Thu, Mar 28, 2019 at 04:31:44PM +0100, David Hildenbrand wrote:
> Correct me if I am wrong. I think I was confused - vmemmap data is still
> allocated *per memory block*, not for the whole added memory, correct?

No, vmemmap data is allocated per memory resource added.
In the case of a DIMM, that is the DIMM; in the case of a qemu
memory-device, that is the memory-device.
That assumes ACPI does not split the DIMM/memory-device into several
memory resources.
If that happens, acpi_memory_enable_device() calls __add_memory() for every
memory resource, which means that the vmemmap data will be allocated per
memory resource.
I have not seen this happen, though, and I am not sure under which
circumstances it can happen (I have to study the ACPI code a bit more).

The problem with allocating vmemmap data per memblock is fragmentation.
Let us say you do the following:

* memblock granularity 128M

(qemu) object_add memory-backend-ram,id=ram0,size=256M
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1

This will create two memblocks (2 sections), and if we allocate the vmemmap
data for each section within its own section (memblock), you only get 126M
of contiguous memory.

So, the approach taken is to allocate the vmemmap data corresponding to the
whole DIMM/memory-device/memory-resource from the beginning of its memory.

In the example above, the vmemmap data for both sections is allocated from
the beginning of the first section:

The memmap array takes 2MB per section, so 512 pfns.
If we add 2 sections:

[ pfn#0     ]  \
[ ...       ]   |  vmemmap used for memmap array
[ pfn#1023  ]  /

[ pfn#1024  ]  \
[ ...       ]   |  used as normal memory
[ pfn#65535 ]  /

So, out of 256M, we get 252M to use as real memory, as 4M will be used for
building the memmap array.

Actually, depending on how big a DIMM/memory-device is, it can happen that
the first memblock(s) are fully used for the memmap array (of course, this
can only be seen when adding a huge DIMM/memory-device).

-- 
Oscar Salvador
SUSE L3
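
For reference, the arithmetic in the example above can be checked with a small sketch. It is not kernel code: it assumes 4 KiB pages and a 64-byte struct page (neither is stated in the mail, but together they give the 2MB-per-128M-section figure quoted above), and the function names are made up for illustration.

```python
# Sketch of the vmemmap arithmetic from the mail above.
# Assumptions (not stated in the mail): 4 KiB pages, 64-byte struct page.
MiB = 1024 * 1024
PAGE_SIZE = 4096            # assumed base page size
STRUCT_PAGE_SIZE = 64       # assumed sizeof(struct page)
SECTION_SIZE = 128 * MiB    # memblock granularity from the example


def memmap_bytes(mem_bytes):
    """Size of the memmap array describing mem_bytes of memory."""
    return (mem_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE


def per_device_layout(device_bytes):
    """Layout when the whole memmap sits at the start of the device."""
    total_pfns = device_bytes // PAGE_SIZE
    # Round the memmap up to whole pages.
    memmap_pfns = -(-memmap_bytes(device_bytes) // PAGE_SIZE)
    return {
        "memmap_pfns": memmap_pfns,
        "first_usable_pfn": memmap_pfns,
        "last_pfn": total_pfns - 1,
        "usable_bytes": (total_pfns - memmap_pfns) * PAGE_SIZE,
    }


def per_section_contiguous_bytes():
    """Largest contiguous usable run if each section holds its own memmap."""
    return SECTION_SIZE - memmap_bytes(SECTION_SIZE)


def sections_fully_used_by_memmap(device_bytes):
    """Number of leading sections consumed entirely by the memmap array."""
    return memmap_bytes(device_bytes) // SECTION_SIZE


if __name__ == "__main__":
    layout = per_device_layout(256 * MiB)
    print("memmap pfns:", layout["memmap_pfns"])
    print("usable MiB (per device):", layout["usable_bytes"] // MiB)
    print("contiguous MiB (per section):", per_section_contiguous_bytes() // MiB)
    print("full memmap sections for 8 GiB:",
          sections_fully_used_by_memmap(8 * 1024 * MiB))
```

Under these assumptions the memmap is 1/64 of the memory it describes, so a device has to reach 8 GiB before its memmap fills an entire 128M memblock, which is the "huge DIMM" case mentioned at the end of the mail.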