Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs. migration /
allocations).

Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Mike Rapoport <rppt@xxxxxxxxxx>
Cc: Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>
Cc: Baoquan He <bhe@xxxxxxxxxx>
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
 include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f6f884970511d..600d449e7d9e9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -372,6 +372,40 @@ enum zone_type {
 	 */
 	ZONE_HIGHMEM,
 #endif
+	/*
+	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
+	 * only contains movable pages. Main use cases are to make memory
+	 * offlining more likely to succeed, and to locally limit unmovable
+	 * allocations - e.g., to increase the number of THP/huge pages.
+	 * Notable special cases are:
+	 *
+	 * 1. Pinned pages: (Long-term) pinning of movable pages might
+	 *    essentially turn such pages unmovable. Memory offlining might
+	 *    retry a long time.
+	 * 2. memblock allocations: kernelcore/movablecore setups might create
+	 *    situations where ZONE_MOVABLE contains unmovable allocations
+	 *    after boot. Memory offlining and allocations fail early.
+	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
+	 *    boot memory, not hotplugged memory. Memory offlining and
+	 *    allocations fail early.
+	 * 4. PG_hwpoison pages: While poisoned pages can be skipped during
+	 *    memory offlining, such pages cannot be allocated.
+	 * 5. Unmovable PG_offline pages: In paravirtualized environments,
+	 *    hotplugged memory blocks might only partially be managed by the
+	 *    buddy (e.g., via XEN-balloon, Hyper-V balloon, virtio-mem). The
+	 *    parts not managed by the buddy are unmovable PG_offline pages. In
+	 *    some cases (virtio-mem), such pages can be skipped during
+	 *    memory offlining, however, cannot be moved/allocated. These
+	 *    techniques might use alloc_contig_range() to hide previously
+	 *    exposed pages from the buddy again (e.g., to implement some sort
+	 *    of memory unplug in virtio-mem).
+	 *
+	 * In general, no unmovable allocations that degrade memory offlining
+	 * should end up in ZONE_MOVABLE. Allocators (like alloc_contig_range())
+	 * have to expect that migrating pages in ZONE_MOVABLE can fail (even
+	 * if has_unmovable_pages() states that there are no unmovable pages,
+	 * there can be false negatives).
+	 */
 	ZONE_MOVABLE,
 #ifdef CONFIG_ZONE_DEVICE
 	ZONE_DEVICE,
-- 
2.26.2
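
For readers who prefer a table to prose, the five special cases from the comment can be summarized as a small userspace C sketch. It is only a reading aid, not kernel code. Every identifier in it (movable_special_case, offline_outcome, alloc_outcome, special_case_behavior()) is hypothetical and exists neither in mmzone.h nor anywhere else in the kernel. Where the comment does not state how allocations behave (pinned pages), the sketch records "not stated" rather than guessing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; none exist in the kernel. */
enum movable_special_case {
	CASE_PINNED,		/* 1. (long-term) pinned pages */
	CASE_MEMBLOCK,		/* 2. memblock allocations */
	CASE_MEMORY_HOLE,	/* 3. memory holes (boot memory only) */
	CASE_HWPOISON,		/* 4. PG_hwpoison pages */
	CASE_PG_OFFLINE,	/* 5. unmovable PG_offline pages */
	NR_SPECIAL_CASES,
};

/* What the comment says memory offlining does for each case. */
enum offline_outcome {
	OFFLINE_MAY_RETRY_LONG,		/* "might retry a long time" */
	OFFLINE_FAILS_EARLY,		/* "fail early" */
	OFFLINE_CAN_SKIP,		/* page can be skipped */
	OFFLINE_CAN_SKIP_SOMETIMES,	/* skipped "in some cases" (virtio-mem) */
};

/* What the comment says about allocating such pages. */
enum alloc_outcome {
	ALLOC_NOT_STATED,	/* the comment makes no claim */
	ALLOC_FAILS_EARLY,	/* "allocations fail early" */
	ALLOC_IMPOSSIBLE,	/* "cannot be allocated" */
};

struct case_behavior {
	const char *name;
	enum offline_outcome offline;
	enum alloc_outcome alloc;
};

/* One row per numbered item in the ZONE_MOVABLE comment. */
static const struct case_behavior behaviors[NR_SPECIAL_CASES] = {
	[CASE_PINNED]      = { "pinned pages",
			       OFFLINE_MAY_RETRY_LONG, ALLOC_NOT_STATED },
	[CASE_MEMBLOCK]    = { "memblock allocations",
			       OFFLINE_FAILS_EARLY, ALLOC_FAILS_EARLY },
	[CASE_MEMORY_HOLE] = { "memory holes",
			       OFFLINE_FAILS_EARLY, ALLOC_FAILS_EARLY },
	[CASE_HWPOISON]    = { "PG_hwpoison pages",
			       OFFLINE_CAN_SKIP, ALLOC_IMPOSSIBLE },
	[CASE_PG_OFFLINE]  = { "unmovable PG_offline pages",
			       OFFLINE_CAN_SKIP_SOMETIMES, ALLOC_IMPOSSIBLE },
};

/* Look up the documented behavior for one special case. */
static const struct case_behavior *
special_case_behavior(enum movable_special_case c)
{
	assert((unsigned int)c < NR_SPECIAL_CASES);
	return &behaviors[c];
}
```

A caller would index the table by case, e.g. special_case_behavior(CASE_HWPOISON)->alloc. The table deliberately has no "always succeeds" outcome: per the last paragraph of the comment, migration in ZONE_MOVABLE can fail even when has_unmovable_pages() reports nothing.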