On Tue, Aug 04, 2020 at 09:24:08AM +0200, David Hildenbrand wrote:
> Let's document what ZONE_MOVABLE means, how it's used, and which special
> cases we have regarding unmovable pages (memory offlining vs. migration /
> allocations).
>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxx>
> Cc: Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>
> Cc: Baoquan He <bhe@xxxxxxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Several nits below, otherwise

Acked-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>

> ---
>  include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f6f884970511d..600d449e7d9e9 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -372,6 +372,40 @@ enum zone_type {
>  	 */
>  	ZONE_HIGHMEM,
>  #endif
> +	/*
> +	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
> +	 * only contains movable pages. Main use cases are to make memory

"Primarily only" sounds awkward. Maybe

	... except that it only contains movable pages with a few
	exceptional cases described below.

And then

	Main use cases for ZONE_MOVABLE are ...

> +	 * offlining more likely to succeed, and to locally limit unmovable
> +	 * allocations - e.g., to increase the number of THP/huge pages.
> +	 * Notable special cases are:
> +	 *
> +	 * 1. Pinned pages: (Long-term) pinning of movable pages might

s/Long-term/long-term/, the capital L looks out of place to me.

> +	 *    essentially turn such pages unmovable. Memory offlining might
> +	 *    retry a long time.
> +	 * 2. memblock allocations: kernelcore/movablecore setups might create
> +	 *    situations where ZONE_MOVABLE contains unmovable allocations
> +	 *    after boot. Memory offlining and allocations fail early.
> +	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
> +	 *    boot memory, not hotplugged memory. Memory offlining and
> +	 *    allocations fail early.

I would clarify where the struct pages for absent memory come from.

> +	 * 4. PG_hwpoison pages: While poisoned pages can be skipped during
> +	 *    memory offlining, such pages cannot be allocated.
> +	 * 5. Unmovable PG_offline pages: In paravirtualized environments,
> +	 *    hotplugged memory blocks might only partially be managed by the
> +	 *    buddy (e.g., via XEN-balloon, Hyper-V balloon, virtio-mem). The
> +	 *    parts not manged by the buddy are unmovable PG_offline pages. In
> +	 *    some cases (virtio-mem), such pages can be skipped during
> +	 *    memory offlining, however, cannot be moved/allocated. These
> +	 *    techniques might use alloc_contig_range() to hide previously
> +	 *    exposed pages from the buddy again (e.g., to implement some sort
> +	 *    of memory unplug in virtio-mem).
> +	 *
> +	 * In general, no unmovable allocations that degrade memory offlining
> +	 * should end up in ZONE_MOVABLE. Allocators (like alloc_contig_range())
> +	 * have to expect that migrating pages in ZONE_MOVABLE can fail (even
> +	 * if has_unmovable_pages() states that there are no unmovable pages,
> +	 * there can be false negatives).
> +	 */
>  	ZONE_MOVABLE,
>  #ifdef CONFIG_ZONE_DEVICE
>  	ZONE_DEVICE,
> --
> 2.26.2
>

--
Sincerely yours,
Mike.