On Tue, Aug 10, 2021 at 05:21:48PM +0200, David Hildenbrand wrote:
> On 10.08.21 17:02, Kirill A. Shutemov wrote:
> > On Tue, Aug 10, 2021 at 09:48:04AM +0200, David Hildenbrand wrote:
> > > On 10.08.21 08:26, Kirill A. Shutemov wrote:
> > > > UEFI Specification version 2.9 introduces the concept of memory acceptance:
> > > > Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
> > > > require memory to be accepted before it can be used by the guest.
> > > > Accepting happens via a protocol specific to the Virtual Machine
> > > > platform.
> > > >
> > > > Accepting memory is costly and it makes VMM allocate memory for the
> > > > accepted guest physical address range. It's better to postpone memory
> > > > acceptance until memory is needed. It lowers boot time and reduces
> > > > memory overhead.
> > > >
> > > > Support of such memory requires a few changes in core-mm code:
> > > >
> > > >   - memblock has to accept memory on allocation;
> > > >
> > > >   - page allocator has to accept memory on the first allocation of the
> > > >     page;
> > > >
> > > > Memblock change is trivial.
> > > >
> > > > Page allocator is modified to accept pages on the first allocation.
> > > > PageOffline() is used to indicate that the page requires acceptance.
> > > > The flag is currently used by hotplug and balloon. Such pages are not
> > > > available to page allocator.
> > > >
> > > > An architecture has to provide three helpers if it wants to support
> > > > unaccepted memory:
> > > >
> > > >   - accept_memory() makes a range of physical addresses accepted.
> > > >
> > > >   - maybe_set_page_offline() marks a page PageOffline() if it requires
> > > >     acceptance. Used during boot to put pages on free lists.
> > > >
> > > >   - clear_page_offline() makes a page accepted and clears
> > > >     PageOffline().
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > > ---
> > > >  mm/internal.h   | 14 ++++++++++++++
> > > >  mm/memblock.c   |  1 +
> > > >  mm/page_alloc.c | 13 ++++++++++++-
> > > >  3 files changed, 27 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/internal.h b/mm/internal.h
> > > > index 31ff935b2547..d2fc8a17fbe0 100644
> > > > --- a/mm/internal.h
> > > > +++ b/mm/internal.h
> > > > @@ -662,4 +662,18 @@ void vunmap_range_noflush(unsigned long start, unsigned long end);
> > > >  int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
> > > >  		      unsigned long addr, int page_nid, int *flags);
> > > > +#ifndef CONFIG_UNACCEPTED_MEMORY
> > > > +static inline void maybe_set_page_offline(struct page *page, unsigned int order)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline void clear_page_offline(struct page *page, unsigned int order)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline void accept_memory(phys_addr_t start, phys_addr_t end)
> > > > +{
> > > > +}
> > >
> > > Can we find better fitting names for the first two? The function names are
> > > way too generic. For example:
> > >
> > > accept_or_set_page_offline()
> > >
> > > accept_and_clear_page_offline()
> >
> > Sounds good.
> >
> > > I thought for a second if
> > > PAGE_TYPE_OPS(Unaccepted, offline)
> > > makes sense as well, not sure.
> >
> > I find Offline fitting the situation. Don't see a reason to add more
> > terminology here.
> >
> > > Also, please update the description of PageOffline in page-flags.h to
> > > include the additional usage with PageBuddy set at the same time.
> >
> > Okay.
> >
> > > I assume you don't have to worry about page_offline_freeze/thaw ... as we
> > > only set PageOffline initially, but not later at runtime when other
> > > subsystems (/proc/kcore) might stumble over it.
> >
> > I think so, but I would need to look at this code once again.
> >
>
> Another thing to look into would be teaching makedumpfile via vmcoreinfo
> about these special buddy pages:
>
> makedumpfile will naturally skip all PageOffline pages and skip PageBuddy
> pages if requested to skip free pages. It detects these pages via the
> mapcount value. You will want makedumpfile to treat them like PageOffline
> pages: kernel/crash_core.c
>
> #define PAGE_BUDDY_MAPCOUNT_VALUE	(~PG_buddy)
> VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
> #define PAGE_OFFLINE_MAPCOUNT_VALUE	(~PG_offline)
> VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE);
>
> We could export PAGE_BUDDY_OFFLINE_MAPCOUNT_VALUE or just compute it inside
> makedumpfile from the other two values.

Thanks for digging it up. I'll look into makedumpfile, but it's not on top of
my todo list, so it may take a while.

-- 
 Kirill A. Shutemov
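
For reference, a minimal sketch of the "compute it inside makedumpfile" option quoted above. It assumes that each exported value is the complement of a single page-type bit, as the two #defines show, so a page that is both PageBuddy and PageOffline has both bits cleared and its value is the bitwise AND of the two exports. The helper names are hypothetical and exist neither in the kernel nor in makedumpfile; the exact-equality comparison is also an assumption about how the dump tool matches the raw mapcount word.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing kernel or makedumpfile function).
 *
 * buddy_value   = PAGE_BUDDY_MAPCOUNT_VALUE   read from vmcoreinfo (~PG_buddy)
 * offline_value = PAGE_OFFLINE_MAPCOUNT_VALUE read from vmcoreinfo (~PG_offline)
 *
 * A page carrying both types has both bits cleared, i.e.
 * ~(PG_buddy | PG_offline) == ~PG_buddy & ~PG_offline.
 */
static inline uint32_t buddy_offline_mapcount_value(uint32_t buddy_value,
						    uint32_t offline_value)
{
	return buddy_value & offline_value;
}

/*
 * Hypothetical check: true if the raw mapcount word matches the combined
 * buddy + offline value exactly (assumed comparison style).
 */
static inline bool page_is_buddy_offline(uint32_t raw_mapcount,
					 uint32_t buddy_value,
					 uint32_t offline_value)
{
	return raw_mapcount ==
	       buddy_offline_mapcount_value(buddy_value, offline_value);
}
```

With this approach no new vmcoreinfo export is needed; the trade-off is that the dump tool then depends on the "one cleared bit per type" encoding rather than on a value the kernel states explicitly.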