On Tue, Jul 02, 2024 at 08:57:57AM +0200, David Hildenbrand wrote:
>On 02.07.24 04:09, Wei Yang wrote:
>> The prefetchw() was introduced by an ancient patch[1].
>>
>> The change log says:
>>
>>     The basic idea is to free higher order pages instead of going
>>     through every single one. Also, some unnecessary atomic operations
>>     are done away with and replaced with non-atomic equivalents, and
>>     prefetching is done where it helps the most. For a more in-depth
>>     discussion of this patch, please see the linux-ia64 archives (topic
>>     is "free bootmem feedback patch").
>>
>> So that patch made several changes to improve bootmem freeing, of which
>> the most basic idea is freeing higher order pages. And as Matthew says,
>> "Itanium CPUs of this era had no prefetchers."
>>
>> I did 10 rounds of bootup tests before and after this change; the data
>> doesn't show that prefetchw() helps speed up bootmem freeing. The total
>> bootmem freeing time over the 10 rounds was even 5.2% faster after the
>> prefetchw() removal.
>
>I suspect this is noise, though.
>
>>
>> [1]: https://lore.kernel.org/linux-ia64/40F46962.4090604@xxxxxxx/
>>
>> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
>> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> CC: David Hildenbrand <david@xxxxxxxxxx>
>>
>> ---
>> The patch is based on mm-stable with David's change applied.
>> ---
>>  mm/page_alloc.c | 13 ++-----------
>>  1 file changed, 2 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 116ee33fd1ce..c46aedfc9a12 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1236,16 +1236,11 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
>>  	 */
>>  	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
>>  	    unlikely(context == MEMINIT_HOTPLUG)) {
>> -		prefetchw(p);
>> -		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
>> -			prefetchw(p + 1);
>> +		for (loop = 0; loop < nr_pages; loop++, p++) {
>>  			VM_WARN_ON_ONCE(PageReserved(p));
>>  			__ClearPageOffline(p);
>>  			set_page_count(p, 0);
>>  		}
>
>Something like:
>
>for (;;) {
>	...
>	if (++loop >= nr_pages)
>		break;
>	p++;
>}
>

So you'd prefer another version with this loop format? Something like
this?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c46aedfc9a12..5235015eba3d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1224,7 +1224,7 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
 {
 	unsigned int nr_pages = 1 << order;
 	struct page *p = page;
-	unsigned int loop;
+	unsigned int loop = 0;
 
 	/*
 	 * When initializing the memmap, __init_single_page() sets the refcount
@@ -1236,10 +1236,13 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
 	 */
 	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
 	    unlikely(context == MEMINIT_HOTPLUG)) {
-		for (loop = 0; loop < nr_pages; loop++, p++) {
+		for (;;) {
 			VM_WARN_ON_ONCE(PageReserved(p));
 			__ClearPageOffline(p);
 			set_page_count(p, 0);
+			if (++loop >= nr_pages)
+				break;
+			p++;
 		}
 
 		/*
@@ -1250,9 +1253,12 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
 		debug_pagealloc_map_pages(page, nr_pages);
 		adjust_managed_page_count(page, nr_pages);
 	} else {
-		for (loop = 0; loop < nr_pages; loop++, p++) {
+		for (;;) {
 			__ClearPageReserved(p);
 			set_page_count(p, 0);
+			if (++loop >= nr_pages)
+				break;
+			p++;
 		}
 
 		/* memblock adjusts totalram_pages() manually. */

>
>Might generate slightly better code, because we know that we execute the loop
>body at least once. We use that in set_ptes(), for example.
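
As an aside, the codegen point is easier to see outside the kernel.
Below is a minimal userspace sketch of the two loop shapes; the function
names and the plain integer buffer are illustrative only, not kernel
code:

#include <stdio.h>

/*
 * Top-tested loop: the compiler must allow for nr == 0, so the
 * condition is checked before the first iteration.
 */
static void clear_top_tested(unsigned int *p, unsigned int nr)
{
	unsigned int loop;

	for (loop = 0; loop < nr; loop++, p++)
		*p = 0;
}

/*
 * Bottom-tested shape suggested above: the body runs at least once,
 * which is safe only because the caller guarantees nr >= 1 (just as
 * nr_pages = 1 << order can never be zero in __free_pages_core()).
 * The compiler can emit a do-while with a single branch at the
 * bottom instead of an extra check at the top.
 */
static void clear_bottom_tested(unsigned int *p, unsigned int nr)
{
	unsigned int loop = 0;

	for (;;) {
		*p = 0;
		if (++loop >= nr)
			break;
		p++;
	}
}

int main(void)
{
	unsigned int buf[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

	clear_top_tested(buf, 4);
	clear_bottom_tested(buf + 4, 4);
	printf("%u %u\n", buf[0], buf[7]); /* both elements cleared: 0 0 */
	return 0;
}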
>
>> -		VM_WARN_ON_ONCE(PageReserved(p));
>> -		__ClearPageOffline(p);
>> -		set_page_count(p, 0);
>>  
>>  		/*
>>  		 * Freeing the page with debug_pagealloc enabled will try to
>> @@ -1255,14 +1250,10 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
>>  		debug_pagealloc_map_pages(page, nr_pages);
>>  		adjust_managed_page_count(page, nr_pages);
>>  	} else {
>> -		prefetchw(p);
>> -		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
>> -			prefetchw(p + 1);
>> +		for (loop = 0; loop < nr_pages; loop++, p++) {
>>  			__ClearPageReserved(p);
>>  			set_page_count(p, 0);
>>  		}
>> -		__ClearPageReserved(p);
>> -		set_page_count(p, 0);
>>  
>>  		/* memblock adjusts totalram_pages() manually. */
>>  		atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
>
>Much better
>
>Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>
>-- 
>Cheers,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me