Re: [PATCH] mm/page_alloc: remove prefetchw() on freeing page to buddy system

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02.07.24 04:09, Wei Yang wrote:
The prefetchw() is introduced from an ancient patch[1].

The change log says:

     The basic idea is to free higher order pages instead of going
     through every single one.  Also, some unnecessary atomic operations
     are done away with and replaced with non-atomic equivalents, and
     prefetching is done where it helps the most.  For a more in-depth
     discusion of this patch, please see the linux-ia64 archives (topic
     is "free bootmem feedback patch").

So there are several changes improve the bootmem freeing, in which the
most basic idea is freeing higher order pages. And as Matthew says,
"Itanium CPUs of this era had no prefetchers."

I did 10 round bootup tests before and after this change, the data
doesn't prove prefetchw() help speeding up bootmem freeing. The sum of
the 10 round bootmem freeing time after prefetchw() removal even 5.2%
faster than before.

I suspect this is noise, though.


[1]: https://lore.kernel.org/linux-ia64/40F46962.4090604@xxxxxxx/

Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
CC: David Hildenbrand <david@xxxxxxxxxx>

---
The patch is based on mm-stable with David's change.
---
  mm/page_alloc.c | 13 ++-----------
  1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 116ee33fd1ce..c46aedfc9a12 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1236,16 +1236,11 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
  	 */
  	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
  	    unlikely(context == MEMINIT_HOTPLUG)) {
-		prefetchw(p);
-		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-			prefetchw(p + 1);
+		for (loop = 0; loop < nr_pages; loop++, p++) {
  			VM_WARN_ON_ONCE(PageReserved(p));
  			__ClearPageOffline(p);
  			set_page_count(p, 0);
  		}

Something like:

for (;;) {
	...
	if (++loop >= nr_pages)
		break;
	p++;
}


Might generate slightly better code, because we know that we execute the loop body at least once. We use that in set_ptes(), for example.

-		VM_WARN_ON_ONCE(PageReserved(p));
-		__ClearPageOffline(p);
-		set_page_count(p, 0);
/*
  		 * Freeing the page with debug_pagealloc enabled will try to
@@ -1255,14 +1250,10 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
  		debug_pagealloc_map_pages(page, nr_pages);
  		adjust_managed_page_count(page, nr_pages);
  	} else {
-		prefetchw(p);
-		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-			prefetchw(p + 1);
+		for (loop = 0; loop < nr_pages; loop++, p++) {
  			__ClearPageReserved(p);
  			set_page_count(p, 0);
  		}
-		__ClearPageReserved(p);
-		set_page_count(p, 0);
/* memblock adjusts totalram_pages() manually. */
  		atomic_long_add(nr_pages, &page_zone(page)->managed_pages);

Much better

Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux