Re: [PATCH 4/4] mm/page_alloc: no need to ClearPageReserved on giving page to buddy system

On Sat, Jun 29, 2024 at 05:45:53PM +0100, Matthew Wilcox wrote:
>On Sat, Jun 29, 2024 at 05:28:34PM +0100, Matthew Wilcox wrote:
>> On Sat, Jun 29, 2024 at 08:44:11AM +0000, Wei Yang wrote:
>> > Per my understanding, prefetchw() is trying to load data to cache before we
>> > really accessing it. By doing so, we won't hit a cache miss when we really
>> > need it.
>> 
>> Yes, but the CPU can also do this by itself without needing an explicit
>> hint from software.  It can notice that we have a loop that's accessing
>> successive cachelines for write.  This is approximately the easiest
>> prefetcher to design.

That was my first intuition too on reading the code, but I *believed* there
was a reason for the prefetchw() being put here.

>
>I tracked down prefetchw() being added:
>
>commit 3b901ea58a56
>Author: Josh Aas <josha@xxxxxxx>
>Date:   Mon Aug 23 21:26:54 2004 -0700
>
>    [PATCH] improve speed of freeing bootmem
>
>    Attached is a patch that greatly improves the speed of freeing boot memory.
>     On ia64 machines with 2GB or more memory (I didn't test with less, but I
>    can't imagine there being a problem), the speed improvement is about 75%
>    for the function free_all_bootmem_core.  This translates to savings on the
>    order of 1 minute / TB of memory during boot time.  That number comes from
>    testing on a machine with 512GB, and extrapolating based on profiling of an
>    unpatched 4TB machine.  For 4 and 8 TB machines, the time spent in this
>    function is about 1 minutes/TB, which is painful especially given that
>    there is no indication of what is going on put to the console (this issue
>    to possibly be addressed later).
>
>    The basic idea is to free higher order pages instead of going through every
>    single one.  Also, some unnecessary atomic operations are done away with
>    and replaced with non-atomic equivalents, and prefetching is done where it
>    helps the most.  For a more in-depth discusion of this patch, please see
>    the linux-ia64 archives (topic is "free bootmem feedback patch").
>
>(quoting the entire commit message because it's buried in linux-fullhistory,
>being a pre-git patch).  For the thread he's referring to, see
>https://lore.kernel.org/linux-ia64/40F46962.4090604@xxxxxxx/
>

Thanks for digging up the ancient history. And now I know about the
linux-fullhistory tree :-)

>Itanium CPUs of this era had no prefetchers.

Oops, so sad to hear it.

So it looks like the most helpful changes in that patch were:

  * freeing higher order pages
  * using non-atomic equivalents

A quick test on my 6G qemu virtual machine suggests the prefetchw() here is
not helpful. After removing it, the meminit process is even a little faster:
from 135ms to 121ms (about 10% less), summed over 3 boot tests.
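
For readers without the source at hand, here is a minimal user-space sketch
of the pattern being discussed. It is not the kernel code: struct fake_page
and both function names are hypothetical, and GCC/Clang's
__builtin_prefetch(addr, 1, 3) is assumed as a stand-in for prefetchw().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct fake_page {
	unsigned long flags;
	int refcount;
};

/*
 * Reset every entry, issuing a write-prefetch hint for the next entry
 * before touching the current one. __builtin_prefetch(addr, 1, 3) asks
 * for a prefetch-for-write with high temporal locality (assumed here
 * as an approximation of prefetchw()).
 */
static void reset_with_prefetch(struct fake_page *p, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (i + 1 < nr)
			__builtin_prefetch(&p[i + 1], 1, 3);
		p[i].flags = 0;
		p[i].refcount = 0;
	}
}

/* The same loop with no software hint, leaving it to the hardware. */
static void reset_plain(struct fake_page *p, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		p[i].flags = 0;
		p[i].refcount = 0;
	}
}
```

Both functions leave the array in the same state; the only difference is the
explicit hint, which a hardware prefetcher may make redundant on a simple
sequential walk like this one.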

If you think it is OK, I will prepare a patch on top of current mm-stable so
that a wider audience can review it.

-- 
Wei Yang
Help you, Help me



