Re: [PATCH] mm: Compute mTHP order efficiently

On 17.09.24 05:55, Dev Jain wrote:

> On 9/16/24 18:54, Matthew Wilcox wrote:
>> On Fri, Sep 13, 2024 at 02:49:02PM +0530, Dev Jain wrote:
>>> We use pte_range_none() to determine whether contiguous PTEs are empty
>>> for an mTHP allocation. Instead of iterating the while loop for every
>>> order, use some information, which is the first set PTE found, from the
>>> previous iteration, to eliminate some cases. The key to understanding
>>> the correctness of the patch is that the ranges we want to examine
>>> form a strictly decreasing sequence of nested intervals.
>> This is a lot more complicated.  Do you have any numbers that indicate
>> that it's faster?  Yes, it's fewer memory references, but you've gone
>> from a simple linear scan that's easy to prefetch to an exponential scan
>> that might confuse the prefetchers.
>
> I do have some numbers, I tested with a simple program, and also used
> ktime API, with the latter, enclosing from "order = highest_order(orders)"
> till "pte_unmap(pte)" (enclosing the entire while loop), a rough average
> estimate is that without the patch, it takes 1700 ns to execute, with the
> patch, on an average it takes 80 - 100ns less. I cannot think of a good
> testing program...

And that is likely what Willy is actually wondering about: does it have any real-world impact, or is the benefit just noise? :)
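
One way to judge signal versus noise is to look at the spread of the samples next to the average, not the average alone. Below is a minimal userspace sketch of that bookkeeping, assuming a running mean/variance (Welford) is good enough. The names lat_stats, lat_stats_add(), lat_stats_variance(), now_ns() and lat_stats_time() are made up for illustration. In the kernel the timestamps would come from something like ktime_get_ns() rather than clock_gettime().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: running mean and variance of latency samples. */
struct lat_stats {
	uint64_t n;	/* number of samples seen so far */
	double mean;	/* running mean, in ns */
	double m2;	/* sum of squared deviations from the mean */
};

/* Fold one sample into the running statistics (Welford's update). */
static void lat_stats_add(struct lat_stats *s, double sample_ns)
{
	double delta = sample_ns - s->mean;

	s->n++;
	s->mean += delta / (double)s->n;
	s->m2 += delta * (sample_ns - s->mean);
}

/* Sample variance; 0 until at least two samples exist. */
static double lat_stats_variance(const struct lat_stats *s)
{
	return s->n > 1 ? s->m2 / (double)(s->n - 1) : 0.0;
}

/* Monotonic timestamp in nanoseconds (userspace stand-in for ktime). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time a single call of fn(arg) and record its duration in s. */
static void lat_stats_time(struct lat_stats *s, void (*fn)(void *), void *arg)
{
	uint64_t t0;

	assert(fn != NULL);
	t0 = now_ns();
	fn(arg);
	lat_stats_add(s, (double)(now_ns() - t0));
}
```

If the difference between the patched and unpatched means is small compared with the standard deviation of either run, the 80 - 100 ns delta is hard to distinguish from noise.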

The change does not look too wild to me, but yes, it increases complexity.
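
For anyone following along, the nested-interval idea from the patch description can be modelled in userspace roughly like this. This is not the patch's code: the bool array stands in for a page table, first_set() stands in for a pte_range_none() variant that reports the first populated entry (an assumption about the patch's shape), and NR_PTES, TOP_ORDER and both pick_order_*() names are hypothetical. A return value of 0 models falling back to a single page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PTES   512	/* hypothetical: slots in one modelled page table */
#define TOP_ORDER 9	/* 1 << TOP_ORDER == NR_PTES */

/* Index of the first populated slot in [start, start + nr), or -1 if none. */
static long first_set(const bool *pte, size_t start, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (pte[start + i])
			return (long)(start + i);
	return -1;
}

/* Baseline: rescan the whole aligned range for every candidate order. */
static int pick_order_naive(const bool *pte, size_t idx, unsigned long orders)
{
	assert(idx < NR_PTES);
	for (int order = TOP_ORDER; order > 0; order--) {
		size_t nr = (size_t)1 << order;
		size_t start = idx & ~(nr - 1);

		if (!(orders & (1UL << order)))
			continue;
		if (first_set(pte, start, nr) < 0)
			return order;
	}
	return 0;
}

/*
 * Reuse the first populated slot found by the last scan. Each smaller
 * order's range is nested inside the previously scanned one, so:
 *   known >= end          -> range lies in the known-empty prefix: accept
 *   start <= known < end  -> range contains a populated slot: skip
 *   known < start         -> nothing known about the range: rescan
 */
static int pick_order_nested(const bool *pte, size_t idx, unsigned long orders)
{
	long known = -1;	/* first populated slot from the last scan */

	assert(idx < NR_PTES);
	for (int order = TOP_ORDER; order > 0; order--) {
		size_t nr = (size_t)1 << order;
		size_t start = idx & ~(nr - 1);
		size_t end = start + nr;

		if (!(orders & (1UL << order)))
			continue;
		if (known >= 0) {
			if ((size_t)known >= end)
				return order;
			if ((size_t)known >= start)
				continue;
		}
		known = first_set(pte, start, nr);
		if (known < 0)
			return order;
	}
	return 0;
}
```

Both functions should always agree on the chosen order; the second one just avoids some rescans. Whether skipping those scans is worth the less regular access pattern is exactly the question above, which this model does not answer.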

--
Cheers,

David / dhildenb




