Re: [RFC PATCH 1/4] mcpage: add size/mask/shift definition for multiple consecutive page

"Yin, Fengwei" <fengwei.yin@xxxxxxxxx> · Tue, 10 Jan 2023 10:53:03 +0800

On 1/9/2023 9:24 PM, Matthew Wilcox wrote:
> On Mon, Jan 09, 2023 at 03:22:29PM +0800, Yin Fengwei wrote:
>> The idea of the multiple consecutive page (abbr as "mcpage") is using
>> collection of physical contiguous 4K page other than huge page for
>> anonymous mapping.
> 
> This is what folios are for.  You have an interesting demonstration
> here that shows that moving to larger folios for anonymous memory
> is worth doing (thank you!) but you're missing several of the advantages
> of folios by going off and doing your own thing.
Yes. Folio and mcpage share some advantages.

> 
>> The size of mcpage can be configured. The default value of 16K size is
>> just picked up arbitrarily. User should choose the value according to the
>> result of tuning their workload with different mcpage size.
> 
> Uh, no.  We don't do these kinds of config options any more (or boot-time
> options as you mention later).  The size of a folio allocated for a given
> VMA should be adaptive based on observing how the program is using memory.
> There will likely be many different sizes of folio present in a given VMA.
I had two concerns about adaptive folio sizes:
1. Allocating a large folio could have high tail latency, which some
   workloads cannot tolerate. Would it be good to let the user define
   the size?
2. Could having folios of different sizes in the system fragment memory
   as a whole?

> 
>> To have physical contiguous pages, high order pages is allocated (order
>> is calculated according to mcpage size). Then the high order page will
>> be split. By doing this, each sub page of mcpage is just normal 4K page.
>> The current kernel page management infrastructure is applied to "mc"
>> pages without any change.
> 
> This is somewhere that you're losing an advantage of folios.  By keeping
> all the pages together, they get managed as a single unit.  That shrinks
> the length of the LRU list and reduces lock contention.  It also reduces
> the number of cache lines which are modified as, eg, we only need to
> keep track of one dirty bit for many pages.
Yes. The LRU list/lock benefit is provided by folios.

As for the dirty bit: with one dirty bit for many pages, a single dirty
sub-page of a folio requires all of its sub-pages to be written out, which
puts pressure on storage. But yes, the other bits can benefit from fewer
cache line modifications.
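
To put a rough number on the dirty-bit concern, here is a minimal sketch of
the arithmetic. It is not kernel code: SUBPAGE_SIZE and both helper names
are hypothetical, and a 4K base page is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SUBPAGE_SIZE 4096UL	/* assumed 4K base page */

/*
 * Bytes written out when dirtiness is tracked once per folio:
 * any dirty sub-page forces the whole folio to be written.
 */
static size_t writeback_per_folio(size_t nr_subpages, size_t nr_dirty)
{
	assert(nr_dirty <= nr_subpages);
	return nr_dirty ? nr_subpages * SUBPAGE_SIZE : 0;
}

/* Bytes written out when every sub-page carries its own dirty bit. */
static size_t writeback_per_subpage(size_t nr_subpages, size_t nr_dirty)
{
	assert(nr_dirty <= nr_subpages);
	return nr_dirty * SUBPAGE_SIZE;
}
```

With a 16K folio (4 sub-pages) and one dirty sub-page, the first helper
gives 16384 bytes and the second gives 4096 bytes, so the worst case is a
4x difference in data written.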

Regards
Yin, Fengwei

> 
>> To reduce the page fault number, multiple page table entries are populated
>> in one page fault with sub pages pfn of mcpage. This also brings a little
>> bit cost of memory consumption.
> 
> That needs to be done for folios.  It's a long way down my todo list,
> so if you wanted to take it on, it would be very much appreciated!
>
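
For reference, the size/mask/shift relationship from the subject line and
the order calculation quoted above could look roughly like the sketch below.
The MC_* names and both helpers are hypothetical and are not taken from the
patch; a 4K base page and the 16K default are assumed.

```c
#include <assert.h>

#define BASE_PAGE_SHIFT	12UL				/* assumed 4K base page */
#define MC_ORDER	2UL				/* 16K default: 4 sub-pages */
#define MC_SHIFT	(BASE_PAGE_SHIFT + MC_ORDER)
#define MC_SIZE		(1UL << MC_SHIFT)
#define MC_MASK		(~(MC_SIZE - 1))
#define MC_NR_PAGES	(1UL << MC_ORDER)		/* PTEs populated per fault */

/* Start of the mcpage-aligned range that contains the faulting address. */
static unsigned long mc_fault_start(unsigned long addr)
{
	return addr & MC_MASK;
}

/* Index of the 4K sub-page within that range that addr falls into. */
static unsigned long mc_subpage_index(unsigned long addr)
{
	return (addr & ~MC_MASK) >> BASE_PAGE_SHIFT;
}
```

A fault at 0x5123 would then map to the range starting at 0x4000, with the
faulting address in sub-page 1 of MC_NR_PAGES = 4.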