Re: [RFC v5 0/3] mm: make swapin readahead to gain more thp performance

Rik van Riel <riel@xxxxxxxxxx> · Thu, 25 Feb 2016 17:35:50 -0500

On Wed, 2016-02-24 at 23:36 -0800, Hugh Dickins wrote:
> 
> Doesn't this imply that __collapse_huge_page_swapin() will initiate all
> the necessary swapins for a THP, then (given the FAULT_FLAG_ALLOW_RETRY)
> not wait for them to complete, so khugepaged will give up on that extent
> and move on to another; then after another full circuit of all the mms
> it needs to examine, it will arrive back at this extent and build a THP
> from the swapins it arranged last time.
> 
> Which may work well when a system transitions from busy+swappingout
> to idle+swappingin, but isn't that rather a special case?  It feels
> (meaning, I've not measured at all) as if the inbetween busyish case
> will waste a lot of I/O and memory on swapins that have to be discarded
> again before khugepaged has made its sedate way back to slotting them in.
> 

There may be a fairly simple way to prevent
that from becoming an issue.

When khugepaged wakes up, it can check the
PGSWPOUT or even the PGSTEAL_* stats for
the system, and skip swapin readahead if
there was swapout activity (or any page
reclaim activity?) since the time it last
ran.

That way the swapin readahead will do
its thing when transitioning from
busy + swapout to idle + swapin, but not
while the system is under permanent memory
pressure.
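
To make the idea concrete, here is a rough userspace sketch of the
check, not a patch against khugepaged. It assumes the counters are
read as vmstat-style "name value" text, with swapout exported as
"pswpout" and the reclaim counters as lines starting with "pgsteal_".
The struct and function names (reclaim_snapshot, parse_vmstat,
swapin_readahead_allowed) are made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical snapshot of the counters taken at each wakeup. */
struct reclaim_snapshot {
	unsigned long long swapout;	/* "pswpout" line */
	unsigned long long steal;	/* sum of all "pgsteal_*" lines */
};

/* Parse vmstat-style "name value" lines from a text buffer. */
static struct reclaim_snapshot parse_vmstat(const char *text)
{
	struct reclaim_snapshot snap = { 0, 0 };
	char name[64];
	unsigned long long value;
	int consumed;

	/* %n records how far sscanf got, so we can step to the next line. */
	while (sscanf(text, " %63s %llu%n", name, &value, &consumed) == 2) {
		if (strcmp(name, "pswpout") == 0)
			snap.swapout = value;
		else if (strncmp(name, "pgsteal_", 8) == 0)
			snap.steal += value;
		text += consumed;
	}
	return snap;
}

/*
 * Decide whether to do swapin readahead on this pass.
 * any_reclaim == false: skip only if pages were swapped out since the
 *                       previous snapshot.
 * any_reclaim == true:  also skip if any page reclaim happened.
 */
static bool swapin_readahead_allowed(const struct reclaim_snapshot *prev,
				     const struct reclaim_snapshot *now,
				     bool any_reclaim)
{
	if (now->swapout != prev->swapout)
		return false;
	if (any_reclaim && now->steal != prev->steal)
		return false;
	return true;
}
```

The caller would keep the snapshot from its previous wakeup, take a
new one, and only start swapins when the function returns true. The
any_reclaim flag corresponds to the "or any page reclaim activity?"
variant above.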

Am I forgetting anything obvious?

Is this too aggressive?

Not aggressive enough?

Could PGPGOUT + PGSWPOUT be a useful
middle ground between just PGSWPOUT and
PGSTEAL_*?

-- 
All rights reversed