Re: [PATCH] mm: swap: async free swap slot cache entries

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Fri, 22 Dec 2023 11:52:08 -0800

On Thu, 21 Dec 2023 22:25:39 -0800 Chris Li <chrisl@xxxxxxxxxx> wrote:

> We discovered that 1% swap page fault is 100us+ while 50% of
> the swap fault is under 20us.
> 
> Further investigation show that a large portion of the time
> spent in the free_swap_slots() function for the long tail case.
> 
> The percpu cache of swap slots is freed in a batch of 64 entries
> inside free_swap_slots(). These cache entries are accumulated
> from previous page faults, which may not be related to the current
> process.
> 
> Doing the batch free in the page fault handler causes longer
> tail latencies and penalizes the current process.
> 
> Move free_swap_slots() outside of the swapin page fault handler into an
> async work queue to avoid such long tail latencies.

This will require a larger amount of total work than the current
scheme.  So we're trading that off against better latency.

Why is this a good tradeoff?