Re: [PATCH stable 5.4] mm: swap: properly update readahead statistics in unuse_pte_range()

Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> · Fri, 3 Feb 2023 10:19:21 +0100

On Mon, Jan 30, 2023 at 03:28:23PM +0000, Luiz Capitulino wrote:
> From: Andrea Righi <andrea.righi@xxxxxxxxxxxxx>
> 
> Commit ebc5951eea499314f6fbbde20e295f1345c67330 upstream.
> 
> [ This fixes a performance issue we're seeing in AWS instances when
>   running swapoff and using the global readahead algorithm. For a
>   particular instance configuration, I/O throughput during swapoff is
>   very low without this fix (about 15 MB/s); with this patch it
>   reaches 500 MB/s. Tested swapoff with different workloads with
>   this patch applied. 5.10 onwards already have this fix. ]
> 
> In unuse_pte_range() we blindly swap in pages without checking if the
> swap entry is already present in the swap cache.
> 
> By doing this, the hit/miss ratio used by the swap readahead heuristic
> is not properly updated and this leads to non-optimal performance during
> swapoff.
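
To make the effect described above concrete, here is a toy user-space model (not kernel code). Every name in it (`toy_swap_cache`, `toy_lookup`, `toy_readahead`, `toy_unuse_range`) is hypothetical and exists only for illustration. It assumes a hit is recorded only when a lookup finds the slot already cached, so a scan that never looks up the cache never records one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT kernel code: a tiny "swap cache" that
 * remembers which slots were already brought in by readahead. */
#define TOY_SLOTS 64

struct toy_swap_cache {
    bool present[TOY_SLOTS];   /* slot already read in by readahead */
    unsigned int hits;         /* lookups that found the slot cached */
    unsigned int reads;        /* readahead batches issued */
};

/* Look a slot up; a successful lookup is what counts as a hit here. */
static bool toy_lookup(struct toy_swap_cache *c, size_t slot)
{
    assert(slot < TOY_SLOTS);
    if (!c->present[slot])
        return false;
    c->hits++;
    return true;
}

/* Read `window` consecutive slots starting at `slot` into the cache. */
static void toy_readahead(struct toy_swap_cache *c, size_t slot, size_t window)
{
    c->reads++;
    for (size_t i = slot; i < slot + window && i < TOY_SLOTS; i++)
        c->present[i] = true;
}

/* Walk all slots in order, as a swapoff-like scan would.
 * check_cache == false models the "blind" swap-in: every slot triggers
 * a read and the hit counter never moves. */
static void toy_unuse_range(struct toy_swap_cache *c, bool check_cache,
                            size_t window)
{
    for (size_t slot = 0; slot < TOY_SLOTS; slot++) {
        if (check_cache && toy_lookup(c, slot))
            continue;           /* already cached: hit recorded, no read */
        toy_readahead(c, slot, window);
    }
}
```

In this model, scanning with `check_cache` set to false leaves `hits` at zero regardless of how much readahead was done, which is the situation the changelog describes: the heuristic sees what looks like only misses.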
> 
> Tracing the distribution of the readahead size returned by the swap
> readahead heuristic during swapoff shows that a small readahead size is
> used most of the time as if we had only misses (this happens both with
> cluster and vma readahead), for example:
> 
> r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
>         COUNT      EVENT
>         36948      $retval = 8
>         44151      $retval = 4
>         49290      $retval = 1
>         527771     $retval = 2
> 
> Checking if the swap entry is present in the swap cache, instead, allows
> the readahead statistics to be updated properly, and the heuristic then
> behaves better during swapoff, selecting a bigger readahead size:
> 
> r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
>         COUNT      EVENT
>         1618       $retval = 1
>         4960       $retval = 2
>         41315      $retval = 4
>         103521     $retval = 8
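
The window sizes in the two traces (1, 2, 4, 8) are consistent with a policy where the window grows with the number of recorded readahead hits. A minimal sketch of such a policy follows. It is a simplified model for illustration, not the kernel's swapin_nr_pages(); the name `toy_window`, its parameters, and the rounding rule are assumptions.

```c
#include <assert.h>

/* Hypothetical, simplified model of a hit-driven readahead window.
 * Not the kernel's implementation; names and constants are illustrative. */
static unsigned int toy_window(unsigned int hits, int adjacent,
                               unsigned int max_pages)
{
    unsigned int pages = hits + 2;

    assert(max_pages >= 1);

    if (hits == 0) {
        /* No hits to judge by: keep 2 pages for an adjacent offset,
         * drop to 1 page otherwise. */
        if (!adjacent)
            pages = 1;
    } else {
        /* Round up to a power of two, starting from 4. */
        unsigned int roundup = 4;
        while (roundup < pages)
            roundup <<= 1;
        pages = roundup;
    }

    /* Never exceed the configured maximum window. */
    return pages > max_pages ? max_pages : pages;
}
```

Under this model, a scan that never records hits stays at a window of 1 or 2 pages, while a scan that records a few hits per window quickly reaches the cap, for example `toy_window(3, 1, 8)` yields 8.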
> 
> In terms of swapoff performance the result is the following:
> 
> Testing environment
> ===================
> 
>  - Host:
>    CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
>    HDD: PC401 NVMe SK hynix 512GB
>    MEM: 16GB
> 
>  - Guest (kvm):
>    8GB of RAM
>    virtio block driver
>    16GB swap file on ext4 (/swapfile)
> 
> Test case
> =========
>  - allocate 85% of memory
>  - `systemctl hibernate` to force all the pages to be swapped out to the
>    swap file
>  - resume the system
>  - measure the time that swapoff takes to complete:
>    # /usr/bin/time swapoff /swapfile
> 
> Result (swapoff time)
> =====================
>                   5.6 vanilla   5.6 w/ this patch
>                   -----------   -----------------
> cluster-readahead      22.09s              12.19s
>     vma-readahead      18.20s              15.33s
> 
> Conclusion
> ==========
> 
> The specific use case this patch is addressing is to improve swapoff
> performance in cloud environments when a VM has been hibernated, resumed
> and all the memory needs to be forced back to RAM by disabling swap.
> 
> This change makes better use of the readahead heuristic during swapoff,
> which in turn speeds up the resume process of such VMs.
> 
> [andrea.righi@xxxxxxxxxxxxx: update changelog]
>   Link: http://lkml.kernel.org/r/20200418084705.GA147642@xps-13
> Signed-off-by: Andrea Righi <andrea.righi@xxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Anchal Agarwal <anchalag@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Vineeth Remanan Pillai <vpillai@xxxxxxxxxxxxxxxx>
> Cc: Kelley Nielsen <kelleynnn@xxxxxxxxx>
> Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> ---

You forwarded on a backport without signing off on it yourself. Sorry, I
can't take this as-is.  Please fix up and resend.

thanks,

greg k-h