On Mon, Jan 30, 2023 at 03:28:23PM +0000, Luiz Capitulino wrote:
> From: Andrea Righi <andrea.righi@xxxxxxxxxxxxx>
>
> Commit ebc5951eea499314f6fbbde20e295f1345c67330 upstream.
>
> [ This fixes a performance issue we're seeing in AWS instances when
>   running swapoff and using the global readahead algorithm. For a
>   particular instance configuration, without this fix I/O throughput
>   is very low during swapoff (about 15 MB/s); with this patch it
>   reaches 500 MB/s. Tested swapoff with different workloads with
>   this patch applied. 5.10 onwards already have this fix ]
>
> In unuse_pte_range() we blindly swap-in pages without checking if the
> swap entry is already present in the swap cache.
>
> By doing this, the hit/miss ratio used by the swap readahead heuristic
> is not properly updated and this leads to non-optimal performance during
> swapoff.
>
> Tracing the distribution of the readahead size returned by the swap
> readahead heuristic during swapoff shows that a small readahead size is
> used most of the time as if we had only misses (this happens both with
> cluster and vma readahead), for example:
>
> r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
>         COUNT      EVENT
>         36948      $retval = 8
>         44151      $retval = 4
>         49290      $retval = 1
>         527771     $retval = 2
>
> Checking if the swap entry is present in the swap cache, instead, allows
> us to properly update the readahead statistics, and the heuristic behaves
> in a better way during swapoff, selecting a bigger readahead size:
>
> r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
>         COUNT      EVENT
>         1618       $retval = 1
>         4960       $retval = 2
>         41315      $retval = 4
>         103521     $retval = 8
>
> In terms of swapoff performance the result is the following:
>
> Testing environment
> ===================
>
>  - Host:
>    CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
>    HDD: PC401 NVMe SK hynix 512GB
>    MEM: 16GB
>
>  - Guest (kvm):
>    8GB of RAM
>    virtio block driver
>    16GB swap file on ext4 (/swapfile)
>
> Test case
> =========
>  - allocate 85% of memory
>  - `systemctl hibernate` to force all the pages to be swapped-out to the
>    swap file
>  - resume the system
>  - measure the time that swapoff takes to complete:
>    # /usr/bin/time swapoff /swapfile
>
> Result (swapoff time)
> ======
>                     5.6 vanilla   5.6 w/ this patch
>                     -----------   -----------------
>  cluster-readahead       22.09s              12.19s
>  vma-readahead           18.20s              15.33s
>
> Conclusion
> ==========
>
> The specific use case this patch is addressing is to improve swapoff
> performance in cloud environments when a VM has been hibernated, resumed
> and all the memory needs to be forced back to RAM by disabling swap.
>
> This change allows us to better exploit the advantages of the readahead
> heuristic during swapoff, and this improvement speeds up the resume
> process of such VMs.
>
> [andrea.righi@xxxxxxxxxxxxx: update changelog]
> Link: http://lkml.kernel.org/r/20200418084705.GA147642@xps-13
> Signed-off-by: Andrea Righi <andrea.righi@xxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Anchal Agarwal <anchalag@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Vineeth Remanan Pillai <vpillai@xxxxxxxxxxxxxxxx>
> Cc: Kelley Nielsen <kelleynnn@xxxxxxxxx>
> Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> ---

You forwarded on a backport without signing off on it yourself, sorry, I
can't take this as-is.  Please fix up and resend.

thanks,

greg k-h
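
For readers following the quoted changelog, here is a toy Python model of why looking in the swap cache before swapping in changes the readahead size. It is not kernel code. The names window_size() and simulate_swapoff(), the 64-page sequential layout, and the simplified "hits + 2, rounded up to a power of two, capped" rule are all assumptions made for illustration; the real swapin_nr_pages() heuristic has additional cases that are not modeled here.

```python
from collections import Counter


def window_size(hits, max_pages=8):
    """Toy readahead window: grows with recent hits, capped at max_pages."""
    pages = hits + 2
    size = 1
    while size < pages:       # round up to a power of two
        size *= 2
    return min(size, max_pages)


def simulate_swapoff(num_pages, check_cache, max_pages=8):
    """Walk swap offsets 0..num_pages-1 in order, as a toy swapoff would.

    Returns the list of readahead window sizes that were requested.
    """
    cache = set()   # offsets already brought in by an earlier readahead
    hits = 0        # readahead hits seen since the last window was chosen
    windows = []
    for offset in range(num_pages):
        if check_cache and offset in cache:
            hits += 1          # cache lookup records the hit, no I/O needed
            continue
        win = window_size(hits, max_pages)
        hits = 0               # the counter is consumed when a window is chosen
        windows.append(win)
        cache.update(range(offset, min(offset + win, num_pages)))
    return windows


if __name__ == "__main__":
    for label, check in (("blind swap-in", False), ("cache lookup first", True)):
        sizes = Counter(simulate_swapoff(64, check_cache=check))
        print(label, dict(sorted(sizes.items())))
```

In this model the blind variant never records a hit, so every request uses the smallest window, while the lookup-first variant ramps up to the cap after a few requests. That is qualitatively the shift between the two traces in the changelog, though the absolute numbers there come from a real system and are not reproduced by this sketch.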