Re: [PATCH v2] mm: shmem: skip swapcache for swapin of synchronous swap device

Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> · Mon, 6 Jan 2025 14:29:45 +0800

On 2025/1/6 12:59, Baolin Wang wrote:


On 2025/1/6 12:07, Matthew Wilcox wrote:
On Mon, Jan 06, 2025 at 11:46:04AM +0800, Baolin Wang wrote:
On 2025/1/2 21:10, Matthew Wilcox wrote:
On Thu, Jan 02, 2025 at 04:40:17PM +0800, Baolin Wang wrote:
With fast swap devices (such as zram), swapin latency is crucial to
applications. For shmem swapin, similar to anonymous memory swapin, we
can skip the swapcache operation to improve swapin latency.

OK, but now we have more complexity.  Why can't we always skip the
swapcache on swapin?

Skipping swapcache is used to swap in shmem large folios, which avoids
splitting the large folios. Meanwhile, since the IO latency of
synchronous swap devices is relatively small, it won't cause the IO
latency amplification issue.

But for async swap devices, if we swap in the whole large folio at once,
I am afraid the IO latency can be amplified. And I remember we still
haven't reached an agreement here[1], so let's go step by step and start
with the sync swap devices first.

Regardless of whether we choose to swap-in an order-0 or a large folio,
my point is that we should always do it to the pagecache rather than the
swap cache.

IMO, this would miss the swap readahead algorithm in the swap case, 
which can benefit the order-0 swap-in. We need more work to ensure that 
skipping swapcache is helpful for all cases, which is why I'm starting 
with sync swap devices first.
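
As a rough illustration of the staged policy described above, the check can be modelled as a standalone predicate. This is a minimal userspace sketch, not kernel code: `DEMO_SWP_SYNCHRONOUS_IO`, `struct demo_swap_info` and `demo_can_skip_swapcache()` are hypothetical names introduced here, and the flag value is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's SWP_SYNCHRONOUS_IO bit; value is arbitrary. */
#define DEMO_SWP_SYNCHRONOUS_IO (1UL << 0)

/* Hypothetical, trimmed-down model of a swap device descriptor. */
struct demo_swap_info {
	unsigned long flags;
};

/*
 * Model of the policy: bypass the swapcache only when the swap-in has not
 * fallen back to order-0 and the device completes IO synchronously.
 * Everything else keeps going through the swapcache (and its readahead).
 */
static bool demo_can_skip_swapcache(const struct demo_swap_info *si,
				    bool fallback_order0)
{
	return !fallback_order0 && (si->flags & DEMO_SWP_SYNCHRONOUS_IO);
}
```

With this model, a zram-like device (flag set) skips the swapcache for a large-folio swap-in, while an SSD-like device (flag clear) or any order-0 fallback does not.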

BTW, I used the SSD swap device to test the performance of skipping 
swapcache with the following hack changes, and I found that the 
performance of order-0 sequential swap-in shows a significant regression.

Without the following changes:
1G order-0 shmem swap-in: 8056 ms

With the following changes (skip swapcache):
1G order-0 shmem swap-in: 38536 ms
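
To put the two timings in proportion, the arithmetic can be sketched as below. This is only a helper for reading the numbers, assuming "1G" means 1024 MiB; the function names are made up for this example.

```c
#include <stdio.h>

/* Ratio of the slower run to the baseline run. */
static double slowdown(double base_ms, double test_ms)
{
	return test_ms / base_ms;
}

/* Throughput in MiB/s for `mib` mebibytes swapped in over `ms` milliseconds. */
static double throughput_mib_s(double mib, double ms)
{
	return mib / (ms / 1000.0);
}

/* Print one summary line for a run. */
static void report(const char *label, double mib, double ms)
{
	printf("%s: %.0f ms, %.1f MiB/s\n", label, ms, throughput_mib_s(mib, ms));
}
```

Under that assumption, `slowdown(8056, 38536)` comes out at roughly 4.8x, and the throughput drops from about 127 MiB/s to about 27 MiB/s.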

diff --git a/mm/page_io.c b/mm/page_io.c
index 9b983de351f9..1e22dedcd584 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -620,7 +620,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
        unsigned long pflags;
        bool in_thrashing;

-       VM_BUG_ON_FOLIO(!folio_test_swapcache(folio) && !synchronous, folio);
        VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
        VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio);

diff --git a/mm/shmem.c b/mm/shmem.c
index e82ef1ef1c68..2902d3477520 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2295,7 +2295,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                        fallback_order0 = true;

                /* Skip swapcache for synchronous device. */
-               if (!fallback_order0 && data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+               if (!fallback_order0) {
                        folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
                        if (!IS_ERR(folio)) {
                                skip_swapcache = true;