On 2024/8/8 16:51, David Hildenbrand wrote:
On 08.08.24 04:36, Baolin Wang wrote:
On 2024/8/7 23:53, David Hildenbrand wrote:
On 07.08.24 09:31, Baolin Wang wrote:
Page reclaim will not scan anon LRU if no swap space, however
MADV_PAGEOUT
can still split shmem large folios even without a swap device. Thus add
swap available space validation before spliting shmem large folio to
avoid redundant split.
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---
mm/vmscan.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31d13462571e..796f65781f4f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
}
} else if (folio_test_swapbacked(folio) &&
folio_test_large(folio)) {
+
+ /*
+ * Do not split shmem folio if no swap memory
+ * available.
+ */
+ if (!total_swap_pages)
+ goto activate_locked;
+
/* Split shmem folio */
if (split_folio_to_list(folio, folio_list))
goto keep_locked;
Reminds me of
commit 9a976f0c847b67d22ed694556a3626ed92da0422
Author: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Date: Thu Mar 9 15:05:43 2023 -0800
shmem: skip page split if we're not reclaiming
In theory when info->flags & VM_LOCKED we should not be getting
shem_writepage() called so we should be verifying this with a
WARN_ON_ONCE(). Since we should not be swapping then best to
ensure we
also don't do the folio split earlier too. So just move the check
early
to avoid folio splits in case its a dubious call.
We also have a similar early bail when !total_swap_pages so just
move that
earlier to avoid the possible folio split in the same situation.
But indeed, pageout() -> writepage() is called *after* the split in the
vmscan path.
In that "noswap" context, I wonder if we also want to skip folios part
of shmem
with disabled swapping?
Yes, I think so.
But now I am wondering under which circumstances we end up calling
shmem_writepage() with a large folio. And I think the answer is the
comment of
folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
... so if shmem_writepage() handles+checks that, could we do
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a332cb80e928..7dfa3d6e8ba7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
goto
activate_locked_split;
}
}
- } else if (folio_test_swapbacked(folio) &&
- folio_test_large(folio)) {
- /* Split shmem folio */
- if (split_folio_to_list(folio, folio_list))
- goto keep_locked;
}
/*
instead?
Seems reasonable to me. But we should pass the 'folio_list' to
shmem_writepage() to list the subpages of the large folio. Let me try.
Ah, yes, good point. Alternatively, we'd have to split and try writing
all subpages in there. I wonder what to do if we fail to write some, and
if we could handle that transparently, without the folio_list.
After some investigation, I prefer to pass 'folio_list' to
shmem_writepage() via 'struct writeback_control', which could simplify
the logic a lot. Otherwise, we need to handle each subpage's
writeback/reclaim/dirty state, as well as tracking each subpage's write
result, which seems more complicated.
I made a quick change by passing 'folio_list', and it looks simple and
works as expected.
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 75196b0f894f..10100e22d5c6 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -80,6 +80,9 @@ struct writeback_control {
*/
struct swap_iocb **swap_plug;
+ /* Target list for splitting a large folio */
+ struct list_head *list;
+
/* internal fields used by the ->writepages implementation: */
struct folio_batch fbatch;
pgoff_t index;
diff --git a/mm/shmem.c b/mm/shmem.c
index 05525e9e7423..0a5a68f7d0a0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page,
struct writeback_control *wbc)
* and its shmem_writeback() needs them to be split when swapping.
*/
if (wbc->split_large_folio && folio_test_large(folio)) {
+try_split:
/* Ensure the subpages are still dirty */
folio_test_set_dirty(folio);
- if (split_huge_page(page) < 0)
+ if (split_huge_page_to_list_to_order(page, wbc->list, 0))
goto redirty;
folio = page_folio(page);
folio_clear_dirty(folio);
@@ -1540,8 +1541,12 @@ static int shmem_writepage(struct page *page,
struct writeback_control *wbc)
}
swap = folio_alloc_swap(folio);
- if (!swap.val)
+ if (!swap.val) {
+ if (nr_pages > 1)
+ goto try_split;
+
goto redirty;
+ }
/*
* Add inode to shmem_unuse()'s list of swapped-out inodes,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 277571815789..cf982cf2454f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -628,7 +628,7 @@ typedef enum {
* Calls ->writepage().
*/
static pageout_t pageout(struct folio *folio, struct address_space
*mapping,
- struct swap_iocb **plug)
+ struct swap_iocb **plug, struct list_head
*folio_list)
{
/*
* If the folio is dirty, only perform writeback if that write
@@ -676,6 +676,11 @@ static pageout_t pageout(struct folio *folio,
struct address_space *mapping,
.swap_plug = plug,
};
+ if (shmem_mapping(mapping)) {
+ wbc.list = folio_list;
+ wbc.split_large_folio =
!IS_ENABLED(CONFIG_THP_SWAP);
+ }
+
folio_set_reclaim(folio);
res = mapping->a_ops->writepage(&folio->page, &wbc);
if (res < 0)
@@ -1259,23 +1264,6 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
goto activate_locked_split;
}
}
- } else if (folio_test_swapbacked(folio) &&
- folio_test_large(folio)) {
-
- /*
- * Do not split shmem folio if no swap memory
- * available.
- */
- if (!total_swap_pages)
- goto activate_locked;
-
- /*
- * Only split shmem folio when CONFIG_THP_SWAP
- * is not enabled.
- */
- if (!IS_ENABLED(CONFIG_THP_SWAP) &&
- split_folio_to_list(folio, folio_list))
- goto keep_locked;
}
/*
@@ -1377,18 +1365,21 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
* starts and then write it out here.
*/
try_to_unmap_flush_dirty();
-try_pageout:
- switch (pageout(folio, mapping, &plug)) {
+ switch (pageout(folio, mapping, &plug,
folio_list)) {
case PAGE_KEEP:
goto keep_locked;
case PAGE_ACTIVATE:
- if (shmem_mapping(mapping) &&
folio_test_large(folio) &&
- !split_folio_to_list(folio,
folio_list)) {
+ /* Shmem can be split when writeback to
swap */
+ if ((nr_pages > 1) &&
!folio_test_large(folio)) {
+ sc->nr_scanned -= (nr_pages - 1);
nr_pages = 1;
- goto try_pageout;
}
goto activate_locked;
case PAGE_SUCCESS:
+ if ((nr_pages > 1) &&
!folio_test_large(folio)) {
+ sc->nr_scanned -= (nr_pages - 1);
+ nr_pages = 1;
+ }
stat->nr_pageout += nr_pages;
if (folio_test_writeback(folio)) {