[PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space

When swap space is exhausted, only file pages can be reclaimed, even
though some swapcache pages may still be on the anon LRU list.  This can
lead to a premature out-of-memory (OOM) condition.

The problem was found with the following steps:

 First, set up 9MB of disk swap space, then create a cgroup with a 10MB
 memory limit, then run a program that allocates about 15MB of memory.

The problem occurs only occasionally; it may take about 100 runs to
reproduce [1].

Fix it by checking the number of swapcache pages in
can_reclaim_anon_pages().  If the number is not zero, return true and set
swapcache_only to 1.  When the anon LRU list is scanned in swapcache_only
mode, non-swapcache pages are skipped instead of isolated, which makes
reclaim more efficient.
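
The decision order described above can be sketched as a small userspace
model.  This is only an illustration, not the kernel code: the model_*
types and fields are hypothetical stand-ins for the swap, demotion and
swapcache checks, and the memcg/non-memcg split is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for per-node state; not a kernel type. */
struct model_node {
	unsigned long nr_free_swap;	/* free swap slots still available */
	bool can_demote;		/* a lower memory tier can take pages */
	unsigned long nr_swapcache;	/* anon pages already in the swap cache */
};

/* Hypothetical stand-in for the one scan_control bit this patch adds. */
struct model_scan_control {
	unsigned int swapcache_only:1;
};

/*
 * Mirror the order of checks: free swap first, then demotion, and only
 * then fall back to reclaiming swapcache pages.  The flag is reset on
 * every call so that a stale value never survives.
 */
static bool model_can_reclaim_anon(const struct model_node *node,
				   struct model_scan_control *sc)
{
	if (sc)
		sc->swapcache_only = 0;

	if (node->nr_free_swap > 0)
		return true;

	if (node->can_demote)
		return true;

	if (node->nr_swapcache > 0) {
		if (sc)
			sc->swapcache_only = 1;
		return true;
	}

	return false;
}
```

In this model, swapcache_only is set only when both swap space and
demotion are unavailable, which matches the v7 to v9 changelog entries
below.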

However, in swapcache_only mode the scan count is still incremented when
a non-swapcache page is skipped.  In this mode non-swapcache pages are
numerous and swapcache pages are rare, so if the skipped pages were not
counted, the scan in isolate_lru_folios() could run long enough to cause
a hung task, as Sachin reported [2].
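
Why counting skipped pages bounds the loop can be seen in a simplified
userspace model.  This is a sketch under stated assumptions, not
isolate_lru_folios() itself: model_folio and model_isolate() are
hypothetical names, every folio is treated as a single page, and the
real function's handling of skipped folios is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-page folio; only the swapcache flag is modelled. */
struct model_folio {
	bool in_swapcache;
};

/*
 * Walk up to nr_to_scan folios.  The scan counter is bumped before the
 * swapcache filter, so a long run of non-swapcache folios still uses up
 * the scan budget and the loop ends after at most nr_to_scan steps.
 * Returns the number of folios taken; *scanned gets the number examined.
 */
static size_t model_isolate(const struct model_folio *lru, size_t nr_folios,
			    size_t nr_to_scan, bool swapcache_only,
			    size_t *scanned)
{
	size_t scan = 0, taken = 0, i = 0;

	while (scan < nr_to_scan && i < nr_folios) {
		const struct model_folio *folio = &lru[i++];

		scan++;		/* counted even if skipped below */

		if (swapcache_only && !folio->in_swapcache)
			continue;	/* skipped, but already counted */

		taken++;
	}

	*scanned = scan;
	return taken;
}
```

If the increment were moved below the filter, the loop in this model
would end only after finding nr_to_scan swapcache folios or reaching the
end of the list.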

In addition, since memory reclaim is attempted many times before OOM,
there is no need to isolate many swapcache pages in a single pass.

[1]. https://lore.kernel.org/lkml/CAJD7tkZAfgncV+KbKr36=eDzMnT=9dZOT0dpMWcurHLr6Do+GA@xxxxxxxxxxxxxx/
[2]. https://lore.kernel.org/linux-mm/CAJD7tkafz_2XAuqE8tGLPEcpLngewhUo=5US14PAtSM9tLBUQg@xxxxxxxxxxxxxx/

Signed-off-by: Liu Shixin <liushixin2@xxxxxxxxxx>
Tested-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
Reviewed-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
---
v9->v10: Use per-node swapcache counters, as suggested by Yu Zhao.
v8->v9: Move the swapcache check after can_demote() and refactor
	can_reclaim_anon_pages() a bit.
v7->v8: Reset swapcache_only at the beginning of can_reclaim_anon_pages().
v6->v7: Reset swapcache_only to zero after there are swap spaces.
v5->v6: Fix NULL pointer dereference and hung task problem reported by Sachin.

 mm/vmscan.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 506f8220c5fe..1fcc94717370 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -136,6 +136,9 @@ struct scan_control {
 	/* Always discard instead of demoting to lower tier memory */
 	unsigned int no_demotion:1;
 
+	/* Swap space is exhausted, only reclaim swapcache for anon LRU */
+	unsigned int swapcache_only:1;
+
 	/* Allocation order */
 	s8 order;
 
@@ -308,10 +311,36 @@ static bool can_demote(int nid, struct scan_control *sc)
 	return true;
 }
 
+#ifdef CONFIG_SWAP
+static bool can_reclaim_swapcache(struct mem_cgroup *memcg, int nid)
+{
+	struct pglist_data *pgdat = NODE_DATA(nid);
+	unsigned long nr_swapcache;
+
+	if (!memcg) {
+		nr_swapcache = node_page_state(pgdat, NR_SWAPCACHE);
+	} else {
+		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
+
+		nr_swapcache = lruvec_page_state_local(lruvec, NR_SWAPCACHE);
+	}
+
+	return nr_swapcache > 0;
+}
+#else
+static bool can_reclaim_swapcache(struct mem_cgroup *memcg, int nid)
+{
+	return false;
+}
+#endif
+
 static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
 					  int nid,
 					  struct scan_control *sc)
 {
+	if (sc)
+		sc->swapcache_only = 0;
+
 	if (memcg == NULL) {
 		/*
 		 * For non-memcg reclaim, is there
@@ -330,7 +359,17 @@ static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
 	 *
 	 * Can it be reclaimed from this node via demotion?
 	 */
-	return can_demote(nid, sc);
+	if (can_demote(nid, sc))
+		return true;
+
+	/* Are there any swapcache pages to reclaim on this node? */
+	if (can_reclaim_swapcache(memcg, nid)) {
+		if (sc)
+			sc->swapcache_only = 1;
+		return true;
+	}
+
+	return false;
 }
 
 /*
@@ -1642,6 +1681,15 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
 		 */
 		scan += nr_pages;
 
+		/*
+		 * Count non-swapcache folios too: swapcache folios may be
+		 * rare, and the scan would take too long if the skipped
+		 * non-swapcache folios were not counted.
+		 */
+		if (unlikely(sc->swapcache_only && !is_file_lru(lru) &&
+		    !folio_test_swapcache(folio)))
+			goto move;
+
 		if (!folio_test_lru(folio))
 			goto move;
 		if (!sc->may_unmap && folio_mapped(folio))
-- 
2.25.1




