Re: Silent hang up caused by pages being not scanned?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 13-10-15 00:25:53, Tetsuo Handa wrote:
[...]
> What is strange, the values printed by this debug printk() patch did not
> change as time went by. Thus, I think that this is not a problem of lack of
> CPU time for scanning pages. I suspect that there is a bug that nobody is
> scanning pages.
> 
> ----------
> [   66.821450] zone_reclaimable returned 1 at line 2646
> [   66.823020] (ACTIVE_FILE=26+INACTIVE_FILE=10) * 6 > PAGES_SCANNED=32
> [   66.824935] shrink_zones returned 1 at line 2706
> [   66.826392] zones_reclaimable=1 at line 2765
> [   66.827865] do_try_to_free_pages returned 1 at line 2938
> [   67.102322] __perform_reclaim returned 1 at line 2854
> [   67.103968] did_some_progress=1 at line 3301
> (...snipped...)
> [  281.439977] zone_reclaimable returned 1 at line 2646
> [  281.439977] (ACTIVE_FILE=26+INACTIVE_FILE=10) * 6 > PAGES_SCANNED=32
> [  281.439978] shrink_zones returned 1 at line 2706
> [  281.439978] zones_reclaimable=1 at line 2765
> [  281.439979] do_try_to_free_pages returned 1 at line 2938
> [  281.439979] __perform_reclaim returned 1 at line 2854
> [  281.439980] did_some_progress=1 at line 3301

This is really interesting because even with reclaimable LRUs this low
we should eventually scan them enough times to convince zone_reclaimable
to fail. PAGES_SCANNED in your logs seems to be constant, though, which
suggests somebody manages to free a page every time before we get down
to priority 0 and manage to scan something finally. This is pretty much
pathological behavior and I have hard time to imagine how would that be
possible but it clearly shows that zone_reclaimable heuristic is not
working properly.

I can see two options here. Either we teach zone_reclaimable to be less
fragile or remove zone_reclaimable from shrink_zones altogether. Both of
them are risky because we have a long history of changes in this areas
which made other subtle behavior changes but I guess that the first
option should be less fragile. What about the following patch? I am not
happy about it because the condition is rather rough and a deeper
inspection is really needed to check all the call sites but it should be
good for testing.
--- 
>From afe1c5ef4726b78f51e850ed93564b52f3c73905 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Tue, 13 Oct 2015 15:12:13 +0200
Subject: [PATCH] mm, vmscan: Make zone_reclaimable less fragile

zone_reclaimable considers a zone unreclaimable if we have scanned all
the reclaimable pages sufficient times since the last page has been
freed and that still hasn't led to an allocation success. This can,
however, lead to a livelock/trashing when a single freed page resets
PAGES_SCANNED while memory consumers are looping over small LRUs without
making any progress (e.g. remaining pages on the LRU are dirty and all
the flushers are blocked) and failing to invoke the OOM killer beause
zone_reclaimable would consider the zone reclaimable.

Tetsuo Handa has reported the following:
: [   66.821450] zone_reclaimable returned 1 at line 2646
: [   66.823020] (ACTIVE_FILE=26+INACTIVE_FILE=10) * 6 > PAGES_SCANNED=32
: [   66.824935] shrink_zones returned 1 at line 2706
: [   66.826392] zones_reclaimable=1 at line 2765
: [   66.827865] do_try_to_free_pages returned 1 at line 2938
: [   67.102322] __perform_reclaim returned 1 at line 2854
: [   67.103968] did_some_progress=1 at line 3301
: (...snipped...)
: [  281.439977] zone_reclaimable returned 1 at line 2646
: [  281.439977] (ACTIVE_FILE=26+INACTIVE_FILE=10) * 6 > PAGES_SCANNED=32
: [  281.439978] shrink_zones returned 1 at line 2706
: [  281.439978] zones_reclaimable=1 at line 2765
: [  281.439979] do_try_to_free_pages returned 1 at line 2938
: [  281.439979] __perform_reclaim returned 1 at line 2854
: [  281.439980] did_some_progress=1 at line 3301

In his case anon LRUs are not reclaimable because there is no swap enabled.

It is not clear who frees a page that regularly but it is clear that no
progress can be made but zone_reclaimable still consider the zone
reclaimable.

This patch makes zone_reclaimable less fragile by checking the number of
reclaimable pages against low watermark. It doesn't make much sense to
rely on a PAGES_SCANNED heuristic if there are not enough reclaimable
pages to get us over min watermark.

Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 mm/vmscan.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c88d74ad9304..f16266e0af70 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -209,8 +209,14 @@ static unsigned long zone_reclaimable_pages(struct zone *zone)
 
 bool zone_reclaimable(struct zone *zone)
 {
+	unsigned long reclaimable = zone_reclaimable_pages(zone);
+	unsigned long free = zone_page_state(zone, NR_FREE_PAGES);
+
+	if (reclaimable + free < min_wmark_pages(zone))
+		return false;
+
 	return zone_page_state(zone, NR_PAGES_SCANNED) <
-		zone_reclaimable_pages(zone) * 6;
+		reclaimable * 6;
 }
 
 static unsigned long get_lru_size(struct lruvec *lruvec, enum lru_list lru)
-- 
2.5.1

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]