Re: kswapd consumes 100% CPU when highest zone is small

On 3 March 2016 at 01:36, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
On Wed 02-03-16 14:20:38, Jerry Lee wrote:
> Hi,
>
> I have an x86_64 system with 2G RAM running linux-3.12.x.  While copying
> large files (e.g. 100GB), kswapd easily consumes 100% CPU until the file
> is deleted or the page cache is dropped.  Raising min_free_kbytes from
> 16384 to 65536 mitigates the symptom, but I can't get rid of the problem
> entirely.
>
> After some trial and error, I found that the highest zone is always
> unbalanced for order-0 page requests, so pgdat_balanced() continuously
> returns false and kswapd can't sleep.
>
> Here are the watermarks (min_free_kbytes = 65536) on my system:
> Node 0, zone      DMA
>   pages free     2167
>         min      138
>         low      172
>         high     207
>         scanned  0
>         spanned  4095
>         present  3996
>         managed  3974
>
> Node 0, zone    DMA32
>   pages free     215375
>         min      16226
>         low      20282
>         high     24339
>         scanned  0
>         spanned  1044480
>         present  490971
>         managed  464223
>
> Node 0, zone   Normal
>   pages free     7
>         min      18
>         low      22
>         high     27
>         scanned  0
>         spanned  1536
>         present  1536
>         managed  523

The zone Normal is just too small and that confuses the reclaim path.

>
> Besides, while kswapd is spinning, the values of the following vmstat
> entries keep increasing quickly even after I stop copying the file:
>
> pgalloc_dma 17719
> pgalloc_dma32 3262823
> slabs_scanned 937728
> kswapd_high_wmark_hit_quickly 54333233
> pageoutrun 54333235
>
> Is there anything I could do to totally get rid of the problem?

I would try to sacrifice those few megs and get rid of zone Normal
completely. AFAIR mem=4G should limit the max_pfn to 4G so DMA32 should
cover the whole memory.

I came up with a patch that seems to work well on my system.  However, I am
afraid that it breaks the rule that all zones must be balanced for order-0
requests, and it may cause other side effects.  I think the patch is just a
workaround (and a bad one), not a cure-all.

BTW, if I upgrade the RAM from 2G to 4G, the problem is gone because the
Normal zone no longer confuses the reclaim path, as you said before.

Thanks


--- a/linux-3.12.6/mm/vmscan.c
+++ b/linux-3.12.6/mm/vmscan.c
@@ -2755,6 +2755,7 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int classzone_idx)
        unsigned long managed_pages = 0;
        unsigned long balanced_pages = 0;
        int i;
+#define HWMARK_THRESHOLD 128
 
        /* Check the watermark levels */
        for (i = 0; i <= classzone_idx; i++) {
@@ -2779,7 +2780,8 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int classzone_idx)
 
                if (zone_balanced(zone, order, 0, i))
                        balanced_pages += zone->managed_pages;
-               else if (!order)
+               else if (!order &&
+                        (high_wmark_pages(zone) > HWMARK_THRESHOLD))
                        return false;
        }

 
--
Michal Hocko
SUSE Labs

