Re: too big min_free_kbytes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry for the long delay in replying. I've been out the last week and am
not properly back until tomorrow.

On Mon, Jan 24, 2011 at 04:00:34PM +0100, Andrea Arcangeli wrote:
> eOn Mon, Jan 24, 2011 at 11:56:46AM +0800, Shaohua Li wrote:
> > Hi,
> > With transparent huge page, min_free_kbytes is set too big.
> > Before:
> > Node 0, zone    DMA32
> >   pages free     1812
> >         min      1424
> >         low      1780
> >         high     2136
> >         scanned  0
> >         spanned  519168
> >         present  511496
> > 
> > After:
> > Node 0, zone    DMA32
> >   pages free     482708
> >         min      11178
> >         low      13972
> >         high     16767
> >         scanned  0
> >         spanned  519168
> >         present  511496
> > This caused different performance problems in our test. I wonder why we
> > set the value so big.
> 
> It's to enable Mel's anti-frag that keeps pageblocks with movable and
> unmovable stuff separated, same as "hugeadm
> --set-recommended-min_free_kbytes".
> 

It's not so much "make it work" as "make it work better". The effect can
be measured by recording the mm_page_alloc_extfrag event. The more times
it occurs, the worse fragmentation can get. The event also reports
whether it is severe or not.

> Now that I checked, I'm seeing quite too much free memory with only 4G
> of ram... You can see the difference with a "cp /dev/sda /dev/null" in
> background interleaving these two commands:
> 

There is more than just min_free_kbytes happening here. The high
watermark goes to 16M-ish but the amount of free memory is *way* above
that watermark. Something is causing page reclaim to be a lot more
agressive than it should be.

Is there a difference with THP enabled and disabled but leaving
min_free_kbytes alone? My preliminary theory is that 2M pages are being
requested and kswapd is being woken up when it shouldn't
(__GFP_NO_KSWAPD not specified when it should be). Unfortunately I do
not have access to source at the moment to double check.

> echo always >/sys/kernel/mm/transparent_hugepage/enabled
> echo 1000 > /proc/sys/vm/min_free_kbytes
> 
> The setting of min_free_kbytes to 67584 leads to 716MB of memory
> free. Setting to 1000 leads to 20MB free. I'm afraid losing 716MB on a
> 4G system is way excessive regardless of THP...

Agreed.

> can't we just have a
> version of anti-frag that reserves a lot fewers pageblocks?

Anti-frag doesn't really take any additional special action due to
min_free_kbytes and it shouldn't be clearing out pageblocks
aggressively like this. I think it would also be worth checking how
often the mm_vmscan_kswapd_wake and mm_vmscan_wakeup_kswapd trace events
are triggering. If mm_vmscan_wakeup_kswapd is triggering a lot, a stack
trace of the most common triggering event might give a clue as to what
is going wrong.

> Anti-frag
> is quite important to avoid slab to fragment everything. I don't think
> we can leave it like this.
> 
> For now you can workaround with the above echo 1000 > ...
> 

Agreed. I'll try find time to investigate before the week is out but
after being offline for a week, I've a lot of catching up to do.

-- 
Mel Gorman
Linux Technology Center
IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]