On Tue 24-01-17 15:49:01, Jia He wrote:
> If there is a server with an uneven numa memory layout:
> available: 7 nodes (0-6)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 6603 MB
> node 0 free: 91 MB
> node 1 cpus:
> node 1 size: 12527 MB
> node 1 free: 157 MB
> node 2 cpus:
> node 2 size: 15087 MB
> node 2 free: 189 MB
> node 3 cpus:
> node 3 size: 16111 MB
> node 3 free: 205 MB
> node 4 cpus: 8 9 10 11 12 13 14 15
> node 4 size: 24815 MB
> node 4 free: 310 MB
> node 5 cpus:
> node 5 size: 4095 MB
> node 5 free: 61 MB
> node 6 cpus:
> node 6 size: 22750 MB
> node 6 free: 283 MB
> node distances:
> node   0   1   2   3   4   5   6
>   0:  10  20  40  40  40  40  40
>   1:  20  10  40  40  40  40  40
>   2:  40  40  10  20  40  40  40
>   3:  40  40  20  10  40  40  40
>   4:  40  40  40  40  10  20  40
>   5:  40  40  40  40  20  10  40
>   6:  40  40  40  40  40  40  10
>
> In this case node 5 has less memory, and we will allocate the hugepages
> from these nodes one by one after we trigger
> echo 4000 > /proc/sys/vm/nr_hugepages
>
> Then kswapd5 will take 100% cpu for a long time. This is a livelock
> issue in kswapd. This patch set fixes it.

It would be really helpful to describe what the issue is and whether it
is specific to the configuration above. A high-level overview of the fix
and why it is the right approach would also be appreciated.

> The 3rd patch improves kswapd's bad performance significantly.

Numbers?

> Jia He (3):
>   mm/hugetlb: split alloc_fresh_huge_page_node into fast and slow path
>   mm, vmscan: limit kswapd loop if no progress is made
>   mm, vmscan: correct prepare_kswapd_sleep return value
>
>  mm/hugetlb.c |  9 +++++++++
>  mm/vmscan.c  | 28 ++++++++++++++++++++++++----
>  2 files changed, 33 insertions(+), 4 deletions(-)
>
> --
> 2.5.5

--
Michal Hocko
SUSE Labs
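
For reference, a minimal sketch of how the report above could be reproduced and
observed. It only uses the standard hugetlb interfaces, nothing added by this
series; the node number (5) and the 2 MB default hugepage size are assumptions
taken from the quoted layout and may differ on other configurations:

    # show the NUMA layout (should resemble the report above)
    numactl --hardware

    # request 4000 default-size hugepages; hugetlb spreads the
    # allocations across the allowed nodes one by one
    echo 4000 > /proc/sys/vm/nr_hugepages

    # check how many pages actually landed on the small node
    # (node 5 here, assuming a 2 MB default hugepage size)
    cat /sys/devices/system/node/node5/hugepages/hugepages-2048kB/nr_hugepages

    # look for a kswapd thread spinning at 100% cpu
    top -b -n 1 | grep kswapd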