Re: [PATCH] bpf: Try harder when allocating memory for maps

On Fri 08-03-19 09:08:57, Martynas Pumputis wrote:
> It has been observed that sometimes memory allocation for BPF maps
> fails when there is no obvious memory pressure in a system.
> 
> E.g. the map (BPF_MAP_TYPE_LRU_HASH, key=38, value=56, max_elems=524288)
> could not be created due to vmalloc being unable to allocate 75497472B,
> when the system's memory consumption (in MB) was the following:
> 
>     Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727

Hmm, 75MB is quite large, much larger than the slab/page allocator
can provide, so this is not really a fragmentation issue. Vmalloc does
respect NORETRY, but considering that there shouldn't be much memory
pressure, I wonder how NORETRY managed to fail the allocation. Do you
happen to have the allocation failure report?
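
For scale, here is a rough userspace sketch of the arithmetic behind the
size check in bpf_map_area_alloc(). It assumes 4KB pages and
PAGE_ALLOC_COSTLY_ORDER == 3; the sketch_* names are made up for
illustration and are not kernel symbols.

```c
#include <assert.h>

/* Assumptions, not taken from the report: 4 KiB pages and a costly
 * order of 3. Other configurations will give different numbers. */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_COSTLY_ORDER 3

/* Largest size that still satisfies
 * size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER). */
unsigned long sketch_costly_limit(void)
{
	return SKETCH_PAGE_SIZE << SKETCH_COSTLY_ORDER;
}

/* Number of pages needed to back a request of the given size,
 * rounding up to a whole page. */
unsigned long sketch_pages_needed(unsigned long size)
{
	assert(size > 0);
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Nonzero when the request is too large for the first branch of the
 * size check, i.e. it is not attempted as a small contiguous
 * allocation. */
int sketch_exceeds_costly_limit(unsigned long size)
{
	return size > sketch_costly_limit();
}
```

Under those assumptions the limit is 32768 bytes, while the reported
75497472B request needs 18432 pages, so it is far beyond the size check
and has to be served page by page.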

Btw. is there any real reason to open-code and duplicate the kvmalloc
logic here? In other words, why not simply make bpf_map_area_alloc use
kvmalloc_node with GFP_KERNEL?

> Considering dcda9b0471 ("mm, tree wide: replace __GFP_REPEAT by
> __GFP_RETRY_MAYFAIL with more useful semantic") we can replace
> __GFP_NORETRY with __GFP_RETRY_MAYFAIL, as it won't invoke OOM killer
> and will try harder to fulfil allocation requests.
> 
> The change has been tested with the workloads mentioned above and by
> observing oom_kill value from /proc/vmstat.
> 
> Signed-off-by: Martynas Pumputis <m@xxxxxxxxx>
> ---
>  kernel/bpf/syscall.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 62f6bced3a3c..eb5cefe44af3 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -136,11 +136,11 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr)
>  
>  void *bpf_map_area_alloc(size_t size, int numa_node)
>  {
> -	/* We definitely need __GFP_NORETRY, so OOM killer doesn't
> -	 * trigger under memory pressure as we really just want to
> -	 * fail instead.
> +	/* We definitely need __GFP_NORETRY or __GFP_RETRY_MAYFAIL, so
> +	 * OOM killer doesn't trigger under memory pressure as we really
> +	 * just want to fail instead.
>  	 */
> -	const gfp_t flags = __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO;
> +	const gfp_t flags = __GFP_NOWARN | __GFP_RETRY_MAYFAIL | __GFP_ZERO;
>  	void *area;
>  
>  	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -- 
> 2.21.0
> 

-- 
Michal Hocko
SUSE Labs


