Re: [PATCH bpf-next v4 4/7] bpf: Refill only one percpu element in memalloc

Hi,

On 12/18/2023 2:30 PM, Yonghong Song wrote:
> Typically, for a percpu map element or data structure, once allocated,
> most operations are lookups or in-place updates. Deletions are really
> rare. Currently, for a percpu data structure, 4 elements will be
> prefilled if the size is <= 256. Let us just prefill one element
> for percpu data. For example, for size 256 and 128 cpus, the
> potential saving will be 3 * 256 * 128 * 128 = 12MB.
>
> Acked-by: Hou Tao <houtao1@xxxxxxxxxx>
> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> ---
>  kernel/bpf/memalloc.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
> index 50ab2fecc005..f37998662146 100644
> --- a/kernel/bpf/memalloc.c
> +++ b/kernel/bpf/memalloc.c
> @@ -485,11 +485,16 @@ static void init_refill_work(struct bpf_mem_cache *c)
>  
>  static void prefill_mem_cache(struct bpf_mem_cache *c, int cpu)
>  {
> -	/* To avoid consuming memory assume that 1st run of bpf
> -	 * prog won't be doing more than 4 map_update_elem from
> -	 * irq disabled region
> +	int cnt = 1;
> +
> +	/* To avoid consuming memory, for non-percpu allocation, assume that
> +	 * 1st run of bpf prog won't be doing more than 4 map_update_elem from
> +	 * irq disabled region if unit size is less than or equal to 256.
> +	 * For all other cases, let us just do one allocation.
>  	 */
> -	alloc_bulk(c, c->unit_size <= 256 ? 4 : 1, cpu_to_node(cpu), false);
> +	if (!c->percpu_size && c->unit_size <= 256)
> +		cnt = 4;
> +	alloc_bulk(c, cnt, cpu_to_node(cpu), false);
>  }

Another thought about this patch. Once the prefilled element has been
allocated by an invocation of bpf_percpu_obj_new(), a refill will be
triggered, and this time it will allocate c->batch elements. For a
256-byte unit_size, c->batch will be 64, so the maximum memory
consumption on a 128-CPU host will be 64 * 256 * 128 * 128 = 256MB
when there is one bpf_percpu_obj_new() allocation on each CPU. So my
question is: should we adjust low_watermark and high_watermark
accordingly for per-cpu allocation to reduce the memory consumption?
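
To make the arithmetic above easy to check, here is a small userspace
sketch. It is not kernel code: percpu_cache_bytes() is a hypothetical
helper, and it assumes the two factors of 128 in the formulas stand for
one unit_size copy per CPU in each percpu object and one cache per CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of kernel/bpf/memalloc.c.
 * Estimates the bytes held when every per-CPU cache keeps `elems`
 * free percpu objects of `unit_size` bytes each.
 * Assumption: each percpu object has one unit_size copy per CPU,
 * and there is one cache per CPU.
 */
static uint64_t percpu_cache_bytes(uint64_t elems, uint64_t unit_size,
				   uint64_t nr_cpus)
{
	assert(nr_cpus > 0);
	return elems * unit_size
	       * nr_cpus	/* copies per percpu object */
	       * nr_cpus;	/* one cache per CPU */
}
```

With unit_size 256 and 128 CPUs, percpu_cache_bytes(3, 256, 128) gives
the 12MB saving from the commit message, and
percpu_cache_bytes(64, 256, 128) gives the 256MB figure above.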
>  
>  static int check_obj_size(struct bpf_mem_cache *c, unsigned int idx)




