Re: [PATCH bpf-next v2 3/6] bpf: Allow per unit prefill for non-fix-size percpu memory allocator

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 12/14/23 11:40 PM, Hou Tao wrote:
Hi,

On 12/15/2023 3:27 PM, Yonghong Song wrote:
On 12/14/23 10:50 PM, Yonghong Song wrote:
On 12/14/23 7:19 PM, Hou Tao wrote:
On 12/15/2023 8:12 AM, Yonghong Song wrote:
Commit 41a5db8d8161 ("Add support for non-fix-size percpu mem
allocation")
added support for non-fix-size percpu memory allocation.
Such allocation will allocate percpu memory for all buckets on all
cpus and the memory consumption is in the order to quadratic.
For example, let us say, 4 cpus, unit size 16 bytes, so each
cpu has 16 * 4 = 64 bytes, with 4 cpus, total will be 64 * 4 = 256
bytes.
Then let us say, 8 cpus with the same unit size, each cpu
has 16 * 8 = 128 bytes, with 8 cpus, total will be 128 * 8 = 1024
bytes.
So if the number of cpus doubles, the number of memory consumption
will be 4 times. So for a system with large number of cpus, the
memory consumption goes up quickly with quadratic order.
For example, for 4KB percpu allocation, 128 cpus. The total memory
consumption will 4KB * 128 * 128 = 64MB. Things will become
worse if the number of cpus is bigger (e.g., 512, 1024, etc.)
SNIP
+#ifdef CONFIG_MEMCG_KMEM
+    objcg = get_obj_cgroup_from_current();
+#endif
+    for_each_possible_cpu(cpu) {
+        cc = per_cpu_ptr(pcc, cpu);
+        c = &cc->cache[i];
+        if (cpu == 0 && c->unit_size)
+            goto out;
+
+        c->unit_size = unit_size;
+        c->objcg = objcg;
+        c->percpu_size = percpu_size;
+        c->tgt = c;
+
+        init_refill_work(c);
+        prefill_mem_cache(c, cpu);
+
+        if (cpu == 0) {
+            err = check_obj_size(c, i);
+            if (err) {
+                bpf_mem_alloc_destroy_cache(c);
It seems drain_mem_cache() will be enough. Have you considered setting
At prefill stage, looks like the following is enough:
     free_all(__llist_del_all(&c->free_llist), percpu);
But I agree that drain_mem_cache() is simpler and is
easier for future potential code change.

low_watermark as 0 to prevent potential refill in unit_alloc() if the
initialization of the current unit fails ?
I think it does make sense. For non-fix-size non-percpu prefill,
if check_obj_size() failed, the prefill will fail, which include
all buckets.

In this case, if it fails for a particular bucket, we should
make sure that bucket always return NULL ptr, so setting the
low_watermark to 0 does make sense.
Thinking again. If the initialization of the current unit
failed, the verification will fail and the corresponding
bpf program will not be able to do memory alloc, so we
should be fine.

But it is totally possible that some prog later may
call bpf_mem_alloc_percpu_unit_init() again with the
same size/bucket. So we should simply reset bpf_mem_cache
to 0 during the previous failed bpf_mem_alloc_percpu_unit_init()
call. Is it possible that check_obj_size() may initially
returns an error but sometime later something in
the kernel changed and the check_obj_size() with the
same size could return true?
Resetting bpf_mem_cache as 0 is much simpler and easier to understand
than resetting low_watermark as 0. For per-cpu allocation, the return
value of pcpu_alloc_size() is stable and I don't think it will change
like ksize() does(), so it is not possible that the previous
check_obj_size() failed, but the new check_obj_size() for the same
unit_size succeeds.

Thanks for clarification. Let me just do resetting bpf_mem_cache to 0 then.



+                goto out;
+            }
+        }
+    }
+
+out:
+    return err;
+}
+
   static void check_mem_cache(struct bpf_mem_cache *c)
   {
WARN_ON_ONCE(!llist_empty(&c->free_by_rcu_ttrace));

.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux