On 10/21/22 12:12 AM, David Vernet wrote:
On Thu, Oct 20, 2022 at 03:13:27PM -0700, Yonghong Song wrote:
Add some descriptions and examples for BPF_MAP_TYPE_CGRP_STROAGE.
Also illustate the major difference between BPF_MAP_TYPE_CGRP_STROAGE
and BPF_MAP_TYPE_CGROUP_STORAGE and recommend to use
BPF_MAP_TYPE_CGRP_STROAGE instead of BPF_MAP_TYPE_CGROUP_STORAGE
in the end.
Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
Documentation/bpf/map_cgrp_storage.rst | 104 +++++++++++++++++++++++++
1 file changed, 104 insertions(+)
create mode 100644 Documentation/bpf/map_cgrp_storage.rst
diff --git a/Documentation/bpf/map_cgrp_storage.rst b/Documentation/bpf/map_cgrp_storage.rst
new file mode 100644
index 000000000000..15691aab7fc7
--- /dev/null
+++ b/Documentation/bpf/map_cgrp_storage.rst
@@ -0,0 +1,104 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+.. Copyright (C) 2022 Meta Platforms, Inc. and affiliates.
+
+===========================
+BPF_MAP_TYPE_CGRP_STORAGE
+===========================
+
+The ``BPF_MAP_TYPE_CGRP_STORAGE`` map type represents a local fix-sized
+storage for cgroups. It is only available with ``CONFIG_CGROUP_BPF``.
+The programs are made available by the same Kconfig. The
+data for a particular cgroup can be retrieved by looking up the map
+with that cgroup.
+
+This document describes the usage and semantics of the
+``BPF_MAP_TYPE_CGRP_STORAGE`` map type.
+
+Usage
+=====
+
+The map key must be ``sizeof(int)`` representing a cgroup fd.
+To access the storage in a program, use ``bpf_cgrp_storage_get``::
+
+ void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags)
+
+``flags`` could be 0 or ``BPF_LOCAL_STORAGE_GET_F_CREATE`` which indicates that
+a new local storage will be created if one does not exist.
+
+The local storage can be removed with ``bpf_cgrp_storage_delete``::
+
+ long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgruop *cgroup)
s/cgruop/cgroup
ack.
+
+The map is available to all program types.
+
+Examples
+========
+
+An bpf-program example with BPF_MAP_TYPE_CGRP_STORAGE::
s/An bpf-program/A bpf program
ack.
+
+ #include <vmlinux.h>
+ #include <bpf/bpf_helpers.h>
+ #include <bpf/bpf_tracing.h>
+
+ struct {
+ __uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+ __uint(map_flags, BPF_F_NO_PREALLOC);
+ __type(key, int);
+ __type(value, long);
+ } cgrp_storage SEC(".maps");
+
+ SEC("tp_btf/sys_enter")
+ int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+ {
+ struct task_struct *task = bpf_get_current_task_btf();
+ long *ptr;
+
+ ptr = bpf_cgrp_storage_get(&cgrp_storage, task->cgroups->dfl_cgrp, 0,
+ BPF_LOCAL_STORAGE_GET_F_CREATE);
+ if (ptr)
+ __sync_fetch_and_add(ptr, 1);
+
+ return 0;
+ }
+
+Userspace accessing map declared above::
+
+ #include <linux/bpf.h>
+ #include <linux/libbpf.h>
+
+ __u32 map_lookup(struct bpf_map *map, int cgrp_fd)
+ {
+ __u32 *value;
+ value = bpf_map_lookup_elem(bpf_map__fd(map), &cgrp_fd);
+ if (value)
+ return *value;
+ return 0;
+ }
+
+Difference Between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE
+============================================================================
+
+The main difference between ``BPF_MAP_TYPE_CGRP_STORAGE`` and ``BPF_MAP_TYPE_CGROUP_STORAGE``
+is that ``BPF_MAP_TYPE_CGRP_STORAGE`` can be used by all program types while
+``BPF_MAP_TYPE_CGROUP_STORAGE`` is available only to cgroup program types.
Suggestion: I'd consider going into just a bit more detail about what's
meant by "cgroup program types" here. Maybe just 1-2 sentences
describing the types of programs where the deprecated cgroup storage map
type is available, and why. A program that's using the
BPF_MAP_TYPE_CGRP_STORAGE map is conceptually also a "cgroup program
type" in that it's querying, analyzing, etc cgroups, so being explicit
may help get ahead of any confusion.
Right, 'cgroup program types' is not precise either. It should 'only
available to programs attached to cgroups'. I will add a few examples
like BPF_CGROUP_INET_INGRESS and BPF_CGROUP_SOCK_OPS, etc.
Also, given that BPF_MAP_TYPE_CGROUP_STORAGE is deprecated, and is
extremely close in name to BPF_MAP_TYPE_CGRP_STORAGE, perhaps we should
start this section out by explaining that BPF_MAP_TYPE_CGROUP_STORAGE is
a deprecated map type that's now aliased to
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, and then reference it as
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED for the rest of the page. I think
it's important to highlight that the map type is deprecated, and give
some historical context here so that users understand why they should
use BPF_MAP_TYPE_CGRP_STORAGE, and why we have two such confusingly-
similarly named maps. What do you think?
Make sense. Putting deprecation in the beginning of this section
can make readers more aware of the difference from the beginning.
+There are some other differences as well.
+
+(1). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports local storage for more than one
+ cgroups while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only support one, the one attached
s/cgroups/cgroup
ack.
+ by the bpf program.
+
+(2). ``BPF_MAP_TYPE_CGROUP_STORAGE`` allocates local storage at attach time so
+ ``bpf_get_local_storage()`` always returns non-null local storage.
Suggestion: s/non-null/non-NULL. In general, I'd suggest changing null
to NULL throughout the page, but I don't feel strongly about it.
ack.
+ ``BPF_MAP_TYPE_CGRP_STORAGE`` allocates local storage at runtime so
+ it is possible that ``bpf_cgrp_storage_get()`` may return null local storage.
+ To avoid such null local storage issue, user space can do
+ ``bpf_map_update_elem()`` to pre-allocate local storage.
Should we specify that user space can preallocate by doing
bpf_map_update_elem() _before_ the program is attached? Also, another
small nit, but I think pre-allocate and de-allocate should just be
preallocate and deallocate respectively.
Yes, bpf_map_update_elem() should be done before attachment. I will add
this.
+(3). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports de-allocating local storage by bpf program
s/by bpf program/by a bpf program
ack.
+ while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only de-allocates storage during
+ prog detach time.
+
+So overall, ``BPF_MAP_TYPE_CGRP_STORAGE`` supports all ``BPF_MAP_TYPE_CGROUP_STORAGE``
+functionality and beyond. It is recommended to use ``BPF_MAP_TYPE_CGRP_STORAGE``
+instead of ``BPF_MAP_TYPE_CGROUP_STORAGE``.
As mentioned above, I think we need to go beyond stating that using
BPF_MAP_TYPE_CGRP_STORAGE is recommended, and also explicitly and loudly
call out that BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED is deprecated and
will not be supported indefinitely.
agree.
Overall though, this looks great. Thank you for writing it up.
Thanks,
David