Re: [PATCH v2 bpf-next 1/2] libbpf: add capability for resizing datasec maps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 10, 2023 at 3:33 PM JP Kobryn <inwardvessel@xxxxxxxxx> wrote:
>
> This patch updates bpf_map__set_value_size() so that if the given map is
> memory mapped, it will attempt to resize the mapped region. Initial
> contents of the mapped region are preserved. BTF is not required, but
> after the mapping is resized an attempt is made to adjust the associated
> BTF information if the following criteria is met:
>  - BTF info is present
>  - the map is a datasec
>  - the final variable in the datasec is an array
>
> ... the resulting BTF info will be updated so that the final array
> variable is associated with a new BTF array type sized to cover the
> requested size.
>
> Note that the initial resizing of the memory mapped region can succeed
> while the subsequent BTF adjustment can fail. In this case, BTF info is
> dropped from the map by clearing the key and value type.
>
> Signed-off-by: JP Kobryn <inwardvessel@xxxxxxxxx>
> ---

It's coming along pretty nicely, still few bugs and unnecessary
complications, but easy to fix.

Can you please also add a doc-comment in libbpf.h for
bpf_map__set_value_size() describing what this API is doing and
explicitly point out that once mmap-able BPF map is resized, all
previous pointers returned from bpf_map__initial_value() and BPF
skeleton pointers for corresponding data section will be invalidated
and have to be re-initialized. This is an important and subtle point,
best to call it out very explicitly.


>  tools/lib/bpf/libbpf.c | 158 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 157 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1cbacf9e71f3..50cfe2bd4ba0 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -1510,6 +1510,39 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map)
>         return map_sz;
>  }
>
> +static int bpf_map_mmap_resize(struct bpf_map *map, size_t old_sz, size_t new_sz)
> +{
> +       void *mmaped;
> +
> +       if (!map->mmaped)
> +               return -EINVAL;
> +
> +       if (old_sz == new_sz)
> +               return 0;
> +
> +       mmaped = mmap(NULL, new_sz, PROT_READ | PROT_WRITE,
> +                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +       if (mmaped == MAP_FAILED)
> +               return libbpf_err(-errno);

no need to use libbpf_err() here, it's internal function

> +
> +       /* copy pre-existing contents to new region,
> +        * using the minimum of old/new size
> +        */
> +       memcpy(mmaped, map->mmaped, min(old_sz, new_sz));
> +
> +       if (munmap(map->mmaped, old_sz)) {
> +               pr_warn("map '%s': failed to unmap\n", bpf_map__name(map));
> +               if (munmap(mmaped, new_sz))
> +                       pr_warn("map '%s': failed to unmap temp region\n",
> +                                       bpf_map__name(map));
> +               return libbpf_err(-errno);
> +       }

this seems a bit too paranoid. Let's just unconditionally
`munmap(map->mmaped, old_sz);` and that's it. Don't add warning and
definitely don't try to unmap newly mapped region. As is this code
could lead to double-free, effectively.

> +
> +       map->mmaped = mmaped;
> +
> +       return 0;
> +}
> +
>  static char *internal_map_name(struct bpf_object *obj, const char *real_name)
>  {
>         char map_name[BPF_OBJ_NAME_LEN], *p;
> @@ -9412,12 +9445,135 @@ __u32 bpf_map__value_size(const struct bpf_map *map)
>         return map->def.value_size;
>  }
>
> +static int map_btf_datasec_resize(struct bpf_map *map, __u32 size)
> +{
> +       int err;
> +       int i, vlen;
> +       struct btf *btf;
> +       const struct btf_type *array_type, *array_element_type;
> +       struct btf_type *datasec_type, *var_type;
> +       struct btf_var_secinfo *var;
> +       const struct btf_array *array;
> +       __u32 offset, nr_elements, new_array_id;
> +
> +       /* check btf existence */
> +       btf = bpf_object__btf(map->obj);
> +       if (!btf)
> +               return -ENOENT;
> +
> +       /* verify map is datasec */
> +       datasec_type = btf_type_by_id(btf, bpf_map__btf_value_type_id(map));
> +       if (!btf_is_datasec(datasec_type)) {
> +               pr_warn("map '%s': attempted to resize but map is not a datasec\n",

"map value type is not a datasec". map is not a datasec, it's value type is

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* verify datasec has at least one var */
> +       vlen = btf_vlen(datasec_type);
> +       if (vlen == 0) {
> +               pr_warn("map '%s': attempted to resize but map vlen == 0\n",

maybe "map value datasec is empty"? "vlen" is not necessarily
something that users will easily recognize and understand

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* walk to the last var in the datasec,
> +        * increasing the offset as we pass each var
> +        */
> +       var = btf_var_secinfos(datasec_type);
> +       offset = 0;
> +       for (i = 0; i < vlen - 1; i++) {
> +               offset += var->size;
> +               var++;
> +       }

it's both incorrect and overcomplicated. Just:

var = btf_var_secinfos(datasec_type)[vlen - 1];
offset = var->offset;

> +
> +       /* verify last var in the datasec is an array */
> +       var_type = btf_type_by_id(btf, var->type);
> +       array_type = skip_mods_and_typedefs(btf, var_type->type, NULL);
> +       if (!btf_is_array(array_type)) {
> +               pr_warn("map '%s': cannot be resized last var must be array\n",

"cannot be resized, last var must be an array"?

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* verify request size aligns with array */
> +       array = btf_array(array_type);
> +       array_element_type = btf_type_by_id(btf, array->type);

not enough, need to skip_mods_and_typedefs() first, etc. But probably
simpler to just use btf__resolve_size(array->type)? And don't forget
to check that we get > 0 result, otherwise we run a risk of division
by zero below

> +       if ((size - offset) % array_element_type->size != 0) {
> +               pr_warn("map '%s': attempted to resize but requested size does not align\n",
> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* create a new array based on the existing array,
> +        * but with new length
> +        */
> +       nr_elements = (size - offset) / array_element_type->size;
> +       new_array_id = btf__add_array(btf, array->index_type, array->type,
> +                       nr_elements);
> +       if (new_array_id < 0) {
> +               pr_warn("map '%s': failed to create new array\n",

this is a very unlikely error to happen, unless there is some bug (the
only legitimate reason is -ENOMEM, which we generally don't log
everywhere). So let's drop unnecessary pr_warn() here.

> +                               bpf_map__name(map));
> +               err = new_array_id;
> +               return err;
> +       }
> +
> +       /* adding a new btf type invalidates existing pointers to btf objects,
> +        * so refresh pointers before proceeding
> +        */
> +       datasec_type = btf_type_by_id(btf, map->btf_value_type_id);
> +       var = btf_var_secinfos(datasec_type);
> +       for (i = 0; i < vlen - 1; i++)
> +               var++;

as I showed above, btf_var_secinfos(datasec_type)[vlen - 1], no need
for linear search

> +       var_type = btf_type_by_id(btf, var->type);
> +
> +       /* finally update btf info */
> +       datasec_type->size = size;
> +       var->size = size - offset;
> +       var_type->type = new_array_id;
> +
> +       return 0;
> +}
> +
>  int bpf_map__set_value_size(struct bpf_map *map, __u32 size)
>  {
> +       int err;
> +       __u32 old_size;
> +
>         if (map->fd >= 0)
>                 return libbpf_err(-EBUSY);
> -       map->def.value_size = size;
> +
> +       old_size = map->def.value_size;
> +
> +       if (map->mmaped) {
> +               size_t mmap_old_sz, mmap_new_sz;
> +
> +               mmap_old_sz = bpf_map_mmap_sz(map);
> +               map->def.value_size = size;
> +               mmap_new_sz = bpf_map_mmap_sz(map);

it's ugly that we need to modify map->def.value_size just to calculate
mmap region size. Let's add a helper that would take value_size and
max_entries explicitly and return mmap size? This will make this
function simpler as well as map->def.value_size will be updated once
at the very end, regardless of map->mmaped or not.

> +
> +               err = bpf_map_mmap_resize(map, mmap_old_sz, mmap_new_sz);
> +               if (err) {
> +                       pr_warn("map '%s': failed to resize memory mapped region\n",
> +                                       bpf_map__name(map));
> +                       goto err_out;
> +               }
> +               err = map_btf_datasec_resize(map, size);
> +               if (err && err != -ENOENT) {
> +                       pr_warn("map '%s': failed to adjust btf for resized map. dropping btf info\n",

nit, either capitalize "Dropping", or (preferably) turn it into a
single sentence: "failed to adjust BTF for resized map, clearing BTF
key/value type info\n". Dropping is ambiguous and sounds more
dangerous than it really is


> +                                       bpf_map__name(map));
> +                       map->btf_value_type_id = 0;
> +                       map->btf_key_type_id = 0;
> +               }
> +       } else {
> +               map->def.value_size = size;
> +       }
> +
>         return 0;
> +
> +err_out:
> +       map->def.value_size = old_size;
> +       return libbpf_err(err);
>  }
>
>  __u32 bpf_map__btf_key_type_id(const struct bpf_map *map)
> --
> 2.40.0
>





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux