On Mon, Jul 5, 2021 at 12:08 PM Martynas Pumputis <m@xxxxxxxxx> wrote:
>
> When loading in parallel multiple programs which use the same to-be
> pinned map, it is possible that two instances of the loader will call
> bpf_object__create_maps() at the same time. If the map doesn't exist
> when both instances call bpf_object__reuse_map(), then one of the
> instances will fail with EEXIST when calling bpf_map__pin().
>
> Fix the race by retrying creating a map if bpf_map__pin() returns
> EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf:
> Fix race condition with map pinning").
>
> Cc: Joe Stringer <joe@xxxxxxxxxxx>
> Signed-off-by: Martynas Pumputis <m@xxxxxxxxx>
> ---
>  tools/lib/bpf/libbpf.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1e04ce724240..7a31c7c3cd21 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -4616,10 +4616,12 @@ bpf_object__create_maps(struct bpf_object *obj)
>  	char *cp, errmsg[STRERR_BUFSIZE];
>  	unsigned int i, j;
>  	int err;
> +	bool retried = false;

retried has to be reset for each map, so just move it inside the for
loop? you can also generalize it to retry_cnt (> 1 attempts) to allow
for more extreme cases of multiple loaders fighting very heavily

>
>  	for (i = 0; i < obj->nr_maps; i++) {
>  		map = &obj->maps[i];
>
> +retry:
>  		if (map->pin_path) {
>  			err = bpf_object__reuse_map(map);
>  			if (err) {
> @@ -4660,9 +4662,13 @@ bpf_object__create_maps(struct bpf_object *obj)
>  		if (map->pin_path && !map->pinned) {
>  			err = bpf_map__pin(map, NULL);
>  			if (err) {
> +				zclose(map->fd);
> +				if (!retried && err == EEXIST) {

so I'm also wondering... should we commit at this point to trying to
pin and not attempt to re-create the map? I'm worried that
bpf_object__create_map() is not designed and tested to be called
multiple times for the same bpf_map, but it's technically possible for
it to be called multiple times in this scenario. Check the inner map
creation scenario, for example (btw, I think there is a bug in
bpf_object__create_map clean up for inner map, care to take a look at
that as well?).

So unless we want to allow map re-creation if (in a highly unlikely
scenario) someone already unpinned the other instance, I'd say we
should just bpf_map__pin() here directly, maybe in a short loop to
allow for a few attempts.

> +					retried = true;
> +					goto retry;
> +				}
>  				pr_warn("map '%s': failed to auto-pin at '%s': %d\n",
>  					map->name, map->pin_path, err);
> -				zclose(map->fd);
>  				goto err_out;
>  			}
>  		}
> --
> 2.32.0
>
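To make the per-map retry suggestion concrete, here is a minimal standalone sketch, not libbpf code. Everything in it is hypothetical: `struct fake_map`, `fake_setup_and_pin()`, `setup_with_retries()` and `setup_all()` are stand-ins invented for illustration, and the negative-errno return convention (`-EEXIST`) is an assumption of the sketch. It only shows the retry budget being scoped to each map, so that a race on one map does not use up the retries available to the next one.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a to-be-pinned map; NOT libbpf's struct bpf_map. */
struct fake_map {
	const char *name;
	int pinned;       /* set once setup succeeds */
	int eexist_left;  /* number of attempts that will lose the race */
	int attempts;     /* how many times setup was tried */
};

/*
 * Hypothetical stand-in for one "reuse or create, then pin" attempt.
 * Assumption: failures are reported as negative errno values.
 */
static int fake_setup_and_pin(struct fake_map *map)
{
	map->attempts++;
	if (map->eexist_left > 0) {
		map->eexist_left--;
		return -EEXIST; /* another loader pinned the path first */
	}
	map->pinned = 1;
	return 0;
}

/* Retry only on -EEXIST, at most max_attempts times in total. */
static int setup_with_retries(struct fake_map *map, int max_attempts)
{
	int err = -EINVAL;

	assert(max_attempts > 0);
	for (int attempt = 0; attempt < max_attempts; attempt++) {
		err = fake_setup_and_pin(map);
		if (err != -EEXIST)
			break; /* success, or an error that retrying won't fix */
	}
	return err;
}

/* The retry budget lives inside the per-map call, so it resets for each map. */
static int setup_all(struct fake_map *maps, int nr_maps, int max_attempts)
{
	for (int i = 0; i < nr_maps; i++) {
		int err = setup_with_retries(&maps[i], max_attempts);

		if (err) {
			fprintf(stderr, "map '%s': setup failed: %d\n",
				maps[i].name, err);
			return err;
		}
	}
	return 0;
}
```

With a single function-wide `retried` flag, a second map hitting the same race would fail immediately. Here each map gets its own `max_attempts`, and the loop still gives up with `-EEXIST` once that budget is spent.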