Re: [PATCH bpf] libbpf: fix race when pinning maps in parallel


On 7/8/21 10:33 PM, Andrii Nakryiko wrote:
On Thu, Jul 8, 2021 at 8:50 AM Martynas Pumputis <m@xxxxxxxxx> wrote:



On 7/8/21 12:38 AM, Andrii Nakryiko wrote:
On Mon, Jul 5, 2021 at 12:08 PM Martynas Pumputis <m@xxxxxxxxx> wrote:

When loading in parallel multiple programs which use the same to-be
pinned map, it is possible that two instances of the loader will call
bpf_object__create_maps() at the same time. If the map doesn't exist
when both instances call bpf_object__reuse_map(), then one of the
instances will fail with EEXIST when calling bpf_map__pin().

Fix the race by retrying creating a map if bpf_map__pin() returns
EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf:
Fix race condition with map pinning").

Cc: Joe Stringer <joe@xxxxxxxxxxx>
Signed-off-by: Martynas Pumputis <m@xxxxxxxxx>
---
   tools/lib/bpf/libbpf.c | 8 +++++++-
   1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 1e04ce724240..7a31c7c3cd21 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4616,10 +4616,12 @@ bpf_object__create_maps(struct bpf_object *obj)
          char *cp, errmsg[STRERR_BUFSIZE];
          unsigned int i, j;
          int err;
+       bool retried = false;

retried has to be reset for each map, so just move it inside the for
loop? you can also generalize it to retry_cnt (> 1 attempts) to allow
for more extreme cases of multiple loaders fighting very heavily

If we move "retried = false" to inside the loop, then there is no need
for retry_cnt. Single retry for each map should be enough to resolve the
race. In any case, I'm going to move "retried = false", as you suggested.

Right, I was originally thinking about the case where already pinned
map might get unpinned. But then subsequently rejected the idea of
re-creating the map :) So single retry should do.
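
To make the per-map retry concrete, here is a minimal user-space sketch. It is not libbpf code: struct demo_map, demo_reuse(), demo_create_and_pin(), demo_create_maps() and the hook are hypothetical names, and plain files opened with O_CREAT | O_EXCL stand in for BPF pinning, assuming only that both fail with EEXIST when the path already exists. Creation and pinning are fused into one step here, which the real code does not do. The sketch resets "retried" for every map, retries once on -EEXIST, and returns -ENOENT instead of creating a second time, as proposed further down in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for struct bpf_map: a pin path and an fd. */
struct demo_map {
	const char *pin_path;
	int fd;      /* -1 until the map is reused or created */
	bool reused; /* true if we attached to an existing pin */
};

/* Hook type (an assumption of this sketch): called between the reuse
 * attempt and the pin attempt so that a competing loader can be simulated. */
typedef void (*demo_race_hook_t)(const struct demo_map *map);

/* Simulated competing loader: pins the same path first. */
static void demo_competing_loader(const struct demo_map *map)
{
	int fd = open(map->pin_path, O_WRONLY | O_CREAT, 0600);

	if (fd >= 0)
		close(fd);
}

/* Analogue of bpf_object__reuse_map(): attach to the pin if it exists. */
static int demo_reuse(struct demo_map *map)
{
	int fd = open(map->pin_path, O_RDONLY);

	if (fd < 0)
		return errno == ENOENT ? 0 : -errno; /* nothing to reuse is fine */
	map->fd = fd;
	map->reused = true;
	return 0;
}

/* Analogue of create + pin: O_EXCL fails with EEXIST if already pinned. */
static int demo_create_and_pin(struct demo_map *map)
{
	int fd = open(map->pin_path, O_WRONLY | O_CREAT | O_EXCL, 0600);

	if (fd < 0)
		return -errno;
	map->fd = fd;
	return 0;
}

static int demo_create_maps(struct demo_map *maps, size_t n,
			    demo_race_hook_t hook)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_map *map = &maps[i];
		bool retried = false; /* reset for every map */
		int err;

retry:
		err = demo_reuse(map);
		if (err)
			return err;
		if (map->fd >= 0)
			continue; /* another loader pinned it; reuse that one */
		if (retried)
			return -ENOENT; /* pin vanished again; do not create twice */
		if (hook)
			hook(map);
		err = demo_create_and_pin(map);
		if (err == -EEXIST) {
			retried = true;
			goto retry;
		}
		if (err)
			return err;
	}
	return 0;
}
```

Passing demo_competing_loader as the hook forces the losing side of the race: the first pin attempt fails with -EEXIST and the second pass picks up the existing pin. Passing NULL gives the uncontended path.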




          for (i = 0; i < obj->nr_maps; i++) {
                  map = &obj->maps[i];

+retry:
                  if (map->pin_path) {
                          err = bpf_object__reuse_map(map);
                          if (err) {
@@ -4660,9 +4662,13 @@ bpf_object__create_maps(struct bpf_object *obj)
                  if (map->pin_path && !map->pinned) {
                          err = bpf_map__pin(map, NULL);
                          if (err) {
+                               zclose(map->fd);
+                               if (!retried && err == EEXIST) {

so I'm also wondering... should we commit at this point to trying to
pin and not attempt to re-create the map? I'm worried that
bpf_object__create_map() is not designed and tested to be called
multiple times for the same bpf_map, but it's technically possible for
it to be called multiple times in this scenario. Check the inner map

Good call. I'm going to add "if (retried && map->fd < 0) { return
-ENOENT; }" after the "if (map->pinned) { err = bpf_object__reuse_map()
... }" statement. This should prevent bpf_object__create_map() from
being invoked multiple times.

creation scenario, for example (btw, I think there is a bug in
bpf_object__create_map clean up for inner map, care to take a look at
that as well?).

In the case of the inner map, it should be destroyed inside
bpf_object__create_map() after a successful BPF_MAP_CREATE. So AFAIU,
there should be no need for the cleanup. Or am I missing something?

But if outer map creation fails, we won't do
bpf_map__destroy(map->inner_map);, which is one bug. And then with
your retry logic we also don't clean up the internal state of the
bpf_map, which is another one. It would be good to add a self-test
simulating such situations (e.g., by specifying wrong key_size for
outer_map, but correct inner_map definition). Not sure how to reliably
simulate this pinning race, though.
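
As an illustration of the first point, the following sketch releases the inner map template on a single exit path, so it is freed whether or not the outer map gets created. This is not the libbpf implementation: struct demo_map, demo_map_create(), demo_destroy_inner() and demo_create_outer() are hypothetical, and demo_map_create() merely pretends that any key size other than 4 is rejected, mimicking the "wrong key_size for the outer map" test idea.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_map. */
struct demo_map {
	unsigned int key_size;
	struct demo_map *inner_map; /* inner map template, may be NULL */
	int fd;
};

/* Stand-in for BPF_MAP_CREATE (assumption of this sketch): accepts only
 * key_size == 4 and hands out dummy descriptor numbers. */
static int demo_map_create(unsigned int key_size)
{
	static int next_fd = 100;

	return key_size == 4 ? next_fd++ : -EINVAL;
}

/* Free the inner template and clear the pointer so it cannot be reused. */
static void demo_destroy_inner(struct demo_map *map)
{
	free(map->inner_map);
	map->inner_map = NULL;
}

static int demo_create_outer(struct demo_map *map)
{
	int err = 0;

	map->fd = demo_map_create(map->key_size);
	if (map->fd < 0)
		err = map->fd;

	/* Single exit path: runs on both success and failure. */
	demo_destroy_inner(map);
	return err;
}
```

With an early "return err" on the failure branch, the template would stay allocated; routing both outcomes through the same cleanup avoids that.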

Regarding the second case (i.e., not cleaning up the internal state), I think no additional cleanup is needed with this patch [1] (the main difference from the previous version is that we call bpf_object__create_map() only once).

The relevant calls are the following:

- bpf_object__create_map(): map->inner_map is destroyed anyway after a successful call, map->fd is closed if pinning fails.
- bpf_object__populate_internal_map(): created map elements will be destroyed upon close(map->fd).
- init_map_slots(): slots are freed after their initialization.

[1]: https://gist.github.com/brb/fff66e47586373fdc1fe39b88175036c


Can you please add at least the first test case?



So unless we want to allow map re-creation if (in a highly unlikely
scenario) someone already unpinned the other instance, I'd say we
should just bpf_map__pin() here directly, maybe in a short loop to
allow for a few attempts.

+                                       retried = true;
+                                       goto retry;
+                               }
                                  pr_warn("map '%s': failed to auto-pin at '%s': %d\n",
                                          map->name, map->pin_path, err);
-                               zclose(map->fd);
                                  goto err_out;
                          }
                  }
--
2.32.0



