[PATCH bpf] bpf: improve htab_map_get_next_key behaviour during races

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



To iterate a BPF map, userspace must use MAP_GET_NEXT_KEY and provide
the last retrieved key. The code then scans the hash table bucket
for the key and returns the key of the next item.

This presents a problem if the last retrieved key isn't present in the
hash table anymore, e.g. due to concurrent deletion. It's not possible
to ascertain the location of a key in a given bucket, so there isn't
really a correct answer. The implementation currently returns the
first key in the first bucket. This guarantees that we never skip an
existing key. However, it means that a user space program iterating
a heavily modified map may never reach the end of the hash table,
forever restarting at the beginning.

Fixing this outright is rather involved. However, we can improve slightly
by never revisiting earlier buckets. Instead of the first key in the
first bucket we return the first key in the "current" bucket. This
doesn't eliminate the problem, but makes it less likely to occur.

Prior to commit 8fe45924387b ("bpf: map_get_next_key to return first key on NULL")
passing a non-existent key to MAP_GET_NEXT_KEY was the only way to
find the first key. Hence there is a small chance that there is code that
will be broken by this change.

Fixes: 8fe45924387b ("bpf: map_get_next_key to return first key on NULL")
Signed-off-by: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
---
 kernel/bpf/hashtab.c                    | 4 +++-
 tools/testing/selftests/bpf/test_maps.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c9..30f0dab488f0 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -613,6 +613,9 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 
 	head = select_bucket(htab, hash);
 
+	/* don't iterate previous buckets */
+	i = hash & (htab->n_buckets - 1);
+
 	/* lookup the key */
 	l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets);
 
@@ -630,7 +633,6 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 	}
 
 	/* no more elements in this hash list, go to the next bucket */
-	i = hash & (htab->n_buckets - 1);
 	i++;
 
 find_first_elem:
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 806b298397d3..6f351e532ddc 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -100,7 +100,7 @@ static void test_hashmap(unsigned int task, void *data)
 	assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 &&
 	       (first_key == 1 || first_key == 2));
 	assert(bpf_map_get_next_key(fd, &key, &next_key) == 0 &&
-	       (next_key == first_key));
+	       (next_key == 1 || next_key == 2));
 	assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 &&
 	       (next_key == 1 || next_key == 2) &&
 	       (next_key != first_key));
-- 
2.20.1




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux