Re: [PATCH bpf v2] bpf: fix potential race in tail call compatibility check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/25/21 4:28 PM, Lorenzo Bianconi wrote:
Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.

The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:

         map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
         pid = fork();
         if (pid) {
                 key = 0;
                 value = xdp_fd;
         } else {
                 key = 1;
                 value = tc_fd;
         }
         err = bpf_map_update_elem(map_fd, &key, &value, 0);

While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by protecting the update with a
spinlock. The commit in the Fixes tag is the last commit that touches the
code in question.

v2:
- Use a spinlock instead of an atomic variable and cmpxchg() (Alexei)

Fixes: 3324b584b6f6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx>
Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
---
  include/linux/bpf.h   |  1 +
  kernel/bpf/arraymap.c |  1 +
  kernel/bpf/core.c     | 14 ++++++++++----
  kernel/bpf/syscall.c  |  2 ++
  4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 020a7d5bf470..98d906176d89 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -929,6 +929,7 @@ struct bpf_array_aux {
  	 * stored in the map to make sure that all callers and callees have
  	 * the same prog type and JITed flag.
  	 */
+	spinlock_t type_check_lock;

I was wondering if we can use a mutex instead of a spinlock here since it is
run from a syscall AFAIU. The only downside is mutex_lock is run inside
aux->used_maps_mutex critical section. Am I missing something?

Hm, potentially it could work, but then it's also 32 vs 4 extra bytes. There's
also poke_mutex or freeze_mutex, but feels to hacky to 'generalize for reuse',
so I think the spinlock in bpf_array_aux is fine.

  	enum bpf_prog_type type;
  	bool jited;
  	/* Programs with direct jumps into programs part of this array. */
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index cebd4fb06d19..da9b1e96cadc 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -1072,6 +1072,7 @@ static struct bpf_map *prog_array_map_alloc(union bpf_attr *attr)
  	INIT_WORK(&aux->work, prog_array_map_clear_deferred);
  	INIT_LIST_HEAD(&aux->poke_progs);
  	mutex_init(&aux->poke_mutex);
+	spin_lock_init(&aux->type_check_lock);

Just as a tiny nit, I would probably name it slightly different, since type_check_lock
mainly refers to the type property but there's also jit vs non-jit and as pointed out
there could be other extensions that need checking in future as well. Maybe 'compat_lock'
would be a more generic one or just:

        struct {
                enum bpf_prog_type type;
                bool jited;
                spinlock_t lock;
        } owner;

  	map = array_map_alloc(attr);
  	if (IS_ERR(map)) {
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c1e7eb3f1876..9439c839d279 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1823,20 +1823,26 @@ static unsigned int __bpf_prog_ret0_warn(const void *ctx,
  bool bpf_prog_array_compatible(struct bpf_array *array,
  			       const struct bpf_prog *fp)
  {
+	bool ret;
+
  	if (fp->kprobe_override)
  		return false;
+ spin_lock(&array->aux->type_check_lock);
+
  	if (!array->aux->type) {
  		/* There's no owner yet where we could check for
  		 * compatibility.
  		 */
  		array->aux->type  = fp->type;
  		array->aux->jited = fp->jited;
-		return true;
+		ret = true;
+	} else {
+		ret = array->aux->type  == fp->type &&
+		      array->aux->jited == fp->jited;
  	}
-
-	return array->aux->type  == fp->type &&
-	       array->aux->jited == fp->jited;
+	spin_unlock(&array->aux->type_check_lock);
+	return ret;
  }
static int bpf_check_tail_call(const struct bpf_prog *fp)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e50c0bfdb7d..955011c7df29 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -543,8 +543,10 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
if (map->map_type == BPF_MAP_TYPE_PROG_ARRAY) {
  		array = container_of(map, struct bpf_array, map);
+		spin_lock(&array->aux->type_check_lock);
  		type  = array->aux->type;
  		jited = array->aux->jited;
+		spin_unlock(&array->aux->type_check_lock);
  	}
seq_printf(m,
--
2.33.0





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux