> On 10/25/21 4:28 PM, Lorenzo Bianconi wrote: > > > Lorenzo noticed that the code testing for program type compatibility of > > > tail call maps is potentially racy in that two threads could encounter a > > > map with an unset type simultaneously and both return true even though they > > > are inserting incompatible programs. > > > > > > The race window is quite small, but artificially enlarging it by adding a > > > usleep_range() inside the check in bpf_prog_array_compatible() makes it > > > trivial to trigger from userspace with a program that does, essentially: > > > > > > map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0); > > > pid = fork(); > > > if (pid) { > > > key = 0; > > > value = xdp_fd; > > > } else { > > > key = 1; > > > value = tc_fd; > > > } > > > err = bpf_map_update_elem(map_fd, &key, &value, 0); > > > > > > While the race window is small, it has potentially serious ramifications in > > > that triggering it would allow a BPF program to tail call to a program of a > > > different type. So let's get rid of it by protecting the update with a > > > spinlock. The commit in the Fixes tag is the last commit that touches the > > > code in question. > > > > > > v2: > > > - Use a spinlock instead of an atomic variable and cmpxchg() (Alexei) > > > > > > Fixes: 3324b584b6f6 ("ebpf: misc core cleanup") > > > Reported-by: Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> > > > Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > > > --- > > > include/linux/bpf.h | 1 + > > > kernel/bpf/arraymap.c | 1 + > > > kernel/bpf/core.c | 14 ++++++++++---- > > > kernel/bpf/syscall.c | 2 ++ > > > 4 files changed, 14 insertions(+), 4 deletions(-) > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > index 020a7d5bf470..98d906176d89 100644 > > > --- a/include/linux/bpf.h > > > +++ b/include/linux/bpf.h > > > @@ -929,6 +929,7 @@ struct bpf_array_aux { > > > * stored in the map to make sure that all callers and callees have > > > * the same prog type and JITed flag. > > > */ > > > + spinlock_t type_check_lock; > > > > I was wondering if we can use a mutex instead of a spinlock here since it is > > run from a syscall AFAIU. The only downside is mutex_lock is run inside > > aux->used_maps_mutex critical section. Am I missing something? > > Hm, potentially it could work, but then it's also 32 vs 4 extra bytes. There's > also poke_mutex or freeze_mutex, but feels to hacky to 'generalize for reuse', > so I think the spinlock in bpf_array_aux is fine. I was wondering if in the future we would need to protect something not supported by a spinlock but it is probably not the case. I am fine with the spinlock :) Regards, Lorenzo > > > > enum bpf_prog_type type; > > > bool jited; > > > /* Programs with direct jumps into programs part of this array. */ > > > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c > > > index cebd4fb06d19..da9b1e96cadc 100644 > > > --- a/kernel/bpf/arraymap.c > > > +++ b/kernel/bpf/arraymap.c > > > @@ -1072,6 +1072,7 @@ static struct bpf_map *prog_array_map_alloc(union bpf_attr *attr) > > > INIT_WORK(&aux->work, prog_array_map_clear_deferred); > > > INIT_LIST_HEAD(&aux->poke_progs); > > > mutex_init(&aux->poke_mutex); > > > + spin_lock_init(&aux->type_check_lock); > > Just as a tiny nit, I would probably name it slightly different, since type_check_lock > mainly refers to the type property but there's also jit vs non-jit and as pointed out > there could be other extensions that need checking in future as well. Maybe 'compat_lock' > would be a more generic one or just: > > struct { > enum bpf_prog_type type; > bool jited; > spinlock_t lock; > } owner; > > > > map = array_map_alloc(attr); > > > if (IS_ERR(map)) { > > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c > > > index c1e7eb3f1876..9439c839d279 100644 > > > --- a/kernel/bpf/core.c > > > +++ b/kernel/bpf/core.c > > > @@ -1823,20 +1823,26 @@ static unsigned int __bpf_prog_ret0_warn(const void *ctx, > > > bool bpf_prog_array_compatible(struct bpf_array *array, > > > const struct bpf_prog *fp) > > > { > > > + bool ret; > > > + > > > if (fp->kprobe_override) > > > return false; > > > + spin_lock(&array->aux->type_check_lock); > > > + > > > if (!array->aux->type) { > > > /* There's no owner yet where we could check for > > > * compatibility. > > > */ > > > array->aux->type = fp->type; > > > array->aux->jited = fp->jited; > > > - return true; > > > + ret = true; > > > + } else { > > > + ret = array->aux->type == fp->type && > > > + array->aux->jited == fp->jited; > > > } > > > - > > > - return array->aux->type == fp->type && > > > - array->aux->jited == fp->jited; > > > + spin_unlock(&array->aux->type_check_lock); > > > + return ret; > > > } > > > static int bpf_check_tail_call(const struct bpf_prog *fp) > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > > > index 4e50c0bfdb7d..955011c7df29 100644 > > > --- a/kernel/bpf/syscall.c > > > +++ b/kernel/bpf/syscall.c > > > @@ -543,8 +543,10 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) > > > if (map->map_type == BPF_MAP_TYPE_PROG_ARRAY) { > > > array = container_of(map, struct bpf_array, map); > > > + spin_lock(&array->aux->type_check_lock); > > > type = array->aux->type; > > > jited = array->aux->jited; > > > + spin_unlock(&array->aux->type_check_lock); > > > } > > > seq_printf(m, > > > -- > > > 2.33.0 > > > >
Attachment:
signature.asc
Description: PGP signature