On Wed, Jun 2, 2021 at 3:34 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Wed, May 26, 2021 at 9:03 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > Add appropriate safety checks for bpf_timer:
> > - restrict to array, hash, lru. per-cpu maps cannot be supported.
> > - kfree bpf_timer during map_delete_elem and map_free.
> > - verifier btf checks.
> > - safe interaction with lookup/update/delete operations and iterator.
> > - relax the first field only requirement of the previous patch.
> > - allow bpf_timer in global data and search for it in datasec.

I'll mention it here for completeness. I don't think safety
implications are worth it to support timer or spinlock in
memory-mapped maps. It's way too easy to abuse it (or even
accidentally corrupt kernel state). Sure it's nice, but doing an
explicit single-element map for "global" timer is just fine. And it
generalizes nicely to having 2, 3, ..., N timers.

> > - check prog_rdonly, frozen flags.
> > - mmap is allowed. otherwise global timer is not possible.
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > ---
> >  include/linux/bpf.h        | 36 +++++++++++++-----
> >  include/linux/btf.h        |  1 +
> >  kernel/bpf/arraymap.c      |  7 ++++
> >  kernel/bpf/btf.c           | 77 +++++++++++++++++++++++++++++++-------
> >  kernel/bpf/hashtab.c       | 53 ++++++++++++++++++++------
> >  kernel/bpf/helpers.c       |  2 +-
> >  kernel/bpf/local_storage.c |  4 +-
> >  kernel/bpf/syscall.c       | 23 ++++++++++--
> >  kernel/bpf/verifier.c      | 30 +++++++++++++--
> >  9 files changed, 190 insertions(+), 43 deletions(-)
> >
>
> [...]
>
> >  /* copy everything but bpf_spin_lock */
> >  static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
> >  {
> > +       u32 off = 0, size = 0;
> > +
> >         if (unlikely(map_value_has_spin_lock(map))) {
> > -               u32 off = map->spin_lock_off;
> > +               off = map->spin_lock_off;
> > +               size = sizeof(struct bpf_spin_lock);
> > +       } else if (unlikely(map_value_has_timer(map))) {
> > +               off = map->timer_off;
> > +               size = sizeof(struct bpf_timer);
> > +       }
>
> so the need to handle 0, 1, or 2 gaps seems to be the only reason to
> disallow both bpf_spinlock and bpf_timer in one map element, right?
> Isn't it worth addressing it from the very beginning to lift the
> artificial restriction? E.g., for speed, you'd do:
>
> if (likely(neither spinlock nor timer)) {
>     /* fastest pass */
> } else if (only one of spinlock or timer) {
>     /* do what you do here */
> } else {
>     int off1, off2, sz1, sz2;
>
>     if (spinlock_off < timer_off) {
>         off1 = spinlock_off;
>         sz1 = spinlock_sz;
>         off2 = timer_off;
>         sz2 = timer_sz;
>     } else {
>         ... you get the idea
>     }
>
>     memcpy(0, off1);
>     memcpy(off1+sz1, off2);
>     memcpy(off2+sz2, total_sz);
> }
>
> It's not that bad, right?
>
> >
> > +       if (unlikely(size)) {
> >                 memcpy(dst, src, off);
> > -               memcpy(dst + off + sizeof(struct bpf_spin_lock),
> > -                      src + off + sizeof(struct bpf_spin_lock),
> > -                      map->value_size - off - sizeof(struct bpf_spin_lock));
> > +               memcpy(dst + off + size,
> > +                      src + off + size,
> > +                      map->value_size - off - size);
> >         } else {
> >                 memcpy(dst, src, map->value_size);
> >         }
>
> [...]
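
To make the 0/1/2-gap idea from the pseudocode above concrete, here is a
minimal userspace sketch. It is not kernel code: struct map_layout,
copy_range() and copy_map_value_sketch() are hypothetical names made up
for illustration, a negative offset is assumed to mean "field absent",
and the two fields are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant struct bpf_map fields.
 * A negative offset means "this field is not in the value". */
struct map_layout {
	size_t value_size;
	int spin_lock_off;
	size_t spin_lock_sz;
	int timer_off;
	size_t timer_sz;
};

/* Copy bytes [from, to) of src into the same positions of dst. */
static void copy_range(void *dst, const void *src, size_t from, size_t to)
{
	assert(from <= to); /* also catches overlapping fields */
	memcpy((char *)dst + from, (const char *)src + from, to - from);
}

/* Copy a map value while skipping the spin lock and/or timer bytes. */
static void copy_map_value_sketch(const struct map_layout *m,
				  void *dst, const void *src)
{
	int has_lock = m->spin_lock_off >= 0;
	int has_timer = m->timer_off >= 0;
	size_t off1, sz1, off2, sz2;

	if (!has_lock && !has_timer) {
		/* fast path: no gaps */
		memcpy(dst, src, m->value_size);
		return;
	}
	if (has_lock != has_timer) {
		/* exactly one gap, same shape as the patch */
		off1 = has_lock ? (size_t)m->spin_lock_off : (size_t)m->timer_off;
		sz1 = has_lock ? m->spin_lock_sz : m->timer_sz;
		copy_range(dst, src, 0, off1);
		copy_range(dst, src, off1 + sz1, m->value_size);
		return;
	}
	/* two gaps: order them by offset first */
	if (m->spin_lock_off < m->timer_off) {
		off1 = (size_t)m->spin_lock_off;
		sz1 = m->spin_lock_sz;
		off2 = (size_t)m->timer_off;
		sz2 = m->timer_sz;
	} else {
		off1 = (size_t)m->timer_off;
		sz1 = m->timer_sz;
		off2 = (size_t)m->spin_lock_off;
		sz2 = m->spin_lock_sz;
	}
	copy_range(dst, src, 0, off1);
	copy_range(dst, src, off1 + sz1, off2);
	copy_range(dst, src, off2 + sz2, m->value_size);
}
```

The two-gap branch costs three memcpy() calls plus one comparison, and
the common no-gap case stays a single memcpy().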
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f386f85aee5c..0a828dc4968e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3241,6 +3241,15 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
> >                         return -EACCES;
> >                 }
> >         }
> > +       if (map_value_has_timer(map)) {
> > +               u32 t = map->timer_off;
> > +
> > +               if (reg->smin_value + off < t + sizeof(struct bpf_timer) &&
>
> <= ? Otherwise we allow accessing the first byte, unless I'm mistaken.
>
> > +                    t < reg->umax_value + off + size) {
> > +                       verbose(env, "bpf_timer cannot be accessed directly by load/store\n");
> > +                       return -EACCES;
> > +               }
> > +       }
> >         return err;
> >  }
> >
> > @@ -4675,9 +4684,24 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
> >                         map->name);
> >                 return -EINVAL;
> >         }
> > -       if (val) {
> > -               /* todo: relax this requirement */
> > -               verbose(env, "bpf_timer field can only be first in the map value element\n");
>
> ok, this was confusing, but now I see why you did that...
>
> > +       if (!map_value_has_timer(map)) {
> > +               if (map->timer_off == -E2BIG)
> > +                       verbose(env,
> > +                               "map '%s' has more than one 'struct bpf_timer'\n",
> > +                               map->name);
> > +               else if (map->timer_off == -ENOENT)
> > +                       verbose(env,
> > +                               "map '%s' doesn't have 'struct bpf_timer'\n",
> > +                               map->name);
> > +               else
> > +                       verbose(env,
> > +                               "map '%s' is not a struct type or bpf_timer is mangled\n",
> > +                               map->name);
> > +               return -EINVAL;
> > +       }
> > +       if (map->timer_off != val + reg->off) {
> > +               verbose(env, "off %lld doesn't point to 'struct bpf_timer' that is at %d\n",
> > +                       val + reg->off, map->timer_off);
> >                 return -EINVAL;
> >         }
> >         WARN_ON(meta->map_ptr);
> > --
> > 2.30.2
> >
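
On the `<` vs `<=` question in the check_map_access hunk above, it may
help to write the condition out as a plain half-open interval overlap
test. The sketch below is userspace-only; range_hits_field() and its
parameters are hypothetical names, with min_start standing in for
reg->smin_value + off and max_end for reg->umax_value + off + size. If
I'm reading the hunk right, the strict `<` already rejects an access to
the first byte of the timer, and `<=` would additionally reject an
access that starts at the first byte past it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the access range [min_start, max_end) shares at
 * least one byte with the field range [field_off, field_off + field_sz).
 * Both ranges are half-open, so touching end-to-start is not overlap. */
static bool range_hits_field(int64_t min_start, int64_t max_end,
			     int64_t field_off, int64_t field_sz)
{
	assert(min_start <= max_end && field_sz >= 0);
	return min_start < field_off + field_sz && field_off < max_end;
}
```

With a 16-byte field at offset 8, one-byte accesses at offsets 8 and 23
overlap it, while one-byte accesses at offsets 7 and 24 do not.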