On Mon, May 13, 2019 at 4:20 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: > > Add a callback map_lookup_elem_sys_only() that map implementations > could use over map_lookup_elem() from system call side in case the > map implementation needs to handle the latter differently than from > the BPF data path. If map_lookup_elem_sys_only() is set, this will > be preferred pick for map lookups out of user space. This hook is This is kind of surprising behavior w/ preferred vs default lookup code path. Why the desired behavior can't be achieved with an extra flag, similar to BPF_F_LOCK? It seems like it will be more explicit, more extensible and more generic approach, avoiding duplication of lookup semantics. E.g., for LRU map, with flag on lookup, one can decide whether lookup from inside BPF program (not just from syscall side!) should modify LRU ordering or not, simply by specifying extra flag. Am I missing some complication that prevents us from doing it that way? > used in a follow-up fix for LRU map, but once development window > opens, we can convert other map types from map_lookup_elem() (here, > the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use > the callback to simplify and clean up the latter. > > Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx> > Acked-by: Martin KaFai Lau <kafai@xxxxxx> > --- > include/linux/bpf.h | 1 + > kernel/bpf/syscall.c | 5 ++++- > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index 59631dd..4fb3aa2 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -36,6 +36,7 @@ struct bpf_map_ops { > void (*map_free)(struct bpf_map *map); > int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key); > void (*map_release_uref)(struct bpf_map *map); > + void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key); > > /* funcs callable from userspace and from eBPF programs */ > void *(*map_lookup_elem)(struct bpf_map *map, void *key); > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index ad3ccf8..cb5440b 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -808,7 +808,10 @@ static int map_lookup_elem(union bpf_attr *attr) > err = map->ops->map_peek_elem(map, value); > } else { > rcu_read_lock(); > - ptr = map->ops->map_lookup_elem(map, key); > + if (map->ops->map_lookup_elem_sys_only) > + ptr = map->ops->map_lookup_elem_sys_only(map, key); > + else > + ptr = map->ops->map_lookup_elem(map, key); > if (IS_ERR(ptr)) { > err = PTR_ERR(ptr); > } else if (!ptr) { > -- > 2.9.5 >