On Tue, Feb 14, 2023 at 3:12 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote: > > MSan runs into a few false positives in libbpf. They all come from the > fact that MSan does not know anything about the bpf syscall, > particularly, what it writes to. > > Add __libbpf_mark_mem_written() function to mark memory modified by the > bpf syscall, and a few convenience wrappers. Use the abstract name (it > could be e.g. libbpf_msan_unpoison()), because it can be used for > Valgrind in the future as well. > > Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > --- > tools/lib/bpf/bpf.c | 161 ++++++++++++++++++++++++++++++-- > tools/lib/bpf/btf.c | 1 + > tools/lib/bpf/libbpf.c | 1 + > tools/lib/bpf/libbpf_internal.h | 38 ++++++++ > 4 files changed, 194 insertions(+), 7 deletions(-) > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c > index b562019271fe..8440d38c781c 100644 > --- a/tools/lib/bpf/bpf.c > +++ b/tools/lib/bpf/bpf.c > @@ -69,6 +69,11 @@ static inline __u64 ptr_to_u64(const void *ptr) > return (__u64) (unsigned long) ptr; > } > > +static inline void *u64_to_ptr(__u64 val) > +{ > + return (void *) (unsigned long) val; > +} > + > static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, > unsigned int size) > { > @@ -92,6 +97,8 @@ int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int attempts) > fd = sys_bpf_fd(BPF_PROG_LOAD, attr, size); > } while (fd < 0 && errno == EAGAIN && --attempts > 0); > > + libbpf_mark_mem_written(u64_to_ptr(attr->log_buf), attr->log_size); > + > return fd; > } > > @@ -395,6 +402,26 @@ int bpf_map_update_elem(int fd, const void *key, const void *value, > return libbpf_err_errno(ret); > } > > +/* Tell memory checkers that the given value of the given map is initialized. */ > +static void libbpf_mark_map_value_written(int fd, void *value) nit: fd -> map_fd > +{ > +#ifdef HAVE_LIBBPF_MARK_MEM_WRITTEN > + struct bpf_map_info info; > + __u32 info_len; > + size_t size; > + int err; > + > + info_len = sizeof(info); nit: just initialize variable inline? > + err = bpf_map_get_info_by_fd(fd, &info, &info_len); > + if (!err) { > + size = info.value_size; > + if (is_percpu_bpf_map_type(info.type)) > + size = roundup(size, 8) * libbpf_num_possible_cpus(); > + libbpf_mark_mem_written(value, size); > + } > +#endif > +} > + > int bpf_map_lookup_elem(int fd, const void *key, void *value) > { > const size_t attr_sz = offsetofend(union bpf_attr, flags); > @@ -407,6 +434,8 @@ int bpf_map_lookup_elem(int fd, const void *key, void *value) > attr.value = ptr_to_u64(value); > > ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, attr_sz); > + if (!ret) > + libbpf_mark_map_value_written(fd, value); > return libbpf_err_errno(ret); ok, so libbpf_err_errno() relies on errno to be correct for last bpf() syscall, but libbpf_mark_map_value_written() can clobber it. We need to make libbpf_mark_map_value_written() "transparent" as far as errno goes. Let's store and restore it internally. Also, instead of adding all these `if (!ret)`, let's pass ret into libbpf_mark_map_value_written() and do that check internally? Basically, let's make all these MSan-related "annotations" as invisible in the code as possible. > } > > @@ -423,6 +452,8 @@ int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags) > attr.flags = flags; > > ret = sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, attr_sz); > + if (!ret) > + libbpf_mark_map_value_written(fd, value); > return libbpf_err_errno(ret); > } > [...] > > +/* Helper macros for telling memory checkers that an array pointed to by > + * a struct bpf_{btf,link,map,prog}_info member is initialized. Before doing > + * that, they make sure that kernel has provided the respective member. > + */ > + > +/* Handle arrays with a certain element size. */ > +#define __MARK_INFO_ARRAY_WRITTEN(ptr, nr, elem_size) do { \ > + if (info_len >= offsetofend(typeof(*info), ptr) && \ > + info_len >= offsetofend(typeof(*info), nr) && \ > + info->ptr) \ > + libbpf_mark_mem_written(u64_to_ptr(info->ptr), \ > + info->nr * elem_size); \ > +} while (0) > + > +/* Handle arrays with a certain element type. */ > +#define MARK_INFO_ARRAY_WRITTEN(ptr, nr, type) \ > + __MARK_INFO_ARRAY_WRITTEN(ptr, nr, sizeof(type)) > + > +/* Handle arrays with element size defined by a struct member. */ > +#define MARK_INFO_REC_ARRAY_WRITTEN(ptr, nr, rec_size) do { \ > + if (info_len >= offsetofend(typeof(*info), rec_size)) \ > + __MARK_INFO_ARRAY_WRITTEN(ptr, nr, info->rec_size); \ > +} while (0) > + > +/* Handle null-terminated strings. */ > +#define MARK_INFO_STR_WRITTEN(ptr, nr) do { \ > + if (info_len >= offsetofend(typeof(*info), ptr) && \ > + info_len >= offsetofend(typeof(*info), nr) && \ > + info->ptr) \ > + libbpf_mark_mem_written(u64_to_ptr(info->ptr), \ > + info->nr + 1); \ > +} while (0) > + > +/* Helper functions for telling memory checkers that arrays pointed to by > + * bpf_{btf,link,map,prog}_info members are initialized. > + */ > + > +static void mark_prog_info_written(struct bpf_prog_info *info, __u32 info_len) > +{ > + MARK_INFO_ARRAY_WRITTEN(map_ids, nr_map_ids, __u32); > + MARK_INFO_ARRAY_WRITTEN(jited_ksyms, nr_jited_ksyms, __u64); > + MARK_INFO_ARRAY_WRITTEN(jited_func_lens, nr_jited_func_lens, __u32); > + MARK_INFO_REC_ARRAY_WRITTEN(func_info, nr_func_info, > + func_info_rec_size); > + MARK_INFO_REC_ARRAY_WRITTEN(line_info, nr_line_info, > + line_info_rec_size); > + MARK_INFO_REC_ARRAY_WRITTEN(jited_line_info, nr_jited_line_info, > + jited_line_info_rec_size); > + MARK_INFO_ARRAY_WRITTEN(prog_tags, nr_prog_tags, __u8[BPF_TAG_SIZE]); > +} > + > +static void mark_btf_info_written(struct bpf_btf_info *info, __u32 info_len) > +{ > + MARK_INFO_ARRAY_WRITTEN(btf, btf_size, __u8); > + MARK_INFO_STR_WRITTEN(name, name_len); > +} > + > +static void mark_link_info_written(struct bpf_link_info *info, __u32 info_len) > +{ > + switch (info->type) { > + case BPF_LINK_TYPE_RAW_TRACEPOINT: > + MARK_INFO_STR_WRITTEN(raw_tracepoint.tp_name, > + raw_tracepoint.tp_name_len); > + break; > + case BPF_LINK_TYPE_ITER: > + MARK_INFO_STR_WRITTEN(iter.target_name, iter.target_name_len); > + break; > + default: > + break; > + } > +} > + > +#undef MARK_INFO_STR_WRITTEN > +#undef MARK_INFO_REC_ARRAY_WRITTEN > +#undef MARK_INFO_ARRAY_WRITTEN > +#undef __MARK_INFO_ARRAY_WRITTEN Ugh... I wasn't a big fan of adding all these "mark_mem_written" across a bunch of APIs to begin with, but this part is really putting me off. I like the bpf_{map,btf,prog,btf}_info_by_fd() improvements you did, but maybe adding these MSan annotations is a bit too much? Applications that really care about this whole "do I read uninitialized memory" business could do their own simpler wrappers on top of libbpf APIs, right? Maybe we should start there first and see if there is more demand to have built-in libbpf support? BTW, is this all needed for ASan as well? One more worry I have is that given we don't exercise all these sanitizers regularly in BPF CI, we'll keep forgetting adding new annotations and all this machinery will start bit rotting. So I'd say that we should first make sure that we have sanitizer builds/runs in BPF CI, before signing up for maintaining these "annotations". > + > int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info, > __u32 *info_len) > { > - return bpf_obj_get_info_by_fd(prog_fd, info, info_len); > + int err; > + > + err = bpf_obj_get_info_by_fd(prog_fd, info, info_len); > + if (!err) > + mark_prog_info_written(info, *info_len); > + > + return err; > } > > int bpf_map_get_info_by_fd(int map_fd, struct bpf_map_info *info, [...]