On Wed, Dec 20, 2023 at 9:06 AM Andrei Matei <andreimatei1@xxxxxxxxx> wrote: > > This patch improves the verifier error messages for illegal zero-sized > memory accesses. The previous patch made these messages focus on the > register containing the size of the access, instead of focusing on the > register containing the dereferenced pointer. This was more correct, but > removed useful information about the pointer. This patch brings this > information back and then some. We now have complete error messages that > are also consistent across pointer types. > > Signed-off-by: Andrei Matei <andreimatei1@xxxxxxxxx> > --- > kernel/bpf/verifier.c | 63 ++++++++++++++++++- > .../bpf/progs/verifier_helper_value_access.c | 10 +-- > .../selftests/bpf/progs/verifier_raw_stack.c | 4 +- > 3 files changed, 69 insertions(+), 8 deletions(-) > I left a bunch of nitpicky comments. TBH, I now feel that the simple message you added in the previous patch is probably adequate for the error it is notifying about. Sorry for making you work on adding more and now pushing back against it. But if others like the improved message, then I don't feel strongly enough to argue against :) > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 4409b8f2b0f3..6f333c5c47f8 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -7256,6 +7256,67 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, > } > } > > +/* Helper function for logging an error about an invalid attempt to perform a > + * (possibly) zero-sized memory access. The pointer being dereferenced is in > + * register @ptr_regno, and the size of the access is in register @size_regno. > + * The size register is assumed to either be a constant zero or have a zero lower > + * bound. > + * > + * Logs a message like: > + * invalid zero-size read. Size comes from R2=0. Attempting to dereference *map_value R1: off=[0,4] value_size=48 > + */ > +static void log_zero_size_access_err(struct bpf_verifier_env *env, > + int ptr_regno, > + int size_regno) > +{ > + struct bpf_reg_state *ptr_reg = &cur_regs(env)[ptr_regno]; > + struct bpf_reg_state *size_reg = &cur_regs(env)[size_regno]; > + const bool size_is_const = tnum_is_const(size_reg->var_off); > + const char *ptr_type_str = reg_type_str(env, ptr_reg->type); > + /* allocate a few buffers to be used as parts of the error message */ > + char size_range_buf[32] = {0}, max_size_buf[32] = {0}, off_buf[64] = {0}; > + s64 min_off, max_off; > + > + if (!size_is_const) { > + snprintf(size_range_buf, sizeof(size_range_buf), > + "[0,%lld]", size_reg->umax_value); > + } > + > + if (tnum_is_const(ptr_reg->var_off)) { > + min_off = (s64)ptr_reg->var_off.value + ptr_reg->off; > + snprintf(off_buf, sizeof(off_buf), "%lld", min_off); > + } else { > + min_off = ptr_reg->smin_value + ptr_reg->off; > + max_off = ptr_reg->smax_value + ptr_reg->off; > + snprintf(off_buf, sizeof(off_buf), "[%lld,%lld]", min_off, max_off); > + } > + if var_off is const, smin==smax (unless some unrelated bug). so I'd just normalize this entire if/else to always be printing [%lld,%lld] <- [smin+off, smax+off]. It's a bit more verbose for const cases, but it feels acceptable for me. > + /* attempt to figure out info about the maximum offset that could be allowed */ > + switch (ptr_reg->type) { > + case PTR_TO_MAP_KEY: > + snprintf(max_size_buf, sizeof(max_size_buf), "key_size=%d", ptr_reg->map_ptr->key_size); > + break; > + case PTR_TO_MAP_VALUE: > + snprintf(max_size_buf, sizeof(max_size_buf), "value_size=%d", ptr_reg->map_ptr->value_size); > + break; > + case PTR_TO_PACKET: > + case PTR_TO_PACKET_META: > + snprintf(max_size_buf, sizeof(max_size_buf), "packet_size=%d", ptr_reg->range); > + break; > + case PTR_TO_MEM: > + snprintf(max_size_buf, sizeof(max_size_buf), "mem_size=%d", ptr_reg->mem_size); break missing > + default: > + snprintf(max_size_buf, sizeof(max_size_buf), "max_size=N/A"); > + } again, subjective, but I feel like just calculating relevant memory region size as int and then emitting generic `mem_size=%d` is probably absolutely sufficient here (see example message below) and we won't need to use yet another buffer for formatting. > + > + verbose(env, "invalid %szero-size read. Size comes from R%d=%s. " > + "Attempting to dereference *%s R%d: off=%s %s\n", It's very subjective, and if others like it I won't insist, but the style of this multi-sentence message just doesn't fit with the rest of the verifier log format. Something along the lines of: "invalid possible zero-size read of R2 bytes from R1 mem_size=%d" it's simple and provides all the necessary information (what's the actual R2 range is not relevant to this message, the important is that zero size is possible; user should see R2 state few lines earlier in the log where it's assigned anyways; same for R1 mem_size would help a bit to identify what that thing is (e.g., which map it belongs to), but again, R1 full state will be somewhere very close by, so maybe even that is not necessary) > + size_is_const ? "" : "possibly ", > + size_regno, size_is_const ? "0" : size_range_buf, > + ptr_type_str, ptr_regno, off_buf, max_size_buf); > +} > + > + > /* verify arguments to helpers or kfuncs consisting of a pointer and an access > * size. > * > @@ -7298,7 +7359,7 @@ static int check_mem_size_reg(struct bpf_verifier_env *env, > } > > if (reg->umin_value == 0 && !zero_size_allowed) { > - verbose(env, "R%d invalid zero-sized read\n", regno); > + log_zero_size_access_err(env, regno-1, regno); code style: regno - 1 > return -EACCES; > } > [...] pw-bot: cr