On Sun, Dec 12, 2021 at 6:03 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Sun, Dec 12, 2021 at 5:51 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Fri, Dec 10, 2021 at 11:56 AM Hao Luo <haoluo@xxxxxxxxxx> wrote: > > > > > > On Thu, Dec 9, 2021 at 1:45 PM Alexei Starovoitov > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > On Wed, Dec 8, 2021 at 12:04 PM Andrii Nakryiko > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > > > +static const char *reg_type_str(enum bpf_reg_type type) > > > > > > > > +{ > > > > > > > > + static const char * const str[] = { > > > > > > > > + [NOT_INIT] = "?", > > > > > > > > + [SCALAR_VALUE] = "inv", > > > > > > > > + [PTR_TO_CTX] = "ctx", > > > > > > > > + [CONST_PTR_TO_MAP] = "map_ptr", > > > > > > > > + [PTR_TO_MAP_VALUE] = "map_value", > > > > > > > > + [PTR_TO_STACK] = "fp", > > > > > > > > + [PTR_TO_PACKET] = "pkt", > > > > > > > > + [PTR_TO_PACKET_META] = "pkt_meta", > > > > > > > > + [PTR_TO_PACKET_END] = "pkt_end", > > > > > > > > + [PTR_TO_FLOW_KEYS] = "flow_keys", > > > > > > > > + [PTR_TO_SOCKET] = "sock", > > > > > > > > + [PTR_TO_SOCK_COMMON] = "sock_common", > > > > > > > > + [PTR_TO_TCP_SOCK] = "tcp_sock", > > > > > > > > + [PTR_TO_TP_BUFFER] = "tp_buffer", > > > > > > > > + [PTR_TO_XDP_SOCK] = "xdp_sock", > > > > > > > > + [PTR_TO_BTF_ID] = "ptr_", > > > > > > > > + [PTR_TO_PERCPU_BTF_ID] = "percpu_ptr_", > > > > > > > > + [PTR_TO_MEM] = "mem", > > > > > > > > + [PTR_TO_RDONLY_BUF] = "rdonly_buf", > > > > > > > > + [PTR_TO_RDWR_BUF] = "rdwr_buf", > > > > > > > > + [PTR_TO_FUNC] = "func", > > > > > > > > + [PTR_TO_MAP_KEY] = "map_key", > > > > > > > > + }; > > > > > > > > + > > > > > > > > + return str[base_type(type)]; > > > > > > > > > > > > > > well... now we lose the "_or_null" part, that can be pretty important. > > > > > > > Same will happen with RDONLY flag, right? > > > > > > > > > > > > > > What can we do about that? > > > > > > > > > > > > > > > > > > > We can format the string into a global buffer and return the buffer to > > > > > > the caller. A little overhead on string formatting. > > > > > > > > > > Can't use global buffer, because multiple BPF verifications can > > > > > proceed at the same time. > > > > > > > > I think reg_type_str() can still return a const char string > > > > with flags. > > > > It's not going to be a direct str[base_type(type)]; of course. > > > > The str[] would be different. > > > > > > Sorry I didn't fully grasp this comment. Following is my understanding > > > and thoughts so far. Let me know if I missed anything. > > > > I meant that for now we can do: > > > > static const char * const str[] = { > > [NOT_INIT] = "?", > > [PTR_TO_RDONLY_BUF] = "rdonly_buf", > > }; > > switch(type) { > > case PTR_TO_RDONLY_BUF | MAYBE_NULL:return "rdonly_buf_or_null"; > > default: return str[base_type(type)]; > > } > > > > > Right, we can return a char string, but the string must be provided by > > > the caller, which is the annoying part. We could do something as > > > following (solution 1) > > > > > > const char *reg_type_str(..., char *buf) > > > { > > > ... > > > snprintf(buf, ...); > > > return buf; > > > } > > > > > > OR, (solution 2) we may embed a buffer in bpf_verifier_env and pass > > > env into reg_type_str(). reg_type_str() just returns the buffer in > > > env. This doesn't require the caller to prepare the buffer. > > > > > > const char *reg_type_str(..., env) > > > { > > > ... > > > snprintf(env->buf, ...); > > > return env->buf; > > > } > > > > If we go with this approach then passing 'env' is a bit better, > > since 'buf' doesn't need to be allocated on stack. > > > > > However, there will be a caveat when there are more than one > > > reg_type_str in one verbose statement. In solution 1, the input buf > > > has to be different. In solution 2, we just can't do that, because the > > > buffer is implicitly shared. What do you think about solution 2? > > > > There is only one verbose() with two reg_type_str[] right now. > > It can easily be split into two verbose(). > > > > > > > > > > If that doesn't work out we can try: > > > > s/verbose(,"%s", reg_type_str[]) > > > > /verbose(, "%s%s", reg_type_str(), reg_type_flags_str()) > > > > > > This is probably more cumbersome IMHO. > > > > > > Thinking about the implementation of reg_type_flags_str(). Because > > > flags can be combined, I think it would be better to format a dynamic > > > buffer, then we are back to the same problem: caller needs to pass a > > > buffer. Of course, with only two flags right now, we could enumerate > > > all flag combinations and map them to const strings. > > > > Yeah. With 3 or more flags that can be combined the const char approach > > wouldn't scale. So let's try 'env' approach? > > > > The only tweak is that 'log' would be better than 'env'. > > That temp buf can be in struct bpf_verifier_log *log. > > Another idea would be to add bpf specific modifiers in > pointer() in lib/vsprintf.c for verifier's reg_state, reg_type, > and other bpf structs. > It might make print_verifier_state() much cleaner and > would allow easy printing of reg state from anywhere inside > the verifier code. > Currently the error print has minimal info. Like: > verbose(env, "R%d invalid %s access off=%d size=%d\n", > regno, reg_type_str[reg->type], off, size); > With vsnprintf support we could do something like: > verbose(env, "%Brp invalid access off=%d size=%d\n", > reg, off, size); > and pointer() will do full print of struct bpf_reg_state. > > This is probably something to consider long term. ACK. Thanks for the comments. The 'env' proposal looks reasonably well. I sent out v2. Please take a look there.