On Thu, Jun 18, 2020 at 12:09 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > Andrii Nakryiko wrote: > > Add selftest that validates variable-length data reading and concatentation > > with one big shared data array. This is a common pattern in production use for > > monitoring and tracing applications, that potentially can read a lot of data, > > but usually reads much less. Such pattern allows to determine precisely what > > amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency. > > > > This is the first BPF selftest that at all looks at and tests > > bpf_probe_read_str()-like helper's return value, closing a major gap in BPF > > testing. It surfaced the problem with bpf_probe_read_kernel_str() returning > > 0 on success, instead of amount of bytes successfully read. > > > > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx> > > --- > > [...] > > > +/* .data */ > > +int payload2_len1 = -1; > > +int payload2_len2 = -1; > > +int total2 = -1; > > +char payload2[MAX_LEN + MAX_LEN] = { 1 }; > > + > > +SEC("raw_tp/sys_enter") > > +int handler64(void *regs) > > +{ > > + int pid = bpf_get_current_pid_tgid() >> 32; > > + void *payload = payload1; > > + u64 len; > > + > > + /* ignore irrelevant invocations */ > > + if (test_pid != pid || !capture) > > + return 0; > > + > > + len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]); > > + if (len <= MAX_LEN) { > > Took me a bit grok this. You are relying on the fact that in errors, > such as a page fault, will encode to a large u64 value and so you > verifier is happy. But most of my programs actually want to distinguish > between legitimate errors on the probe vs buffer overrun cases. What buffer overrun? bpf_probe_read_str() family cannot return higher value than MAX_LEN. If you want to detect truncated strings, then you can attempt reading MAX_LEN + 1 and then check that the return result is MAX_LEN exactly. But still, that would be something like: u64 len; len = bpf_probe_read_str(payload, MAX_LEN + 1, &buf); if (len > MAX_LEN) return -1; if (len == MAX_LEN) { /* truncated */ } else { /* full string */ } > > Can we make these tests do explicit check for errors. For example, > > if (len < 0) goto abort; > > But this also breaks your types here. This is what I was trying to > point out in the 1/2 patch thread. Wanted to make the point here as > well in case it wasn't clear. Not sure I did the best job explaining. > I can write *a correct* C code in a lot of ways such that it will not pass verifier verification, not sure what that will prove, though. Have you tried using the pattern with two ifs with no-ALU32? Does it work? Also you are cheating in your example (in patch #1 thread). You are exiting on the first error and do not attempt to read any more data after that. In practice, you want to get as much info as possible, even if some of string reads fail (e.g., because argv might not be paged in, but env is, or vice versa). So you'll end up doing this: len = bpf_probe_read_str(...); if (len >= 0 && len <= MAX_LEN) { payload += len; } ... ... and of course it spectacularly fails in no-ALU32. To be completely fair, this is a result of Clang optimization and Yonghong is trying to deal with it as we speak. Switching int to long for helpers doesn't help it either. But there are better code patterns (unsigned len + single if check) that do work with both ALU32 and no-ALU32. And I just double-checked, this pattern keeps working for ALU32 with both int and long types, so maybe there are unnecessary bit shifts, but at least code is still verifiable. So my point stands. int -> long helps in some cases and doesn't hurt in others, so I argue that it's a good thing to do :) > > + payload += len; > > + payload1_len1 = len; > > + } > > + > > + len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]); > > + if (len <= MAX_LEN) { > > + payload += len; > > + payload1_len2 = len; > > + } > > + > > + total1 = payload - (void *)payload1; > > + > > + return 0; > > +}