On Sat, Jun 1, 2019 at 3:05 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, May 31, 2019 at 11:39:52PM -0700, Andrii Nakryiko wrote:
> > This patch adds a new test program, based on real-world production
> > application, for testing BPF verifier scalability w/ realistic
> > complexity.
>
> Thanks!
>
> > -	const char *pyperf[] = {
> > +	const char *tp_progs[] = {
>
> I had very similar change in my repo :)
>
> > +struct strobemeta_payload {
> > +	/* req_id has valid request ID, if req_meta_valid == 1 */
> > +	int64_t req_id;
> > +	uint8_t req_meta_valid;
> > +	/*
> > +	 * mask has Nth bit set to 1, if Nth metavar was present and
> > +	 * successfully read
> > +	 */
> > +	uint64_t int_vals_set_mask;
> > +	int64_t int_vals[STROBE_MAX_INTS];
> > +	/* len is >0 for present values */
> > +	uint16_t str_lens[STROBE_MAX_STRS];
> > +	/* if map_descrs[i].cnt == -1, metavar is not present/set */
> > +	struct strobe_map_descr map_descrs[STROBE_MAX_MAPS];
> > +	/*
> > +	 * payload has compactly packed values of str and map variables in the
> > +	 * form: strval1\0strval2\0map1key1\0map1val1\0map2key1\0map2val1\0
> > +	 * (and so on); str_lens[i], key_lens[i] and val_lens[i] determines
> > +	 * value length
> > +	 */
> > +	char payload[STROBE_MAX_PAYLOAD];
> > +};
> > +
> > +struct strobelight_bpf_sample {
> > +	uint64_t ktime;
> > +	char comm[TASK_COMM_LEN];
> > +	pid_t pid;
> > +	int user_stack_id;
> > +	int kernel_stack_id;
> > +	int has_meta;
> > +	struct strobemeta_payload metadata;
> > +	/*
> > +	 * makes it possible to pass (<real payload size> + 1) as data size to
> > +	 * perf_submit() to avoid perf_submit's paranoia about passing zero as
> > +	 * size, as it deduces that <real payload size> might be
> > +	 * **theoretically** zero
> > +	 */
> > +	char dummy_safeguard;
> > +};
>
> > +struct bpf_map_def SEC("maps") sample_heap = {
> > +	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
> > +	.key_size = sizeof(uint32_t),
> > +	.value_size = sizeof(struct strobelight_bpf_sample),
> > +	.max_entries = 1,
> > +};
>
> due to this design the stressfulness of the test is
> limited by bpf max map value limitation which comes from
> alloc_percpu limit.
> That makes it not as stressful as I was hoping for :)

What's the limit for per-cpu allocation? You can reduce
STROBE_MAX_STR_LEN to just 1 to save quite a lot of space and push
settings further.

>
> > +#define STROBE_MAX_INTS 25
> > +#define STROBE_MAX_STRS 25
> > +#define STROBE_MAX_MAPS 5
> > +#define STROBE_MAX_MAP_ENTRIES 20
>
> so I could bump STROBE_MAX_INTS to 300 and got:
> verification time 302401 usec // with kasan
> stack depth 464
> processed 40388 insns (limit 1000000) max_states_per_insn 6 total_states 8863 peak_states 8796 mark_read 4110
> test_scale:./strobemeta25.o:OK
>
> which is not that stressful comparing to some of the tests :)

INTS and STRS are less complicated, try playing with MAX_MAPS and
MAX_MAP_ENTRIES. E.g., I can't seem to push farther than
STROBE_MAX_MAPS 15 and STROBE_MAX_MAP_ENTRIES 30, not sure if it's due
to allocation limit. On the other hand, trying STROBE_MAX_MAPS 30 and
STROBE_MAX_MAP_ENTRIES 15 (which should use pretty similar amount of
space), I hit stack size limit. So this combination (and higher
values, if possible), should be a good demo case for loops. I'm
curious for you to try and let me know if you could go higher with
loops support... :)

To save some more space, try removing cnt, tag_len, and id from struct
strobe_map_descr, you can try to reduce val_lens and key_lens to be
just uint8_t. Similar thing can be done to int_vals in struct
strobemeta_valid. I don't want to remove them, as they add to
complexity of the program, but reducing size should be ok.

BTW, it's kind of hard to understand why verif_scale case fails, would
be nice to get better log output (not just stats, which are missing on
failure). So consider that a feature request. ;)

>
> Without unroll:
> verification time 435963 usec // with kasan
> stack depth 488
> processed 52812 insns (limit 1000000) max_states_per_insn 26 total_states 6786 peak_states 1405 mark_read 777
> test_scale:./strobemeta25.o:OK
>
> So things are looking pretty good.
>
> I'll roll your test into my set with few tweaks. Thanks a lot!

sounds good!

>
> btw I consistently see better code and less insn_processed in alu32 mode.
> It's probably time to make it llvm default.
>

yep, I remember I had to explicitly cast a bunch of things to uint64_t
just to avoid those pesky <<= and >>= operations, where possible. :)