Re: [PATCH bpf-next 3/6] selftest/bpf/benchs: enhance argp parsing

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 30 Jan 2023 16:07:17 -0800

On Fri, Jan 27, 2023 at 10:14 AM Anton Protopopov <aspsk@xxxxxxxxxxxxx> wrote:
>
> To parse command line the bench utility uses the argp_parse() function. This
> function takes as an argument a parent 'struct argp' structure which defines
> common command line options and an array of children 'struct argp' structures
> which defines additional command line options for particular benchmarks. This
> implementation doesn't allow benchmarks to share option names, e.g., if two
> benchmarks want to use, say, the --option option, then only one of them will
> succeed (the first one encountered in the array).  This will be convenient if
> we could use the same option names in different benchmarks (with the same
> semantics, e.g., --nr_loops=N).
>
> Fix this by calling the argp_parse() function twice. The first call is needed
> to find the benchmark name. Given the name, we can patch the list of argp
> children to only include the correct list of option. This way the right parser
> will be executed. (If there's no a specific list of arguments, then only one
> call is enough.) As was mentioned above, arguments with same names should have
> the same semantics (which means in this case "taking a parameter or not"), but
> can have different description and will have different parsers.
>
> As we now can share option names, this also makes sense to share the option ids.
> Previously every benchmark defined a separate enum, like
>
>     enum {
>            ARG_SMTH = 9000,
>            ARG_SMTH_ELSE = 9001,
>     };
>
> These ids were later used to distinguish between command line options:
>
>     static const struct argp_option opts[] = {
>             { "smth", ARG_SMTH, "SMTH", 0,
>                     "Set number of smth to configure smth"},
>             { "smth_else", ARG_SMTH_ELSE, "SMTH_ELSE", 0,
>                     "Set number of smth_else to configure smth else"},
>             ...
>
> Move arguments id definitions to bench.h such that they are visible to every
> benchmark (and such that there's no need to grep if this number is defined
> already or not).

On the other hand, if each benchmark defines their own set of IDs and
parser, that keeps benchmarks more self-contained. Is there a real
need to centralize everything? I don't see much benefit, tbh.

If we want to centralize, then we can just bypass all the child parser
machinery and have one centralized list of arguments. But I think it's
good to have each benchmark self-contained and independent from other
benchmarks.

>
> Signed-off-by: Anton Protopopov <aspsk@xxxxxxxxxxxxx>
> ---
>  tools/testing/selftests/bpf/bench.c           | 98 +++++++++++++++++--
>  tools/testing/selftests/bpf/bench.h           | 20 ++++
>  .../bpf/benchs/bench_bloom_filter_map.c       |  6 --
>  .../selftests/bpf/benchs/bench_bpf_loop.c     |  4 -
>  .../bpf/benchs/bench_local_storage.c          |  5 -
>  .../bench_local_storage_rcu_tasks_trace.c     |  6 --
>  .../selftests/bpf/benchs/bench_ringbufs.c     |  8 --
>  .../selftests/bpf/benchs/bench_strncmp.c      |  4 -
>  8 files changed, 110 insertions(+), 41 deletions(-)
>

[...]

> +struct argp *bench_name_to_argp(const char *bench_name)
> +{
> +
> +#define _SCMP(NAME) (!strcmp(bench_name, NAME))
> +
> +       if (_SCMP("bloom-lookup") ||
> +           _SCMP("bloom-update") ||
> +           _SCMP("bloom-false-positive") ||
> +           _SCMP("hashmap-without-bloom") ||
> +           _SCMP("hashmap-with-bloom"))
> +               return &bench_bloom_map_argp;
> +
> +       if (_SCMP("rb-libbpf") ||
> +           _SCMP("rb-custom") ||
> +           _SCMP("pb-libbpf") ||
> +           _SCMP("pb-custom"))
> +               return &bench_ringbufs_argp;
> +
> +       if (_SCMP("local-storage-cache-seq-get") ||
> +           _SCMP("local-storage-cache-int-get") ||
> +           _SCMP("local-storage-cache-hashmap-control"))
> +               return &bench_local_storage_argp;
> +
> +       if (_SCMP("local-storage-tasks-trace"))
> +               return &bench_local_storage_rcu_tasks_trace_argp;
> +
> +       if (_SCMP("strncmp-no-helper") ||
> +           _SCMP("strncmp-helper"))
> +               return &bench_strncmp_argp;
> +
> +       if (_SCMP("bpf-loop"))
> +               return &bench_bpf_loop_argp;
> +
> +       /* no extra arguments */
> +       if (_SCMP("count-global") ||
> +           _SCMP("count-local") ||
> +           _SCMP("rename-base") ||
> +           _SCMP("rename-kprobe") ||
> +           _SCMP("rename-kretprobe") ||
> +           _SCMP("rename-rawtp") ||
> +           _SCMP("rename-fentry") ||
> +           _SCMP("rename-fexit") ||
> +           _SCMP("trig-base") ||
> +           _SCMP("trig-tp") ||
> +           _SCMP("trig-rawtp") ||
> +           _SCMP("trig-kprobe") ||
> +           _SCMP("trig-fentry") ||
> +           _SCMP("trig-fentry-sleep") ||
> +           _SCMP("trig-fmodret") ||
> +           _SCMP("trig-uprobe-base") ||
> +           _SCMP("trig-uprobe-with-nop") ||
> +           _SCMP("trig-uretprobe-with-nop") ||
> +           _SCMP("trig-uprobe-without-nop") ||
> +           _SCMP("trig-uretprobe-without-nop") ||
> +           _SCMP("bpf-hashmap-full-update"))
> +               return NULL;
> +
> +#undef _SCMP
> +

it's not good to have to maintain a separate list of benchmark names
here. Let's maybe extend struct bench to specify extra parser and use
that to figure out if we need to run nested child parser?

> +       fprintf(stderr, "%s: bench %s is unknown\n", __func__, bench_name);
> +       exit(1);
> +}
> +
>  static void parse_cmdline_args(int argc, char **argv)
>  {
>         static const struct argp argp = {
> @@ -367,12 +426,35 @@ static void parse_cmdline_args(int argc, char **argv)
>                 .doc = argp_program_doc,
>                 .children = bench_parsers,
>         };
> +       static struct argp *bench_argp;

nit: do you need static?

> +
> +       /* Parse args for the first time to get bench name */
>         if (argp_parse(&argp, argc, argv, 0, NULL, NULL))
>                 exit(1);
> -       if (!env.list && !env.bench_name) {
> +
> +       if (env.list)
> +               return;
> +
> +       if (!env.bench_name) {
>                 argp_help(&argp, stderr, ARGP_HELP_DOC, "bench");
>                 exit(1);
>         }
> +
> +       /* Now check if there are custom options available. If not, then
> +        * everything is done, if yes, then we need to patch bench_parsers
> +        * so that bench_parsers[0] points to the right 'struct argp', and
> +        * bench_parsers[1] terminates the list.
> +        */
> +       bench_argp = bench_name_to_argp(env.bench_name);
> +       if (bench_argp) {
> +               bench_parsers[0].argp = bench_argp;
> +               bench_parsers[0].header = env.bench_name;
> +               memset(&bench_parsers[1], 0, sizeof(bench_parsers[1]));
> +
> +               pos_args = 0;
> +               if (argp_parse(&argp, argc, argv, 0, NULL, NULL))
> +                       exit(1);
> +       }

this also feels like a big hack, why can't you just construct a
single-item array based on child parser, instead of overwriting global
array?

>  }
>
>  static void collect_measurements(long delta_ns);

[...]