On 11/19/19 7:42 AM, Andrii Nakryiko wrote: > On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: >> >> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov >> <alexei.starovoitov@xxxxxxxxx> wrote: >>> >>> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote: >>>> Add support for extern variables, provided to BPF program by libbpf. Currently >>>> the following extern variables are supported: >>>> - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is >>>> executing, follows KERNEL_VERSION() macro convention; >>>> - CONFIG_xxx values; a set of values of actual kernel config. Tristate, >>>> boolean, and integer values are supported. Strings are not supported at >>>> the moment. >>>> >>>> All values are represented as 64-bit integers, with the follow value encoding: >>>> - for boolean values, y is 1, n or missing value is 0; >>>> - for tristate values, y is 1, m is 2, n or missing value is 0; >>>> - for integers, the values is 64-bit integer, sign-extended, if negative; if >>>> config value is missing, it's represented as 0, which makes explicit 0 and >>>> missing config value indistinguishable. If this will turn out to be >>>> a problem in practice, we'll need to deal with it somehow. >>> >>> I read that statement as there is no extensibility for such api. >> >> What do you mean exactly? >> >> Are you worried about 0 vs undefined case? I don't think it's going to >> be a problem in practice. Looking at my .config, I see that integer >> config values set to their default values are still explicitly >> specified with those values. E.g., >> >> CONFIG_HZ_1000=y >> CONFIG_HZ=1000 >> >> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it >> set. So while I won't claim that it's the case for any possible >> integer config, it seems to be pretty consistent in practice. >> >> Also, I see a lot of values set to explicit 0, like: >> >> CONFIG_BASE_SMALL=0 >> >> So it seems like integers are typically spelled out explicitly in real >> configs and I think this 0 default is pretty sane. >> >> Next, speaking about extensibility. Once we have BTF type info for >> externs, our possibilities are much better. It will be possible to >> support bool, int, in64 for the same bool value. Libbpf will be able >> to validate the range and fail program load if declared extern type >> doesn't match actual value type and value range. So I think >> extensibility is there, but right now we are enforcing (logically) >> everything to be uin64_t. Unfortunately, with the way externs are done >> in ELF, I don't know neither type nor size, so can't be more strict >> than that. >> >> If we really need to know whether some config value is defined or not, >> regardless of its value, we can have it by convention. E.g., >> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding >> CONFIG_XXX is defined explicitly or not. But I don't want to add that >> until we really have a use case where it matters. >> >>> >>>> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which >>>> expected type (bool, tristate, int), so it doesn't enforce any specific set of >>>> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not >>>> found in config file are set to 0. >>> >>> This is not pretty either. >> >> What exactly: defaulting to zero or not knowing config value's type? >> Given all the options, defaulting to zero seems like the best way to >> go. >> >>> >>>> + >>>> + switch (*value) { >>>> + case 'n': >>>> + *ext_val = 0; >>>> + break; >>>> + case 'y': >>>> + *ext_val = 1; >>>> + break; >>>> + case 'm': >>>> + *ext_val = 2; >>>> + break; >> >> reading some more code from scripts/kconfig/symbol.c, I'll need to >> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some >> more feedback on this version. >> >>>> + case '"': >>>> + pr_warn("extern '%s': strings are not supported\n", >>>> + ext->name); >>>> + err = -EINVAL; >>>> + goto out; >>>> + default: >>>> + errno = 0; >>>> + *ext_val = strtoull(value, &value_end, 10); >>>> + if (errno) { >>>> + err = -errno; >>>> + pr_warn("extern '%s': failed to parse value: %d\n", >>>> + ext->name, err); >>>> + goto out; >>>> + } >>> >>> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever >>> .config has as bytes to the program and let program parse n/y/m, strings and >>> integers. >> >> Config value is not changing. This is an incredible waste of CPU >> resources to re-parse same value over and over again. And it's >> incredibly much worse usability as well. Again, once we have BTF for >> externs, we can just declare values as const char[] and then user will >> be able to do its own parsing. Until then, I think pre-parsing values >> into convenient u64 types are much better and handles all the typical >> cases. > > > One more thing I didn't realize I didn't state explicitly, because > I've been thinking and talking about that for so long now, that it > kind of internalized completely. > > These externs, including CONFIG_XXX ones, are meant to interoperate > nicely with field relocations within BPF CO-RE concept. They are, > among other things, are meant to disable parts of BPF program logic > through verifier's dead code elimination by doing something like: > > > if (CONFIG_SOME_FEATURES_ENABLED) { > BPF_CORE_READ(t, some_extra_field); > /* or */ > bpf_helper_that_only_present_when_feature_is_enabled(); > } else { > /* fallback logic */ > } > > With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer > constant when BPF program is loaded, this is impossible. So it > absolutely must be some sort of easy to use integer constant. Hmm. what difference do you see between u64 and char[] ? The const propagation logic in the verifier should work the same way. If it doesn't it's a bug in the verifier and it's not ok to hack extern api to workaround the bug. What you're advocating with libbpf-side of conversion to integers reminds me of our earlier attempts with cgroup_sysctl hooks where we started with ints only to realize that in practice it's too limited. Then bpf_strtol was introduced and api got much cleaner. Same thing here. Converting char[] into ints or whatever else is the job of the program. Not of libbpf. The verifier can be taught to optimize bpf_strtol() into const when const char[] is passed in. As far as is_enabled() check doing it as 0/1 the way you're proposing has in-band signaling issues that you admitted in the commit log. For is_enabled() may be new builtin() on llvm side would be better? Something like __builtin_preserve_field_info(field, BPF_FIELD_EXISTS) but can be used on _any_ extern function or variable. Like __builtin_is_extern_resolved(extern_name); Then on libbpf side CONFIG_* that are not in config.gz won't be seen by the program (instead of seen as 0 in your proposal) and the code will look like: if (__builtin_is_extern_resolved(CONFIG_NETNS)) { ..do things; } else { } The verifier dead code elimination will take care of branches. The BPF program itself doesn't need to read the value of CONFIG_ it only needs to know whether it was defined. Such builtin would match semantics better. If CONFIG_ is tri-state doing if (*(u8*)CONFIG_FOO == 'y' || *(u8*)CONFIG_FOO == 'm') is cleaner than *(u64*)CONFIG_FOO == 1 || 2. and constant propagation in the verifier should work the same way.