On Sun, Oct 6, 2019 at 4:49 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Sat, Oct 05, 2019 at 11:36:16PM -0700, Andrii Nakryiko wrote:
> > On Fri, Oct 4, 2019 at 10:08 PM Alexei Starovoitov <ast@xxxxxxxxxx> wrote:
> > >
> > > If in-kernel BTF exists parse it and prepare 'struct btf *btf_vmlinux'
> > > for further use by the verifier.
> > > In-kernel BTF is trusted just like kallsyms and other build artifacts
> > > embedded into vmlinux.
> > > Yet run this BTF image through BTF verifier to make sure
> > > that it is valid and it wasn't mangled during the build.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > ---
> > >  include/linux/bpf_verifier.h |  4 ++-
> > >  include/linux/btf.h          |  1 +
> > >  kernel/bpf/btf.c             | 66 ++++++++++++++++++++++++++++++++++++
> > >  kernel/bpf/verifier.c        | 18 ++++++++++
> > >  4 files changed, 88 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 26a6d58ca78c..432ba8977a0a 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -330,10 +330,12 @@ static inline bool bpf_verifier_log_full(const struct bpf_verifier_log *log)
> > >  #define BPF_LOG_STATS  4
> > >  #define BPF_LOG_LEVEL  (BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2)
> > >  #define BPF_LOG_MASK   (BPF_LOG_LEVEL | BPF_LOG_STATS)
> > > +#define BPF_LOG_KERNEL (BPF_LOG_MASK + 1)
> >
> > It's not clear what the numbering scheme is for these flags. Are
> > they independent bits? Only one bit allowed at a time? Only some
> > subset of bits allowed?
> > E.g., if I specify BPF_LOG_KERNEL and BPF_LOG_STATS, will it work?
>
> you cannot. It's kernel internal flag. User space cannot pass it in.
> That's why it's just +1 and will keep floating up when other flags
> are added in the future.
> I considered using something really large instead (like ~0),
> but it's imo cleaner to define it as max_visible_flag + 1.

Ah, I see, maybe small comment, e.g., /* kernel-only flag */ or
something along those lines?

> >
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index 29c7c06c6bd6..848f9d4b9d7e 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -698,6 +698,9 @@ __printf(4, 5) static void __btf_verifier_log_type(struct btf_verifier_env *env,
> > >         if (!bpf_verifier_log_needed(log))
> > >                 return;
> > >
> > > +       if (log->level == BPF_LOG_KERNEL && !fmt)
> > > +               return;
> >
> > This "!fmt" condition is subtle and took me a bit of time to
> > understand. Is the intent to print only verification errors for
> > BPF_LOG_KERNEL mode? Maybe small comment would help?
>
> It's the way btf.c prints types. It's calling btf_verifier_log_type(..fmt=NULL).
> I need to skip all of these, since they're there to debug invalid BTF
> when user space passes it into the kernel.
> Here the same code is processing in-kernel trusted BTF and extra messages
> are completely unnecessary.
> I will add a comment.
>
> >
> > nit: extra empty line here, might as well get rid of it in this change?
>
> yeah. the empty line was there before. Will remove it.
>
> >
> > > +               if (env->log.level == BPF_LOG_KERNEL)
> > > +                       continue;
> > >                 btf_verifier_log(env, "\t%s val=%d\n",
> > >                                  __btf_name_by_offset(btf, enums[i].name_off),
> > >                                  enums[i].val);
> > > @@ -3367,6 +3378,61 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
> > >         return ERR_PTR(err);
> > >  }
> > >
> > > +extern char __weak _binary__btf_vmlinux_bin_start[];
> > > +extern char __weak _binary__btf_vmlinux_bin_end[];
> > > +
> > > +struct btf *btf_parse_vmlinux(void)
> >
> > It's a bit unfortunate to duplicate a bunch of logic of btf_parse()
> > here. I assume you considered extending btf_parse() with extra flag
> > but decided it's better to have separate vmlinux-specific version?
>
> Right. It looks similar, but it's 70-80% different. I actually started
> with combined, but it didn't look good.
>
> > >
> > > +       if (is_priv && !btf_vmlinux) {
> >
> > I'm missing where you are checking that vmlinux BTF (raw data) is
> > present at all? Should this have additional `&&
> > _binary__btf_vmlinux_bin_start` check?
>
> btf_parse_hdr() is doing it.
> But now I'm thinking I should gate it with CONFIG_DEBUG_INFO_BTF.

You mean btf_data_size check? But in that case you'll get error
message printed even though no BTF was generated, so yeah, I guess
gating is cleaner.

>
> >
> > > +               mutex_lock(&bpf_verifier_lock);
> > > +               btf_vmlinux = btf_parse_vmlinux();
> >
> > This is racy, you might end up parsing vmlinux BTF twice. Check
> > `!btf_vmlinux` again under lock?
>
> right. good catch.
>
> > >
> > > +               if (IS_ERR(btf_vmlinux)) {
> >
> > There is an interesting interplay between unprivileged BPF and
> > corrupted vmlinux. If vmlinux BTF is malformed, but system only ever
> > does unprivileged BPF, then we'll never parse vmlinux BTF and won't
> > know it's malformed. But once some privileged BPF does parse and
> > detect problem, all subsequent unprivileged BPFs will fail due to bad
> > BTF, even though they shouldn't use/rely on it. Should something be
> > done about this inconsistency?
>
> I did is_priv check to avoid parsing btf in unpriv, since no unpriv
> progs will ever use this stuff.. (not until cpu hw side channels are fixed).
> But this inconsistency is indeed bad.
> Will refactor to do it always.

Sounds good.

> Broken in-kernel BTF is bad enough sign that either gcc or pahole or kernel
> are broken. In all cases the kernel shouldn't be loading any bpf.
>
> Thanks for the review!
>

I'm intending to go over the rest today-tomorrow, so don't post v2 just yet :)
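
For reference, the "check `!btf_vmlinux` again under lock" idea from the
racy-init comment above could look roughly like the sketch below. This is
a userspace illustration only, not the patch: pthreads and C11 atomics
stand in for the kernel mutex, and parse_vmlinux_stub(), get_btf_vmlinux(),
parse_lock and parse_calls are hypothetical names made up for the example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the parsed vmlinux BTF object. */
struct btf { int parsed; };

static pthread_mutex_t parse_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct btf *) btf_vmlinux;  /* NULL until the first parse */
static int parse_calls;                    /* only touched under parse_lock */

/* Hypothetical parser stub; the real parser is far more involved. */
static struct btf *parse_vmlinux_stub(void)
{
	static struct btf the_btf;

	parse_calls++;
	the_btf.parsed = 1;
	return &the_btf;
}

struct btf *get_btf_vmlinux(void)
{
	/* Fast path: already parsed, no lock needed. */
	struct btf *btf = atomic_load_explicit(&btf_vmlinux,
					       memory_order_acquire);
	if (btf)
		return btf;

	pthread_mutex_lock(&parse_lock);
	/* Re-check under the lock: another thread may have won the race. */
	btf = atomic_load_explicit(&btf_vmlinux, memory_order_relaxed);
	if (!btf) {
		btf = parse_vmlinux_stub();
		atomic_store_explicit(&btf_vmlinux, btf,
				      memory_order_release);
	}
	pthread_mutex_unlock(&parse_lock);
	return btf;
}
```

With the second check in place the parser runs at most once, no matter
how many callers see a NULL pointer on the unlocked fast path.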