On Thu, Oct 19, 2023 at 05:25:26PM -0700, Alexei Starovoitov wrote: > On Thu, Oct 12, 2023 at 8:28 PM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > On Thu, Oct 12, 2023 at 08:02:00AM -0700, Alexei Starovoitov wrote: > > > On Thu, Oct 12, 2023 at 1:14 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > > > On Wed, Oct 11, 2023 at 06:38:56AM -0700, Alexei Starovoitov wrote: > > > > > On Wed, Oct 11, 2023 at 2:01 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote: > > > > > > > > > > > > Currently, we don't check if the branch-taken of a jump is reserved code of > > > > > > ld_imm64. Instead, such a issue is captured in check_ld_imm(). The verifier > > > > > > gives the following log in such case: > > > > > > > > > > > > func#0 @0 > > > > > > 0: R1=ctx(off=0,imm=0) R10=fp0 > > > > > > 0: (18) r4 = 0xffff888103436000 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0) > > > > > > 2: (18) r1 = 0x1d ; R1_w=29 > > > > > > 4: (55) if r4 != 0x0 goto pc+4 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0) > > > > > > 5: (1c) w1 -= w1 ; R1_w=0 > > > > > > 6: (18) r5 = 0x32 ; R5_w=50 > > > > > > 8: (56) if w5 != 0xfffffff4 goto pc-2 > > > > > > mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1 > > > > > > mark_precise: frame0: regs=r5 stack= before 6: (18) r5 = 0x32 > > > > > > 7: R5_w=50 > > > > > > 7: BUG_ld_00 > > > > > > invalid BPF_LD_IMM insn > > > > > > > > > > > > Here the verifier rejects the program because it thinks insn at 7 is an > > > > > > invalid BPF_LD_IMM, but such a error log is not accurate since the issue > > > > > > is jumping to reserved code not because the program contains invalid insn. > > > > > > Therefore, make the verifier check the jump target during check_cfg(). For > > > > > > the same program, the verifier reports the following log: > > > > > > > > > > > > func#0 @0 > > > > > > jump to reserved code from insn 8 to 7 > > > > > > > > > > > > Signed-off-by: Hao Sun <sunhao.th@xxxxxxxxx> > > > > > > --- > > > > > > kernel/bpf/verifier.c | 7 +++++++ > > > > > > 1 file changed, 7 insertions(+) > > > > > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > > > index eed7350e15f4..725ac0b464cf 100644 > > > > > > --- a/kernel/bpf/verifier.c > > > > > > +++ b/kernel/bpf/verifier.c > > > > > > @@ -14980,6 +14980,7 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, > > > > > > { > > > > > > int *insn_stack = env->cfg.insn_stack; > > > > > > int *insn_state = env->cfg.insn_state; > > > > > > + struct bpf_insn *insns = env->prog->insnsi; > > > > > > > > > > > > if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH)) > > > > > > return DONE_EXPLORING; > > > > > > @@ -14993,6 +14994,12 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, > > > > > > return -EINVAL; > > > > > > } > > > > > > > > > > > > + if (e == BRANCH && insns[w].code == 0) { > > > > > > + verbose_linfo(env, t, "%d", t); > > > > > > + verbose(env, "jump to reserved code from insn %d to %d\n", t, w); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > > > > > I don't think we should be changing the verifier to make > > > > > fuzzer logs more readable. > > > > > > > > Taking fuzzer out of consideration, giving users clearer explanation for > > > > such verifier rejection could save a lot of head scratching. > > > > > > Users won't see such errors unless they are actively doing what > > > is not recommended. > > > > > > > Compiler shouldn't generate such program, but its plausible to forget to > > > > account that BPF_LD_IMM64 consists of two instructions when writing > > > > assembly (especially with filter.h-like macros) and have it jump to the 2nd > > > > part of BPF_LD_IMM64. > > > > > > Using macros to write bpf asm code is highly discouraged. > > > All kinds of errors are possible. > > > Bogus jump is just one of such mistakes. > > > Use naked functions and inline asm in C code that > > > both GCC and clang understand then you won't see bad jumps. > > > See selftets/bpf/verifier_*.c as an example. > > > > Understood, thanks for the explanation! > > > > Found them under progs/verifier_*.c inside the bpf selftest directory. > > > > > > > Same with patch 2. The code is fine as-is. > > > > > > > > The only way BPF_SIZE(insn->code) != BPF_DW conditional in check_ld_imm() > > > > can be met right now is when we have a jump to the 2nd part of LD_IMM64; but > > > > what this conditional actually guard against is not straight-forward and > > > > quite confusing[1]. > > > > > > There are plenty of cases in the verifier where we print > > > an error message. Some of them should be impossible due > > > to prior checks. In such cases we don't yell "verifier bug" > > > and are not going to do that in this case either. > > > > I agree, without patch 1 applied, the change to "verfier bug" in patch 2 > > doesn't make sense and is just wrong. The point I'm trying to make is that > > the checks done by verifier are generally clear, you can make sense of why > > certain check are in place just by looking at the code, but > > BPF_SIZE(insn->code) != BPF_DW is _not_ one of them. > > > > I got confused, (reading between the lines I believe) this had Hao puzzled, > > and even Yongsong had to look twice[1] back then; so this check is certainly > > not on-par with others we have in the verifier in terms of clarity, which > > leads to patches here as well as mine a while back. > > > > Perhaps we could reconsider making it more obvious how verifier prevents > > jump to reserved code/2nd instruction of LD_IMM64? > > I agree that the message is confusing. > My point is that people see it only when they code in asm with macros. > Anyone who was doing that a lot saw that message and probably debugged > much worse issues while inserting an asm macro and forgetting to > adjust constants in branches. The code might even load, but will > execute something totally different. > asm macros are a nightmare to debug. Adding more code to the verifier > to help with one particular case is not going to help much. > Use inline asm in C is the right answer for folks that still need asm. > > UX of the verifier sucks and we need to improve. So please focus on impactful > improvements instead of hacking on niche cases. Ok, can't say I agree entirely, but it's a niche case alright, and I'll leave this alone.