Re: [PATCH bpf-next 2/4] bpf: track find_equal_scalars history on per-instruction level

Eduard Zingerman <eddyz87@xxxxxxxxx> · Thu, 29 Feb 2024 01:29:08 +0200

On Wed, 2024-02-28 at 15:01 -0800, Andrii Nakryiko wrote:
[...]

> > @@ -3332,6 +3402,12 @@ static int push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_st
> >                           "verifier insn history bug: insn_idx %d cur flags %x new flags %x\n",
> >                           env->insn_idx, cur_hist_ent->flags, insn_flags);
> >                 cur_hist_ent->flags |= insn_flags;
> > +               if (cur_hist_ent->equal_scalars != 0) {
> > +                       verbose(env, "verifier bug: insn_idx %d equal_scalars != 0: %#llx\n",
> > +                               env->insn_idx, cur_hist_ent->equal_scalars);
> > +                       return -EFAULT;
> > +               }
> 
> let's do WARN_ONCE() just like we do for flags? why deviating?

Ok

[...]

> >  /* For given verifier state backtrack_insn() is called from the last insn to
> > @@ -3802,6 +3917,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> >                          */
> >                         return 0;
> >                 } else if (BPF_SRC(insn->code) == BPF_X) {
> > +                       bt_set_equal_scalars(bt, hist);
> >                         if (!bt_is_reg_set(bt, dreg) && !bt_is_reg_set(bt, sreg))
> >                                 return 0;
> >                         /* dreg <cond> sreg
> > @@ -3812,6 +3928,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> >                          */
> >                         bt_set_reg(bt, dreg);
> >                         bt_set_reg(bt, sreg);
> > +                       bt_set_equal_scalars(bt, hist);
> > +               } else if (BPF_SRC(insn->code) == BPF_K) {
> > +                       bt_set_equal_scalars(bt, hist);
> 
> Can you please elaborate why we are doing bt_set_equal_scalars() in
> these three places and not everywhere else? I'm trying to understand
> whether we should do it more generically for any instruction either
> before or after all the bt_set_xxx() calls...

The before call for BPF_X is for situation when dreg/sreg are not yet
tracked precise but one of the registers that gained range because of
this 'if' is already tracked.

The after call for BPF_X is for situation when say dreg is already
tracked precise but sreg is not and there are some registers had same
id as sreg, that gained range when this 'if' was processed.
The equal_scalars_bpf_x_dst() test case covers this situation.
Here it is for your convenience:

    /* Registers r{0,1,2} share same ID when 'if r1 > r3' insn is processed,
     * check that verifier marks r{0,1,2} as precise while backtracking
     * 'if r1 > r3' with r3 already marked.
     */
    SEC("socket")
    __success __log_level(2)
    __flag(BPF_F_TEST_STATE_FREQ)
    __msg("frame0: regs=r3 stack= before 5: (2d) if r1 > r3 goto pc+0")
    __msg("frame0: parent state regs=r0,r1,r2,r3 stack=:")
    __msg("frame0: regs=r0,r1,r2,r3 stack= before 4: (b7) r3 = 7")
    __naked void equal_scalars_bpf_x_dst(void)
    {
    	asm volatile (
    	/* r0 = random number up to 0xff */
    	"call %[bpf_ktime_get_ns];"
    	"r0 &= 0xff;"
    	/* tie r0.id == r1.id == r2.id */
    	"r1 = r0;"
    	"r2 = r0;"
    	"r3 = 7;"
    	"if r1 > r3 goto +0;"
    	/* force r0 to be precise, this eventually marks r1 and r2 as
    	 * precise as well because of shared IDs
    	 */
    	"r4 = r10;"
    	"r4 += r3;"
    	"r0 = 0;"
    	"exit;"
    	:
    	: __imm(bpf_ktime_get_ns)
    	: __clobber_all);
    }

The before call for BPF_K is the same as before call for BPF_X: for
situation when dreg is not yet tracked precise, but one of the
registers that gained range because of this 'if' is already tracked.

The calls are placed at point where conditional jumps are processed
because 'equal_scalars' are only recorded for conditional jumps.

> 
> >                          /* else dreg <cond> K
> >                           * Only dreg still needs precision before
> >                           * this insn, so for the K-based conditional

[...]