Re: [PATCH bpf-next v3 3/4] bpf: verify scalar ids mapping in regsafe() using check_ids()

On Wed, 2023-06-07 at 14:40 -0700, Andrii Nakryiko wrote:
> On Tue, Jun 6, 2023 at 3:24 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > 
> > Make sure that the following unsafe example is rejected by verifier:
> > 
> > 1: r9 = ... some pointer with range X ...
> > 2: r6 = ... unbound scalar ID=a ...
> > 3: r7 = ... unbound scalar ID=b ...
> > 4: if (r6 > r7) goto +1
> > 5: r6 = r7
> > 6: if (r6 > X) goto ...
> > --- checkpoint ---
> > 7: r9 += r7
> > 8: *(u64 *)r9 = Y
> > 
> > This example is unsafe because not all execution paths verify r7 range.
> > Because of the jump at (4) the verifier would arrive at (6) in two states:
> > I.  r6{.id=b}, r7{.id=b} via path 1-6;
> > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> > 
> > Currently regsafe() does not call check_ids() for scalar registers,
> > thus from POV of regsafe() states (I) and (II) are identical. If the
> > path 1-6 is taken by verifier first, and checkpoint is created at (6)
> > the path [1-4, 6] would be considered safe.
> > 
> > This commit updates regsafe() to call check_ids() for precise scalar
> > registers.
> > 
> > To minimize the impact on verification performance, avoid generating
> > bpf_reg_state::id for constant scalar values when processing BPF_MOV
> > in check_alu_op(). Scalar IDs are utilized by find_equal_scalars() to
> > propagate information about value ranges for registers that hold the
> > same value. However, there is no need to propagate range information
> > for constants.
> > 
> > Still, there is some performance impact because of this change.
> > Using veristat to compare number of processed states for selftests
> > object files listed in tools/testing/selftests/bpf/veristat.cfg and
> > Cilium object files from [1] gives the following statistics:
> > 
> > $ ./veristat -e file,prog,states -f "states_pct>10" \
> >     -C master-baseline.log current.log
> > File         Program                         States  (DIFF)
> > -----------  ------------------------------  --------------
> > bpf_xdp.o    tail_handle_nat_fwd_ipv6        +155 (+23.92%)
> > bpf_xdp.o    tail_nodeport_nat_ingress_ipv4  +102 (+27.20%)
> > bpf_xdp.o    tail_rev_nodeport_lb4            +83 (+20.85%)
> > loop6.bpf.o  trace_virtqueue_add_sgs          +25 (+11.06%)
> > 
> > Also test case verifier_search_pruning/allocated_stack has to be
> > updated to avoid conflicts in register ID assignments between cached
> > and new states.
> > 
> > [1] git@xxxxxxxxxx:anakryiko/cilium.git
> > 
> > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > ---
> 
> So I checked it also on our internal BPF object files, and it looks
> mostly good. Here are the only regressions:
> 
> Program                                   States (A)  States (B)  States   (DIFF)
> ----------------------------------------  ----------  ----------  ---------------
> balancer_ingress                               29219       34531  +5312 (+18.18%)
> syar_bind6_protect6                             3257        3599   +342 (+10.50%)
> syar_bind4_protect4                             2590        2931   +341 (+13.17%)
> on_alloc                                         415         526   +111 (+26.75%)
> on_free                                          406         517   +111 (+27.34%)
> pycallcount                                      395         506   +111 (+28.10%)
> resume_context                                   405         516   +111 (+27.41%)
> on_py_event                                      395         506   +111 (+28.10%)
> on_event                                         284         394   +110 (+38.73%)
> handle_cuda_event                                268         378   +110 (+41.04%)
> handle_cuda_launch                               276         386   +110 (+39.86%)
> handle_cuda_malloc_ret                           272         382   +110 (+40.44%)
> handle_cuda_memcpy                               270         380   +110 (+40.74%)
> handle_cuda_memcpy_async                         270         380   +110 (+40.74%)
> handle_pytorch_allocate_ret                      271         381   +110 (+40.59%)
> handle_pytorch_malloc_ret                        272         382   +110 (+40.44%)
> on_event                                         284         394   +110 (+38.73%)
> on_event                                         284         394   +110 (+38.73%)
> syar_task_enter_execve                           309         329     +20 (+6.47%)
> kprobe__security_inode_link                      968         986     +18 (+1.86%)
> kprobe__security_inode_symlink                   838         854     +16 (+1.91%)
> tw_twfw_egress                                   249         251      +2 (+0.80%)
> tw_twfw_ingress                                  250         252      +2 (+0.80%)
> tw_twfw_tc_eg                                    248         250      +2 (+0.81%)
> tw_twfw_tc_in                                    250         252      +2 (+0.80%)
> raw_tracepoint__sched_process_exec               136         139      +3 (+2.21%)
> kprobe_ret__do_filp_open                         869         871      +2 (+0.23%)
> read_erlang_stack                                572         573      +1 (+0.17%)
> 
> 
> They are mostly on small-ish programs. The only mild concern from my
> side is balancer_ingress, which is one of the Katran BPF programs. It
> adds +18% more states (which translates to about 70K more instructions
> verified, up from 350K). I think we can live with this, but it would be
> nice to check why it's happening.

Thank you for reviewing this series.

I looked at the logs that you've shared; the difference is indeed
caused by some scalar registers having a unique ID in the cached state
and no ID in the current state, or vice versa. The !rold->id trick that
we discussed for V2 helps :)

What do you think about an alternative way to exclude unique scalars,
as in the patch below (applied on top of this patch-set)?

--- 8< -------------------------

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 235d7eded565..ece9722dff3b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15149,6 +15149,13 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
        return false;
 }
 
+static bool check_scalar_ids(u32 old_id, u32 cur_id, struct bpf_verifier_env *env)
+{
+       old_id = old_id ? old_id : env->id_gen++;
+       cur_id = cur_id ? cur_id : env->id_gen++;
+       return check_ids(old_id, cur_id, env->idmap_scratch);
+}
+
 static void clean_func_state(struct bpf_verifier_env *env,
                             struct bpf_func_state *st)
 {
@@ -15325,7 +15332,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
                 */
                return range_within(rold, rcur) &&
                       tnum_in(rold->var_off, rcur->var_off) &&
-                      check_ids(rold->id, rcur->id, idmap);
+                      check_scalar_ids(rold->id, rcur->id, env);
        case PTR_TO_MAP_KEY:
        case PTR_TO_MAP_VALUE:
        case PTR_TO_MEM:

------------------------- >8 ---

For me, this patch removes all veristat differences compared to
master. If doing it for real, I'd like to reset env->id_gen at exit
from states_equal() to the value it had at entry (to avoid allocating
too many IDs).
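
For reference, the save/restore I have in mind would look roughly like
the sketch below (a hypothetical illustration only, not part of the
patch above; the wrapper name is made up):

static bool states_equal_save_ids(struct bpf_verifier_env *env,
                                  struct bpf_verifier_state *old,
                                  struct bpf_verifier_state *cur)
{
        /* Ids handed out by check_scalar_ids() for id-less scalars are
         * only meaningful inside idmap_scratch, so roll the global
         * counter back once the comparison is done.
         */
        u32 saved_id_gen = env->id_gen;
        bool equal;

        equal = states_equal(env, old, cur);
        env->id_gen = saved_id_gen;
        return equal;
}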

> 
> I suspect that dropping SCALAR IDs as we discussed (after fixing
> register fill/spill ID generation) might completely mitigate that.
> 
> Overall, LGTM:
> 
> Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> 
> >  kernel/bpf/verifier.c                         | 34 ++++++++++++++++---
> >  .../bpf/progs/verifier_search_pruning.c       |  3 +-
> >  2 files changed, 32 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 2aa60b73f1b5..175ca22b868e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -12933,12 +12933,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >                 if (BPF_SRC(insn->code) == BPF_X) {
> >                         struct bpf_reg_state *src_reg = regs + insn->src_reg;
> >                         struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
> > +                       bool need_id = (src_reg->type == SCALAR_VALUE && !src_reg->id &&
> > +                                       !tnum_is_const(src_reg->var_off));
> > 
> 
> nit: unnecessary outer ()
> 
> >                         if (BPF_CLASS(insn->code) == BPF_ALU64) {
> >                                 /* case: R1 = R2
> >                                  * copy register state to dest reg
> >                                  */
> > -                               if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> > +                               if (need_id)
> >                                         /* Assign src and dst registers the same ID
> >                                          * that will be used by find_equal_scalars()
> >                                          * to propagate min/max range.
> 
> [...]





