Re: [PATCH bpf-next v9 04/10] bpf: Check potential private stack recursion for progs with async callback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 4, 2024 at 7:37 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
>
> On 11/4/24 6:51 PM, Alexei Starovoitov wrote:
> > On Mon, Nov 4, 2024 at 11:38 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
> >> In previous patch, tracing progs are enabled for private stack since
> >> recursion checking ensures there exists no nested same bpf prog run on
> >> the same cpu.
> >>
> >> But it is still possible for nested bpf subprog run on the same cpu
> >> if the same subprog is called in both main prog and async callback,
> >> or in different async callbacks. For example,
> >>    main_prog
> >>     bpf_timer_set_callback(timer, timer_cb);
> >>     call sub1
> >>    sub1
> >>     ...
> >>    time_cb
> >>     call sub1
> >>
> >> In the above case, nested subprog run for sub1 is possible with one in
> >> process context and the other in softirq context. If this is the case,
> >> the verifier will disable private stack for this bpf prog.
> >>
> >> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> >> ---
> >>   include/linux/bpf_verifier.h |  2 ++
> >>   kernel/bpf/verifier.c        | 42 +++++++++++++++++++++++++++++++-----
> >>   2 files changed, 39 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> >> index 0622c11a7e19..e921589abc72 100644
> >> --- a/include/linux/bpf_verifier.h
> >> +++ b/include/linux/bpf_verifier.h
> >> @@ -669,6 +669,8 @@ struct bpf_subprog_info {
> >>          /* true if bpf_fastcall stack region is used by functions that can't be inlined */
> >>          bool keep_fastcall_stack: 1;
> >>          bool use_priv_stack: 1;
> >> +       bool visited_with_priv_stack_accum: 1;
> >> +       bool visited_with_priv_stack: 1;
> >>
> >>          u8 arg_cnt;
> >>          struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> index 406195c433ea..e01b3f0fd314 100644
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -6118,8 +6118,12 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
> >>                                          idx, subprog_depth);
> >>                                  return -EACCES;
> >>                          }
> >> -                       if (subprog_depth >= BPF_PRIV_STACK_MIN_SIZE)
> >> +                       if (subprog_depth >= BPF_PRIV_STACK_MIN_SIZE) {
> >>                                  subprog[idx].use_priv_stack = true;
> >> +                               subprog[idx].visited_with_priv_stack = true;
> >> +                       }
> >> +               } else {
> >> +                       subprog[idx].visited_with_priv_stack = true;
> > See suggestion for patch 3.
> > It's cleaner to rewrite with a single visited_with_priv_stack = true; statement.
>
> Ack.
>
> >
> >>                  }
> >>          }
> >>   continue_func:
> >> @@ -6220,10 +6224,12 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
> >>   static int check_max_stack_depth(struct bpf_verifier_env *env)
> >>   {
> >>          struct bpf_subprog_info *si = env->subprog_info;
> >> +       enum priv_stack_mode orig_priv_stack_supported;
> >>          enum priv_stack_mode priv_stack_supported;
> >>          int ret, subtree_depth = 0, depth_frame;
> >>
> >>          priv_stack_supported = bpf_enable_priv_stack(env->prog);
> >> +       orig_priv_stack_supported = priv_stack_supported;
> >>
> >>          if (priv_stack_supported != NO_PRIV_STACK) {
> >>                  for (int i = 0; i < env->subprog_cnt; i++) {
> >> @@ -6240,13 +6246,39 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
> >>                                                              priv_stack_supported);
> >>                          if (ret < 0)
> >>                                  return ret;
> >> +
> >> +                       if (priv_stack_supported != NO_PRIV_STACK) {
> >> +                               for (int j = 0; j < env->subprog_cnt; j++) {
> >> +                                       if (si[j].visited_with_priv_stack_accum &&
> >> +                                           si[j].visited_with_priv_stack) {
> >> +                                               /* si[j] is visited by both main/async subprog
> >> +                                                * and another async subprog.
> >> +                                                */
> >> +                                               priv_stack_supported = NO_PRIV_STACK;
> >> +                                               break;
> >> +                                       }
> >> +                                       if (!si[j].visited_with_priv_stack_accum)
> >> +                                               si[j].visited_with_priv_stack_accum =
> >> +                                                       si[j].visited_with_priv_stack;
> >> +                               }
> >> +                       }
> >> +                       if (priv_stack_supported != NO_PRIV_STACK) {
> >> +                               for (int j = 0; j < env->subprog_cnt; j++)
> >> +                                       si[j].visited_with_priv_stack = false;
> >> +                       }
> > I cannot understand what this algorithm is doing.
> > What is the meaning of visited_with_priv_stack_accum ?
>
> The following is an example to show how the algorithm works.
> Let us say we have prog like
>     main_prog0  si[0]
>       sub1      si[1]
>       sub2      si[2]
>     async1      si[3]
>       sub4      si[4]
>       sub2      si[2]
>     async2      si[5]
>       sub4      si[4]
>       sub5      si[6]
>
>
> Total 9 subprograms.
>
> after iteration 1 (main_prog0)
>     visited_with_priv_stack_accum: si[i] = false for i = 0 ... 9
>     visited_with_priv_stack: si[0] = si[1] = si[2] = true, others false
>
>     for all i, visited_with_priv_stack_accum[i] and visited_with_priv_stack[i]
>     is false, so main_prog0 can use priv stack.
>
>     visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
>     visited_with_priv_stack cleared with false.
>
> after iteration 2 (async1)
>     visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
>     visited_with_priv_stack: si[2] = si[3] = si[4] = true, others false
>
>     Here, si[2] appears in both visited_with_priv_stack_accum and
>     visited_with_priv_stack, so async1 cannot have priv stack.
>
>     In my algorithm, I flipped the whole thing to no_priv_stack, which is
>     too conservative. We should just skip async1 and continues.
>
>     Let us say, we say async1 not having priv stack while main_prog0 has.
>
>     /* the same as end of iteration 1 */
>     visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
>     visited_with_priv_stack cleared with false.
>
> after iteration 3 (async2)
>     visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
>     visited_with_priv_stack: si[4] = si[5] = si[6] = true;
>
>     there are no conflict, so async2 can use private stack.
>
>
> If we only have one bit in bpf_subprog_info, for a async tree,
> if marking a subprog to be true and later we found there is a conflict in
> async tree and we need make the whole async subprogs not eligible for priv stack,
> then it will be hard to undo previous markings.
>
> So visited_with_priv_stack_accum is to accumulate "true" results from
> main_prog/async's.

I see. I think it works, but feels complicated.
It feels it should be possible to do without extra flags. Like
check_max_stack_depth_subprog() will know whether it was called
to verify async_cb or not.
So it's just a matter of adding single 'if' to it:
if (subprog[idx].use_priv_stack && checking_async_cb)
   /* reset to false due to potential recursion */
   subprog[idx].use_priv_stack = false;

check_max_stack_depth() starts with i==0,
so reachable and eligible subprogs will be marked with use_priv_stack.
Then check_max_stack_depth_subprog() will be called again
to verify async. If it sees the mark it's a bad case.
what am I missing?





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux