Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 21, 2024 at 7:27 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Mon, 2024-10-21 at 19:18 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> > >     0: r7 = *(u16 *)(r1 +0);"
> > >     1: r7 += 0x1ab064b9;"
> > >     2: if r7 & 0x702000 goto 1b;
> > >     3: r7 &= 0x1ee60e;"
> > >     4: r7 += r1;"
> > >     5: if r7 s> 0x37d2 goto +0;"
> > >     6: r0 = 0;"
> > >     7: exit;"
>
> [...]
>
> > > And observe verification log:
> > >
> > >     ...
> > >     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
> > >     5: R1=ctx() R7=ctx(...)
> > >     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >
> > >     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
> > >
> > >     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ... mark_precise 152 steps for r7 ...
> > >     2: R7_w=scalar(...)
> > >     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ...
> > >     BPF program is too large. Processed 257 insn
> > >
> > > The log output shows that checkpoint at label (1) is never created,
> > > because it is suppressed by `skip_inf_loop_check` logic:
> > > a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
> > >    onto stack and proceeds to (3);
> > > b. At (5) checkpoint is created, and this resets
> > >    env->{jmps,insns}_processed.
> > > c. Verification proceeds and reaches `exit`;
> > > d. State saved at step (a) is popped from stack and is_state_visited()
> > >    considers if checkpoint needs to be added, but because
> > >    env->{jmps,insns}_processed had been just reset at step (b)
> > >    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> > > e. Verifier proceeds with current state, which slowly accumulates
> > >    more and more entries in the jump history.
> >
> > I'm still not sure why it grew to thousands of entries in jmp_history.
> > Looking at the above trace jmps_processed grows 1 to 1 with jmp_history_cnt.
> > Also cur->jmp_history_cnt is reset to zero at the same time as
> > jmps processed.
> > So in the above test 75 vs 4 difference came from jmp_history
> > entries that were there before the loop ?
>
>     0: r7 = *(u16 *)(r1 +0);"
>     1: r7 += 0x1ab064b9;"
>     2: if r7 & 0x702000 goto 1b;
>     3: r7 &= 0x1ee60e;"
>     4: r7 += r1;"
>     5: if r7 s> 0x37d2 goto +0;"
>     6: r0 = 0;"
>     7: exit;"
>
> - When 'if' at (2) is processed current state is copied (let's call
>   this copy C), copy is put to the stack for later processing,
>   it's jump history is not cleared.
> - Then current state proceeds verifying 3-5-6-7. At (5) checkpoint is
>   created and env->{jmps,insns}_processed are reset.
> - Then state C is popped from the stack, it goes back to (1) and then (2),
>   at (2) a copy C1 is created but no checkpoint, as env->{jmps,insns}_processed
>   do not meet thresholds. C1's jmp_history is one entry longer then C's.
> - Whole thing repeats until ENOMEM.

I see. Thanks for explaining.
So the bug is actually in reset logic of jmps/insns_processed
coupled with push/pop stack.
I think let's add cur->jmp_history_cnt < 40 check for now
and target bpf tree (assuming no veristat regressions),
but for bpf-next we probably need to follow up.
We can probably remove jmps_processed counter
and replace it with jmp_history_cnt.

Based on the above explanation insns_processed counter is
also bogus.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux