Re: [PATCH bpf-next v6 1/3] bpf: Prevent tailcall infinite loop caused by freplace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/10/24 08:58, Alexei Starovoitov wrote:
> On Tue, Oct 8, 2024 at 9:16 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>
>> This patch prevents an infinite loop issue caused by combination of tailcall
>> and freplace.
>>
>> For example:
>>
>> tc_bpf2bpf.c:
>>
>> // SPDX-License-Identifier: GPL-2.0
>>
>> \#include <linux/bpf.h>
>> \#include <bpf/bpf_helpers.h>
>>
>> __noinline
>> int subprog_tc(struct __sk_buff *skb)
>> {
>>         return skb->len * 2;
>> }
>>
>> SEC("tc")
>> int entry_tc(struct __sk_buff *skb)
>> {
>>         return subprog_tc(skb);
>> }
>>
>> char __license[] SEC("license") = "GPL";
>>
>> tailcall_freplace.c:
>>
>> // SPDX-License-Identifier: GPL-2.0
>>
>> \#include <linux/bpf.h>
>> \#include <bpf/bpf_helpers.h>
>>
>> struct {
>>         __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
>>         __uint(max_entries, 1);
>>         __uint(key_size, sizeof(__u32));
>>         __uint(value_size, sizeof(__u32));
>> } jmp_table SEC(".maps");
>>
>> int count = 0;
>>
>> SEC("freplace")
>> int entry_freplace(struct __sk_buff *skb)
>> {
>>         count++;
>>         bpf_tail_call_static(skb, &jmp_table, 0);
>>         return count;
>> }
>>
>> char __license[] SEC("license") = "GPL";
> 
> No need to put the whole selftest into commit log.
> 
> The below few paragraphs are enough:
> 
>> The attach target of entry_freplace is subprog_tc, and the tail callee
>> in entry_freplace is entry_tc.
>>
>> Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace
>> --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count
>> from zero for every time of entry_freplace execution. Kernel will panic,
>> like:
> 
> 99% of this full dump below is nothing but noise.
> Drop it. Above "kernel will panic" is enough.
> 
>> [   15.310490] BUG: TASK stack guard page was hit at (____ptrval____)
>> (stack is (____ptrval____)..(____ptrval____))
>> [   15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI
>> [   15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G           OE
>>    6.10.0-rc6-g026dcdae8d3e-dirty #72
>> [   15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
>> 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> [   15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
>> [   15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>> cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
>> fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
>> [   15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
>> [   15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
>> 0000000000008cb5
>> [   15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
>> ffff9c98808b7e00
>> [   15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
>> 0000000000000000
>> [   15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
>> ffffb500c01af000
>> [   15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
>> 0000000000000000
>> [   15.310490] FS:  00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
>> knlGS:0000000000000000
>> [   15.310490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
>> 00000000000006f0
>> [   15.310490] Call Trace:
>> [   15.310490]  <#DF>
>> [   15.310490]  ? die+0x36/0x90
>> [   15.310490]  ? handle_stack_overflow+0x4d/0x60
>> [   15.310490]  ? exc_double_fault+0x117/0x1a0
>> [   15.310490]  ? asm_exc_double_fault+0x23/0x30
>> [   15.310490]  ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
>> [   15.310490]  </#DF>
>> [   15.310490]  <TASK>
>> [   15.310490]  bpf_prog_85781a698094722f_entry+0x4c/0x64
>> [   15.310490]  bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
>> [   15.310490]  ...
>> [   15.310490]  bpf_prog_85781a698094722f_entry+0x4c/0x64
>> [   15.310490]  bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
>> [   15.310490]  bpf_test_run+0x210/0x370
>> [   15.310490]  ? bpf_test_run+0x128/0x370
>> [   15.310490]  bpf_prog_test_run_skb+0x388/0x7a0
>> [   15.310490]  __sys_bpf+0xdbf/0x2c40
>> [   15.310490]  ? clockevents_program_event+0x52/0xf0
>> [   15.310490]  ? lock_release+0xbf/0x290
>> [   15.310490]  __x64_sys_bpf+0x1e/0x30
>> [   15.310490]  do_syscall_64+0x68/0x140
>> [   15.310490]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [   15.310490] RIP: 0033:0x7f133b52725d
>> [   15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
>> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
>> [   15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX:
>> 0000000000000141
>> [   15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX:
>> 00007f133b52725d
>> [   15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI:
>> 000000000000000a
>> [   15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09:
>> 00007ffddbc102a0
>> [   15.310490] R10: 0000000000000064 R11: 0000000000000206 R12:
>> 0000000000000004
>> [   15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15:
>> 00007f133b6ed000
>> [   15.310490]  </TASK>
>> [   15.310490] Modules linked in: bpf_testmod(OE)
>> [   15.310490] ---[ end trace 0000000000000000 ]---
>> [   15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
>> [   15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>> cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
>> fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
>> [   15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
>> [   15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
>> 0000000000008cb5
>> [   15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
>> ffff9c98808b7e00
>> [   15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
>> 0000000000000000
>> [   15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
>> ffffb500c01af000
>> [   15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
>> 0000000000000000
>> [   15.310490] FS:  00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
>> knlGS:0000000000000000
>> [   15.310490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
>> 00000000000006f0
>> [   15.310490] Kernel panic - not syncing: Fatal exception in interrupt
>> [   15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> 
> Below paragraphs are almost ok, but pls use imperative language.
> Instead of "this patch prevents ..."
> say "Fix the issue by preventing update of ext prog .."
> 
> Consider using grammarly or chatgpt to make it less nerdy.
> 

Got it. Thanks for your suggestion.

>> This patch prevents this panic by preventing updating extended prog to
>> prog_array map and preventing extending a prog, which has been updated
>> to prog_array map, with freplace prog.
>>
>> If a prog or its subprog has been extended by freplace prog, the prog
>> can not be updated to prog_array map.
>>
>> If a prog has been updated to prog_array map, it or its subprog can not
>> be extended by freplace prog.
>>
>> BTW, fix a minor code style issue by replacing 8 spaces with a tab.
>>
>> Reported-by: kernel test robot <lkp@xxxxxxxxx>
>> Closes: https://lore.kernel.org/oe-kbuild-all/202410080455.vy5GT8Vz-lkp@xxxxxxxxx/
>> Signed-off-by: Leon Hwang <leon.hwang@xxxxxxxxx>
>> ---
>>  include/linux/bpf.h     | 21 ++++++++++++++++++++
>>  kernel/bpf/arraymap.c   | 23 +++++++++++++++++++++-
>>  kernel/bpf/core.c       |  1 +
>>  kernel/bpf/syscall.c    | 21 ++++++++++++++------
>>  kernel/bpf/trampoline.c | 43 +++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 102 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 19d8ca8ac960f..213a68c59bdf7 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -1294,6 +1294,12 @@ bool __bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr);
>>  #ifdef CONFIG_BPF_JIT
>>  int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
>>  int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
>> +int bpf_extension_link_prog(struct bpf_tramp_link *link,
>> +                           struct bpf_trampoline *tr,
>> +                           struct bpf_prog *tgt_prog);
>> +int bpf_extension_unlink_prog(struct bpf_tramp_link *link,
>> +                             struct bpf_trampoline *tr,
>> +                             struct bpf_prog *tgt_prog);
>>  struct bpf_trampoline *bpf_trampoline_get(u64 key,
>>                                           struct bpf_attach_target_info *tgt_info);
>>  void bpf_trampoline_put(struct bpf_trampoline *tr);
>> @@ -1383,6 +1389,18 @@ static inline int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
>>  {
>>         return -ENOTSUPP;
>>  }
>> +static inline int bpf_extension_link_prog(struct bpf_tramp_link *link,
>> +                           struct bpf_trampoline *tr,
>> +                           struct bpf_prog *tgt_prog)
>> +{
>> +       return -ENOTSUPP;
>> +}
>> +static inline int bpf_extension_unlink_prog(struct bpf_tramp_link *link,
>> +                             struct bpf_trampoline *tr,
>> +                             struct bpf_prog *tgt_prog)
>> +{
>> +       return -ENOTSUPP;
>> +}
>>  static inline struct bpf_trampoline *bpf_trampoline_get(u64 key,
>>                                                         struct bpf_attach_target_info *tgt_info)
>>  {
>> @@ -1483,6 +1501,9 @@ struct bpf_prog_aux {
>>         bool xdp_has_frags;
>>         bool exception_cb;
>>         bool exception_boundary;
>> +       bool is_extended; /* true if extended by freplace program */
>> +       u64 prog_array_member_cnt; /* counts how many times as member of prog_array */
>> +       struct mutex ext_mutex; /* mutex for is_extended and prog_array_member_cnt */
>>         struct bpf_arena *arena;
>>         /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
>>         const struct btf_type *attach_func_proto;
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 79660e3fca4c1..f9bd63a74eee7 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -947,6 +947,7 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
>>                                    struct file *map_file, int fd)
>>  {
>>         struct bpf_prog *prog = bpf_prog_get(fd);
>> +       bool is_extended;
>>
>>         if (IS_ERR(prog))
>>                 return prog;
>> @@ -956,13 +957,33 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
>>                 return ERR_PTR(-EINVAL);
>>         }
>>
>> +       mutex_lock(&prog->aux->ext_mutex);
>> +       is_extended = prog->aux->is_extended;
>> +       if (!is_extended)
>> +               prog->aux->prog_array_member_cnt++;
>> +       mutex_unlock(&prog->aux->ext_mutex);
>> +       if (is_extended) {
>> +               /* Extended prog can not be tail callee. It's to prevent a
>> +                * potential infinite loop like:
>> +                * tail callee prog entry -> tail callee prog subprog ->
>> +                * freplace prog entry --tailcall-> tail callee prog entry.
>> +                */
>> +               bpf_prog_put(prog);
>> +               return ERR_PTR(-EBUSY);
>> +       }
>> +
>>         return prog;
>>  }
>>
>>  static void prog_fd_array_put_ptr(struct bpf_map *map, void *ptr, bool need_defer)
>>  {
>> +       struct bpf_prog *prog = ptr;
>> +
>> +       mutex_lock(&prog->aux->ext_mutex);
>> +       prog->aux->prog_array_member_cnt--;
>> +       mutex_unlock(&prog->aux->ext_mutex);
>>         /* bpf_prog is freed after one RCU or tasks trace grace period */
>> -       bpf_prog_put(ptr);
>> +       bpf_prog_put(prog);
>>  }
>>
>>  static u32 prog_fd_array_sys_lookup_elem(void *ptr)
>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>> index 5e77c58e06010..233ea78f8f1bd 100644
>> --- a/kernel/bpf/core.c
>> +++ b/kernel/bpf/core.c
>> @@ -131,6 +131,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
>>         INIT_LIST_HEAD_RCU(&fp->aux->ksym_prefix.lnode);
>>  #endif
>>         mutex_init(&fp->aux->used_maps_mutex);
>> +       mutex_init(&fp->aux->ext_mutex);
>>         mutex_init(&fp->aux->dst_mutex);
>>
>>         return fp;
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index a8f1808a1ca54..4a5a44bbb5f50 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -3212,15 +3212,21 @@ static void bpf_tracing_link_release(struct bpf_link *link)
>>  {
>>         struct bpf_tracing_link *tr_link =
>>                 container_of(link, struct bpf_tracing_link, link.link);
>> +       struct bpf_prog *tgt_prog = tr_link->tgt_prog;
>>
>> -       WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
>> -                                               tr_link->trampoline));
>> +       if (link->prog->type == BPF_PROG_TYPE_EXT)
>> +               WARN_ON_ONCE(bpf_extension_unlink_prog(&tr_link->link,
>> +                                                      tr_link->trampoline,
>> +                                                      tgt_prog));
>> +       else
>> +               WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
>> +                                                       tr_link->trampoline));
> 
> This early demux into two helpers looks odd.
> I think it would be cleaner to add this logic to bpf_trampoline_unlink_prog()
> Just extend it with an extra prog argument.
> 

No problem. I can extend bpf_trampoline_unlink_prog().

>>         bpf_trampoline_put(tr_link->trampoline);
>>
>>         /* tgt_prog is NULL if target is a kernel function */
>> -       if (tr_link->tgt_prog)
>> -               bpf_prog_put(tr_link->tgt_prog);
>> +       if (tgt_prog)
>> +               bpf_prog_put(tgt_prog);
>>  }
>>
>>  static void bpf_tracing_link_dealloc(struct bpf_link *link)
>> @@ -3354,7 +3360,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
>>          *   in prog->aux
>>          *
>>          * - if prog->aux->dst_trampoline is NULL, the program has already been
>> -         *   attached to a target and its initial target was cleared (below)
>> +        *   attached to a target and its initial target was cleared (below)
>>          *
>>          * - if tgt_prog != NULL, the caller specified tgt_prog_fd +
>>          *   target_btf_id using the link_create API.
>> @@ -3429,7 +3435,10 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
>>         if (err)
>>                 goto out_unlock;
>>
>> -       err = bpf_trampoline_link_prog(&link->link, tr);
>> +       if (prog->type == BPF_PROG_TYPE_EXT)
>> +               err = bpf_extension_link_prog(&link->link, tr, tgt_prog);
>> +       else
>> +               err = bpf_trampoline_link_prog(&link->link, tr);
> 
> same here.
> 

Ack.

>>         if (err) {
>>                 bpf_link_cleanup(&link_primer);
>>                 link = NULL;
>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
>> index f8302a5ca400d..b14f56046ad4e 100644
>> --- a/kernel/bpf/trampoline.c
>> +++ b/kernel/bpf/trampoline.c
>> @@ -580,6 +580,35 @@ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline
>>         return err;
>>  }
>>
>> +int bpf_extension_link_prog(struct bpf_tramp_link *link,
>> +                           struct bpf_trampoline *tr,
>> +                           struct bpf_prog *tgt_prog)
>> +{
>> +       struct bpf_prog_aux *aux = tgt_prog->aux;
>> +       int err;
>> +
>> +       mutex_lock(&aux->ext_mutex);
> 
> pls use guard(mutex) as I suggested earlier.
> 

Thanks very much to suggest it again.

guard(mutex) is new for me. And I can learn how to use it from
guard(mutex)(&arena->lock).

>> +       if (aux->prog_array_member_cnt) {
>> +               /* Program extensions can not extend target prog when the target
>> +                * prog has been updated to any prog_array map as tail callee.
>> +                * It's to prevent a potential infinite loop like:
>> +                * tgt prog entry -> tgt prog subprog -> freplace prog entry
>> +                * --tailcall-> tgt prog entry.
>> +                */
>> +               err = -EBUSY;
>> +               goto out_unlock;
>> +       }
>> +
>> +       err = bpf_trampoline_link_prog(link, tr);
>> +       if (err)
>> +               goto out_unlock;
>> +
>> +       aux->is_extended = true;
>> +out_unlock:
>> +       mutex_unlock(&aux->ext_mutex);
>> +       return err;
>> +}
>> +
>>  static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
>>  {
>>         enum bpf_tramp_prog_type kind;
>> @@ -609,6 +638,20 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
>>         return err;
>>  }
>>
>> +int bpf_extension_unlink_prog(struct bpf_tramp_link *link,
>> +                             struct bpf_trampoline *tr,
>> +                             struct bpf_prog *tgt_prog)
>> +{
>> +       struct bpf_prog_aux *aux = tgt_prog->aux;
>> +       int err;
>> +
>> +       mutex_lock(&aux->ext_mutex);
>> +       err = bpf_trampoline_unlink_prog(link, tr);
>> +       aux->is_extended = false;
>> +       mutex_unlock(&aux->ext_mutex);
> 
> same suggestions.
> 

Ack.

> pw-bot: cr

Thanks,
Leon





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux