Re: [PATCH] bpf: Fix out-of-bounds write in trie_get_next_key()

On 10/22/2024 9:45 AM, Byeonguk Jeong wrote:
> trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
> while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
> full paths from the root to leaves. For example, consider a trie with
> max_prefixlen of 8 and nodes with keys 0x00/0, 0x00/1, 0x00/2, ...
> 0x00/8 inserted. Subsequent calls to trie_get_next_key() with a _key whose
> .prefixlen = 8 cause 9 nodes to be written to the node stack of size 8.
>
> Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map")
> Signed-off-by: Byeonguk Jeong <jungbu2855@xxxxxxxxx>
> ---

Tested-by: Hou Tao <houtao1@xxxxxxxxxx>

Without the fix, KASAN reports the out-of-bounds write shown below when
dumping all keys in the lpm-trie through bpf_map_get_next_key().

However, I have a dumb question: does it make sense to reject elements
with prefixlen = 0? I can't think of a use case where a zero-length
prefix would be useful.


 ==================================================================
 BUG: KASAN: slab-out-of-bounds in trie_get_next_key+0x133/0x530
 Write of size 8 at addr ffff8881076c2fc0 by task test_lpm_trie.b/446

 CPU: 0 UID: 0 PID: 446 Comm: test_lpm_trie.b Not tainted 6.11.0+ #52
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
 Call Trace:
  <TASK>
  dump_stack_lvl+0x6e/0xb0
  print_report+0xce/0x610
  ? trie_get_next_key+0x133/0x530
  ? kasan_complete_mode_report_info+0x3c/0x200
  ? trie_get_next_key+0x133/0x530
  kasan_report+0x9c/0xd0
  ? trie_get_next_key+0x133/0x530
  __asan_store8+0x81/0xb0
  trie_get_next_key+0x133/0x530
  __sys_bpf+0x1b03/0x3140
  ? __pfx___sys_bpf+0x10/0x10
  ? __pfx_vfs_write+0x10/0x10
  ? find_held_lock+0x8e/0xb0
  ? ksys_write+0xee/0x180
  ? syscall_exit_to_user_mode+0xb3/0x220
  ? mark_held_locks+0x28/0x90
  ? mark_held_locks+0x28/0x90
  __x64_sys_bpf+0x45/0x60
  x64_sys_call+0x1b2a/0x20d0
  do_syscall_64+0x5d/0x100
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f9c5e9c9c5d
  ......
  </TASK>
 Allocated by task 446:
  kasan_save_stack+0x28/0x50
  kasan_save_track+0x14/0x30
  kasan_save_alloc_info+0x36/0x40
  __kasan_kmalloc+0x84/0xa0
  __kmalloc_noprof+0x214/0x540
  trie_get_next_key+0xa7/0x530
  __sys_bpf+0x1b03/0x3140
  __x64_sys_bpf+0x45/0x60
  x64_sys_call+0x1b2a/0x20d0
  do_syscall_64+0x5d/0x100
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 The buggy address belongs to the object at ffff8881076c2f80
  which belongs to the cache kmalloc-rnd-09-64 of size 64
 The buggy address is located 0 bytes to the right of
  allocated 64-byte region [ffff8881076c2f80, ffff8881076c2fc0)

>  kernel/bpf/lpm_trie.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 0218a5132ab5..9b60eda0f727 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -655,7 +655,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key)
>  	if (!key || key->prefixlen > trie->max_prefixlen)
>  		goto find_leftmost;
>  
> -	node_stack = kmalloc_array(trie->max_prefixlen,
> +	node_stack = kmalloc_array(trie->max_prefixlen + 1,
>  				   sizeof(struct lpm_trie_node *),
>  				   GFP_ATOMIC | __GFP_NOWARN);
>  	if (!node_stack)





