Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > The kernel will panic as follows when attaching fexit to mm_init,
> >
> > [   86.549700] ------------[ cut here ]------------
> > [   86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> > [   86.549713] #PF: supervisor read access in kernel mode
> > [   86.549715] #PF: error_code(0x0000) - not-present page
> > [   86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> > [   86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [   86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> > [   86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> > [   86.549754] Call Trace:
> > [   86.549755]  <TASK>
> > [   86.549757]  check_preempt_curr+0x5e/0x70
> > [   86.549761]  ttwu_do_activate+0xab/0x350
> > [   86.549763]  try_to_wake_up+0x314/0x680
> > [   86.549765]  wake_up_process+0x15/0x20
> > [   86.549767]  insert_work+0xb2/0xd0
> > [   86.549772]  __queue_work+0x20a/0x400
> > [   86.549774]  queue_work_on+0x7b/0x90
> > [   86.549778]  drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> > [   86.549801]  drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> > [   86.549813]  soft_cursor+0x1cb/0x250
> > [   86.549816]  bit_cursor+0x3ce/0x630
> > [   86.549818]  fbcon_cursor+0x139/0x1c0
> > [   86.549821]  ? __pfx_bit_cursor+0x10/0x10
> > [   86.549822]  hide_cursor+0x31/0xd0
> > [   86.549825]  vt_console_print+0x477/0x4e0
> > [   86.549828]  console_flush_all+0x182/0x440
> > [   86.549832]  console_unlock+0x58/0xf0
> > [   86.549834]  vprintk_emit+0x1ae/0x200
> > [   86.549837]  vprintk_default+0x1d/0x30
> > [   86.549839]  vprintk+0x5c/0x90
> > [   86.549841]  _printk+0x58/0x80
> > [   86.549843]  __warn_printk+0x7e/0x1a0
> > [   86.549845]  ? trace_preempt_off+0x1b/0x70
> > [   86.549848]  ? trace_preempt_on+0x1b/0x70
> > [   86.549849]  ? __percpu_counter_init+0x8e/0xb0
> > [   86.549853]  refcount_warn_saturate+0x9f/0x150
> > [   86.549855]  mm_init+0x379/0x390
> > [   86.549859]  bpf_trampoline_6442453440_0+0x23/0x1000
> > [   86.549862]  mm_init+0x5/0x390
> > [   86.549865]  ? mm_alloc+0x4e/0x60
> > [   86.549866]  alloc_bprm+0x8a/0x2e0
> > [   86.549869]  do_execveat_common.isra.0+0x67/0x240
> > [   86.549872]  __x64_sys_execve+0x37/0x50
> > [   86.549874]  do_syscall_64+0x38/0x90
> > [   86.549877]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >
> > The reason is that when we attach the btf id of the function mm_init we
> > actually attach the mm_init defined in init/main.c rather than the
> > function defined in kernel/fork.c. That can be proved by parsing
> > /sys/kernel/btf/vmlinux:
> >
> > [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> > [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> >         'buf' type_id=57
> > [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> > [2496] FUNC 'mm_init' type_id=118 linkage=static
> > [2497] FUNC 'trap_init' type_id=118 linkage=static
> > [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
> >
> > From the above information we can find that the FUNCs above and below
> > mm_init are all defined in init/main.c. So there's no doubt that the
> > mm_init is also the function defined in init/main.c.
> >
> > So when a task calls mm_init and thus the bpf trampoline is triggered it
> > will use the information of the mm_init defined in init/main.c. Then the
> > panic will occur.
> >
> > It seems that there're issues in btf, for example it is unnecessary to
> > generate btf for the functions annonated with __init. We need to improve
> > btf. However we also need to change the function defined in
> > kernel/fork.c to task_mm_init to better distinguish them. After it is
> > renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
> >
> > [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> > [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> >         'mm' type_id=204
> >         'p' type_id=197
> >         'user_ns' type_id=452
> > [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> > [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> > [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> >         'orig' type_id=197
> >         'node' type_id=21
> > [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
> >
> > And then attaching task_mm_init won't panic. Improving the btf will be
> > handled later.
>
> We're not going to hack the kernel to workaround pahole issue.
> Let's fix pahole instead.
> cc-ing Alan for ideas.

Any comment on it, Alan ?
I think we can just skip generating BTF for the functions in
__section(".init.text"),  as these functions will be freed after
kernel init. There won't be use cases for them.

-- 
Regards
Yafang




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux