On Fri, Jan 12, 2024 at 07:10:50PM +0900, Masami Hiramatsu (Google) wrote: > Hi, > > Here is the 6th version of the series to re-implement the fprobe on > function-graph tracer. The previous version is; > > https://lore.kernel.org/all/170290509018.220107.1347127510564358608.stgit@devnote2/ > > This version fixes use-after-unregister bug and arm64 stack unwinding > bug [13/36], add an improvement for multiple interrupts during push > operation[20/36], keep SAVE_REGS until BPF and fprobe_event using > ftrace_regs[26/36], also reorder the patches[30/36][31/36] so that new > fprobe can switch to SAVE_ARGS[32/36] safely. > This series also temporarily adds a DIRECT_CALLS bugfix[1/36], which > should be pushed separatedly as a stable bugfix. > > There are some TODOs: > - Add s390x and loongarch support to fprobe (multiple fgraph). > - Fix to get the symbol address from ftrace entry address on arm64. > (This should be done in BPF trace event) > - Cleanup code, rename some terms(offset/index) and FGRAPH_TYPE_BITMAP > part should be merged to FGRAPH_TYPE_ARRAY patch. hi, I'm getting kasan bugs below when running bpf selftests on top of this patchset.. I think it's probably the reason I see failures in some bpf kprobe_multi/fprobe tests so far I couldn't find the reason.. still checking ;-) jirka --- [ 507.585913][ T697] BUG: KASAN: slab-out-of-bounds in ftrace_push_return_trace.isra.0+0x346/0x370 [ 507.586747][ T697] Write of size 8 at addr ffff888148193ff8 by task test_progs/697 [ 507.587460][ T697] [ 507.587713][ T697] CPU: 2 PID: 697 Comm: test_progs Tainted: G OE 6.7.0+ #309 d8e2cbcdc10865c6eb2d28ed0cbf958842aa75a8 [ 507.588821][ T697] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 [ 507.589681][ T697] Call Trace: [ 507.590044][ T697] <TASK> [ 507.590357][ T697] dump_stack_lvl+0xf6/0x180 [ 507.590807][ T697] print_report+0xc4/0x610 [ 507.591259][ T697] ? fixup_red_left+0x5/0x20 [ 507.591781][ T697] kasan_report+0xbe/0xf0 [ 507.592241][ T697] ? ftrace_push_return_trace.isra.0+0x346/0x370 [ 507.592928][ T697] ? ftrace_push_return_trace.isra.0+0x346/0x370 [ 507.593535][ T697] ? __pfx_text_poke_loc_init+0x10/0x10 [ 507.594076][ T697] ? ftrace_replace_code+0x17a/0x230 [ 507.594586][ T697] ftrace_push_return_trace.isra.0+0x346/0x370 [ 507.595192][ T697] ? __pfx_text_poke_loc_init+0x10/0x10 [ 507.595747][ T697] function_graph_enter_ops+0xbb/0x2d0 [ 507.596271][ T697] ? ftrace_replace_code+0x17a/0x230 [ 507.596784][ T697] ? __pfx_function_graph_enter_ops+0x10/0x10 [ 507.597353][ T697] ? preempt_count_sub+0x14/0xc0 [ 507.598576][ T697] ? __pfx_text_poke_loc_init+0x10/0x10 [ 507.599145][ T697] ? __pfx_fuse_sync_fs+0x10/0x10 [ 507.599718][ T697] ftrace_graph_func+0x142/0x270 [ 507.600293][ T697] ? __pfx_text_poke_loc_init+0x10/0x10 [ 507.600892][ T697] ? __pfx_fuse_conn_put.part.0+0x10/0x10 [ 507.601484][ T697] 0xffffffffa0560097 [ 507.602067][ T697] ? __pfx_fuse_conn_put.part.0+0x10/0x10 [ 507.602715][ T697] ? text_poke_loc_init+0x5/0x2e0 [ 507.603288][ T697] ? __pfx_fuse_conn_put.part.0+0x10/0x10 [ 507.603923][ T697] text_poke_loc_init+0x5/0x2e0 [ 507.604468][ T697] ftrace_replace_code+0x17a/0x230 [ 507.605071][ T697] ftrace_modify_all_code+0x131/0x1a0 [ 507.605663][ T697] ftrace_startup+0x10b/0x210 [ 507.606200][ T697] register_ftrace_graph+0x313/0x8a0 [ 507.606805][ T697] ? register_ftrace_graph+0x3fe/0x8a0 [ 507.607427][ T697] register_fprobe_ips.part.0+0x25a/0x3f0 [ 507.608090][ T697] bpf_kprobe_multi_link_attach+0x49e/0x850 [ 507.608781][ T697] ? __pfx_bpf_kprobe_multi_link_attach+0x10/0x10 [ 507.609500][ T697] ? __debug_check_no_obj_freed+0x1d8/0x3a0 [ 507.610194][ T697] ? __fget_light+0x96/0xe0 [ 507.610741][ T697] __sys_bpf+0x307a/0x3180 [ 507.611286][ T697] ? __pfx___sys_bpf+0x10/0x10 [ 507.611838][ T697] ? __kasan_slab_free+0x12d/0x1c0 [ 507.612434][ T697] ? audit_log_exit+0x8e0/0x1960 [ 507.613003][ T697] ? kmem_cache_free+0x19d/0x460 [ 507.613644][ T697] ? rcu_is_watching+0x34/0x60 [ 507.614202][ T697] ? lockdep_hardirqs_on_prepare+0xe/0x250 [ 507.614865][ T697] ? seqcount_lockdep_reader_access.constprop.0+0x105/0x120 [ 507.615662][ T697] ? seqcount_lockdep_reader_access.constprop.0+0xb2/0x120 [ 507.616431][ T697] __x64_sys_bpf+0x44/0x60 [ 507.616940][ T697] do_syscall_64+0x87/0x1b0 [ 507.617495][ T697] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 507.618179][ T697] RIP: 0033:0x7ff2edca6b4d [ 507.618745][ T697] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b 92 0c 00 f7 d8 64 89 01 48 [ 507.620863][ T697] RSP: 002b:00007ffe2e58a8f8 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 507.621749][ T697] RAX: ffffffffffffffda RBX: 00007ffe2e58b018 RCX: 00007ff2edca6b4d [ 507.622580][ T697] RDX: 0000000000000040 RSI: 00007ffe2e58a970 RDI: 000000000000001c [ 507.623395][ T697] RBP: 00007ffe2e58a910 R08: 0000000000000001 R09: 00007ffe2e58a970 [ 507.624198][ T697] R10: 0000000000000007 R11: 0000000000000206 R12: 0000000000000001 [ 507.625029][ T697] R13: 0000000000000000 R14: 00007ff2eddee000 R15: 0000000000edcdb0 [ 507.625989][ T697] </TASK> [ 507.626377][ T697] [ 507.626678][ T697] Allocated by task 697: [ 507.627175][ T697] kasan_save_stack+0x1c/0x40 [ 507.627757][ T697] kasan_save_track+0x10/0x30 [ 507.628992][ T697] __kasan_kmalloc+0xa6/0xb0 [ 507.629502][ T697] register_ftrace_graph+0x42b/0x8a0 [ 507.630071][ T697] register_fprobe_ips.part.0+0x25a/0x3f0 [ 507.630649][ T697] bpf_kprobe_multi_link_attach+0x49e/0x850 [ 507.631218][ T697] __sys_bpf+0x307a/0x3180 [ 507.631662][ T697] __x64_sys_bpf+0x44/0x60 [ 507.632128][ T697] do_syscall_64+0x87/0x1b0 [ 507.632577][ T697] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 507.633136][ T697] [ 507.633399][ T697] The buggy address belongs to the object at ffff888148194000 [ 507.633399][ T697] which belongs to the cache kmalloc-4k of size 4096 [ 507.634667][ T697] The buggy address is located 8 bytes to the left of [ 507.634667][ T697] allocated 4096-byte region [ffff888148194000, ffff888148195000) [ 507.636028][ T697] [ 507.636392][ T697] The buggy address belongs to the physical page: [ 507.637106][ T697] page:ffffea0005206400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x148190 [ 507.638206][ T697] head:ffffea0005206400 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 507.639082][ T697] flags: 0x17ffe000000840(slab|head|node=0|zone=2|lastcpupid=0x3fff) [ 507.639858][ T697] page_type: 0xffffffff() [ 507.640306][ T697] raw: 0017ffe000000840 ffff888100043a40 ffffea0005a89610 ffffea0005d0de10 [ 507.641244][ T697] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000 [ 507.642184][ T697] page dumped because: kasan: bad access detected [ 507.642853][ T697] [ 507.643152][ T697] Memory state around the buggy address: [ 507.643773][ T697] ffff888148193e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 507.644661][ T697] ffff888148193f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 507.645490][ T697] >ffff888148193f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 507.646263][ T697] ^ [ 507.647025][ T697] ffff888148194000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 507.647821][ T697] ffff888148194080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 507.648588][ T697] ================================================================== --- [ 367.302586][ T973] BUG: KASAN: slab-out-of-bounds in ftrace_push_return_trace.isra.0+0x346/0x370 [ 367.303380][ T973] Write of size 8 at addr ffff888174e58ff8 by task kworker/u12:10/973 [ 367.304243][ T973] [ 367.304599][ T973] CPU: 3 PID: 973 Comm: kworker/u12:10 Tainted: G OE 6.7.0+ #312 a13024221ef8bb3aaeade334af809d35027f09e3 [ 367.305894][ T973] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 [ 367.306834][ T973] Workqueue: writeback wb_workfn (flush-253:0) [ 367.307415][ T973] Call Trace: [ 367.307786][ T973] <TASK> [ 367.308130][ T973] dump_stack_lvl+0xf6/0x180 [ 367.308610][ T973] print_report+0xc4/0x610 [ 367.309090][ T973] ? fixup_red_left+0x5/0x20 [ 367.309557][ T973] kasan_report+0xbe/0xf0 [ 367.309926][ T973] ? ftrace_push_return_trace.isra.0+0x346/0x370 [ 367.310404][ T973] ? ftrace_push_return_trace.isra.0+0x346/0x370 [ 367.310932][ T973] ? __pfx_unwind_get_return_address+0x10/0x10 [ 367.311529][ T973] ? arch_stack_walk+0x9e/0xf0 [ 367.312092][ T973] ftrace_push_return_trace.isra.0+0x346/0x370 [ 367.312769][ T973] ? __pfx_function_graph_enter_ops+0x10/0x10 [ 367.313414][ T973] ? __pfx_unwind_get_return_address+0x10/0x10 [ 367.313946][ T973] function_graph_enter_ops+0xbb/0x2d0 [ 367.314518][ T973] ? arch_stack_walk+0x9e/0xf0 [ 367.315016][ T973] ? __pfx_function_graph_enter_ops+0x10/0x10 [ 367.315540][ T973] ? ftrace_graph_func+0x173/0x270 [ 367.316084][ T973] ? __pfx_preempt_count_sub+0x10/0x10 [ 367.317791][ T973] ftrace_graph_func+0x142/0x270 [ 367.318231][ T973] ? __pfx_unwind_get_return_address+0x10/0x10 [ 367.318690][ T973] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 367.319159][ T973] 0xffffffffa0568097 [ 367.319542][ T973] ? preempt_count_sub+0x5/0xc0 [ 367.320073][ T973] ? preempt_count_sub+0x14/0xc0 [ 367.320610][ T973] ? preempt_count_sub+0x14/0xc0 [ 367.321226][ T973] ? unwind_get_return_address+0x5/0x50 [ 367.321862][ T973] unwind_get_return_address+0x5/0x50 [ 367.322449][ T973] arch_stack_walk+0x9e/0xf0 [ 367.322997][ T973] ? worker_thread+0x393/0x680 [ 367.323553][ T973] stack_trace_save+0x91/0xd0 [ 367.324087][ T973] ? __pfx_stack_trace_save+0x10/0x10 [ 367.324708][ T973] ? stack_trace_save+0x5/0xd0 [ 367.325244][ T973] ? kvm_sched_clock_read+0xd/0x20 [ 367.325822][ T973] kasan_save_stack+0x1c/0x40 [ 367.326326][ T973] ? kasan_save_stack+0x1c/0x40 [ 367.326828][ T973] ? kasan_save_track+0x10/0x30 [ 367.327298][ T973] ? __kasan_slab_alloc+0x85/0x90 [ 367.327803][ T973] ? kmem_cache_alloc+0x159/0x460 [ 367.328383][ T973] ? mempool_alloc+0xe1/0x270 [ 367.328925][ T973] ? bio_alloc_bioset+0x2c6/0x440 [ 367.329435][ T973] ? iomap_do_writepage+0x784/0x1090 [ 367.329993][ T973] ? write_cache_pages+0x2be/0x720 [ 367.330506][ T973] ? iomap_writepages+0x39/0x70 [ 367.331015][ T973] ? xfs_vm_writepages+0xf6/0x150 [ 367.331583][ T973] ? do_writepages+0x104/0x320 [ 367.332163][ T973] ? __writeback_single_inode+0x9e/0x6e0 [ 367.332754][ T973] ? writeback_sb_inodes+0x384/0x830 [ 367.333329][ T973] ? __writeback_inodes_wb+0x70/0x130 [ 367.333915][ T973] ? wb_writeback+0x486/0x5b0 [ 367.334437][ T973] ? wb_workfn+0x68b/0x8c0 [ 367.334942][ T973] ? process_one_work+0x48c/0x970 [ 367.335574][ T973] ? __pfx_function_graph_enter_ops+0x10/0x10 [ 367.336257][ T973] ? kmem_cache_alloc+0x415/0x460 [ 367.336818][ T973] ? __pfx_preempt_count_sub+0x10/0x10 [ 367.337426][ T973] ? iomap_do_writepage+0x784/0x1090 [ 367.337947][ T973] ? write_cache_pages+0x2be/0x720 [ 367.338475][ T973] ? iomap_writepages+0x39/0x70 [ 367.338981][ T973] ? xfs_vm_writepages+0xf6/0x150 [ 367.339514][ T973] ? do_writepages+0x104/0x320 [ 367.340065][ T973] ? ftrace_graph_func+0x173/0x270 [ 367.340658][ T973] ? __pfx_preempt_count_sub+0x10/0x10 [ 367.341225][ T973] ? mempool_alloc+0xe1/0x270 [ 367.341729][ T973] ? 0xffffffffa0568097 [ 367.342155][ T973] ? 0xffffffffa0568097 [ 367.342598][ T973] ? preempt_count_sub+0x5/0xc0 [ 367.343109][ T973] ? mempool_alloc+0xe1/0x270 [ 367.343590][ T973] kasan_save_track+0x10/0x30 [ 367.344040][ T973] __kasan_slab_alloc+0x85/0x90 [ 367.344542][ T973] kmem_cache_alloc+0x159/0x460 [ 367.345024][ T973] ? __pfx_mempool_alloc_slab+0x10/0x10 [ 367.345518][ T973] mempool_alloc+0xe1/0x270 [ 367.345997][ T973] ? __pfx_mempool_alloc+0x10/0x10 [ 367.346526][ T973] ? 0xffffffffa0568097 [ 367.346973][ T973] ? ftrace_push_return_trace.isra.0+0x17c/0x370 [ 367.348141][ T973] bio_alloc_bioset+0x2c6/0x440 [ 367.348653][ T973] ? __pfx_bio_alloc_bioset+0x10/0x10 [ 367.349221][ T973] iomap_do_writepage+0x784/0x1090 [ 367.349763][ T973] ? __pfx_iomap_do_writepage+0x10/0x10 [ 367.350306][ T973] ? __pfx_iomap_do_writepage+0x10/0x10 [ 367.350860][ T973] write_cache_pages+0x2be/0x720 [ 367.351366][ T973] ? __pfx_iomap_do_writepage+0x10/0x10 [ 367.351921][ T973] ? __pfx_write_cache_pages+0x10/0x10 [ 367.352468][ T973] ? ftrace_graph_func+0x173/0x270 [ 367.352967][ T973] ? ksys_fadvise64_64+0x8c/0xc0 [ 367.357367][ T973] ? __pfx_lock_release+0x10/0x10 [ 367.357872][ T973] ? __pfx_iomap_do_writepage+0x10/0x10 [ 367.358440][ T973] iomap_writepages+0x39/0x70 [ 367.358953][ T973] xfs_vm_writepages+0xf6/0x150 [ 367.359505][ T973] ? __pfx_xfs_vm_writepages+0x10/0x10 [ 367.360108][ T973] ? __pfx_xfs_vm_writepages+0x10/0x10 [ 367.360740][ T973] ? __pfx_xfs_vm_writepages+0x10/0x10 [ 367.361343][ T973] ? do_writepages+0x164/0x320 [ 367.361893][ T973] do_writepages+0x104/0x320 [ 367.362414][ T973] ? __pfx_do_writepages+0x10/0x10 [ 367.362957][ T973] ? 0xffffffffa0568097 [ 367.363449][ T973] ? __pfx___writeback_single_inode+0x10/0x10 [ 367.364114][ T973] __writeback_single_inode+0x9e/0x6e0 [ 367.364705][ T973] writeback_sb_inodes+0x384/0x830 [ 367.365295][ T973] ? __pfx_writeback_sb_inodes+0x10/0x10 [ 367.365857][ T973] ? __pfx_writeback_sb_inodes+0x10/0x10 [ 367.366505][ T973] ? super_trylock_shared+0x46/0x70 [ 367.367085][ T973] __writeback_inodes_wb+0x70/0x130 [ 367.367764][ T973] wb_writeback+0x486/0x5b0 [ 367.368300][ T973] ? __pfx_wb_writeback+0x10/0x10 [ 367.368860][ T973] ? 0xffffffffa0568097 [ 367.369339][ T973] ? __pfx_lock_release+0x10/0x10 [ 367.369940][ T973] wb_workfn+0x68b/0x8c0 [ 367.370473][ T973] ? _raw_spin_unlock_irqrestore+0x37/0xb0 [ 367.371131][ T973] ? __pfx_wb_workfn+0x10/0x10 [ 367.371600][ T973] ? ftrace_graph_func+0x173/0x270 [ 367.372110][ T973] ? __pfx_wb_workfn+0x10/0x10 [ 367.372609][ T973] ? 0xffffffffa0568097 [ 367.373044][ T973] ? __pfx_wb_workfn+0x10/0x10 [ 367.373507][ T973] ? process_one_work+0x483/0x970 [ 367.373990][ T973] ? rcu_is_watching+0x34/0x60 [ 367.374474][ T973] process_one_work+0x48c/0x970 [ 367.374956][ T973] ? __pfx_process_one_work+0x10/0x10 [ 367.375474][ T973] ? 0xffffffffa0568097 [ 367.375903][ T973] ? __list_add_valid_or_report+0x33/0xc0 [ 367.376518][ T973] ? worker_thread+0x375/0x680 [ 367.377065][ T973] worker_thread+0x393/0x680 [ 367.378058][ T973] ? __pfx_worker_thread+0x10/0x10 [ 367.378649][ T973] kthread+0x1ad/0x1f0 [ 367.379076][ T973] ? kthread+0xf2/0x1f0 [ 367.379513][ T973] ? __pfx_kthread+0x10/0x10 [ 367.380003][ T973] ret_from_fork+0x2d/0x50 [ 367.380483][ T973] ? __pfx_kthread+0x10/0x10 [ 367.380993][ T973] ret_from_fork_asm+0x1b/0x30 [ 367.381486][ T973] </TASK> [ 367.381815][ T973] [ 367.382070][ T973] Allocated by task 719: [ 367.382508][ T973] kasan_save_stack+0x1c/0x40 [ 367.382988][ T973] kasan_save_track+0x10/0x30 [ 367.383477][ T973] __kasan_kmalloc+0xa6/0xb0 [ 367.383960][ T973] register_ftrace_graph+0x42b/0x8a0 [ 367.384454][ T973] register_fprobe_ips.part.0+0x25a/0x3f0 [ 367.385075][ T973] bpf_kprobe_multi_link_attach+0x49e/0x850 [ 367.385713][ T973] __sys_bpf+0x307a/0x3180 [ 367.386217][ T973] __x64_sys_bpf+0x44/0x60 [ 367.386716][ T973] do_syscall_64+0x87/0x1b0 [ 367.387220][ T973] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 367.387878][ T973] [ 367.388193][ T973] The buggy address belongs to the object at ffff888174e59000 [ 367.388193][ T973] which belongs to the cache kmalloc-4k of size 4096 [ 367.389731][ T973] The buggy address is located 8 bytes to the left of [ 367.389731][ T973] allocated 4096-byte region [ffff888174e59000, ffff888174e5a000) [ 367.391266][ T973] [ 367.391580][ T973] The buggy address belongs to the physical page: [ 367.392285][ T973] page:ffffea0005d39600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x174e58 [ 367.393398][ T973] head:ffffea0005d39600 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 367.394437][ T973] flags: 0x17ffe000000840(slab|head|node=0|zone=2|lastcpupid=0x3fff) [ 367.395335][ T973] page_type: 0xffffffff() [ 367.395837][ T973] raw: 0017ffe000000840 ffff888100043a40 ffffea000674d010 ffffea0004bc8c10 [ 367.396786][ T973] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000 [ 367.397728][ T973] page dumped because: kasan: bad access detected [ 367.398458][ T973] [ 367.398766][ T973] Memory state around the buggy address: [ 367.399411][ T973] ffff888174e58e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 367.400296][ T973] ffff888174e58f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 367.401189][ T973] >ffff888174e58f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 367.402093][ T973] ^ [ 367.402983][ T973] ffff888174e59000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 367.403879][ T973] ffff888174e59080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 367.404769][ T973] ==================================================================