Re: [PATCHv2 bpf 0/2] bpf: Disable preemption in perf_event_output helpers code

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 7/20/2023 4:57 PM, Jiri Olsa wrote:
> hi,
> we got report of kernel crash [1][3] within bpf_event_output helper.
>
> The reason is the nesting protection code in bpf_event_output that expects
> disabled preemption, which is not guaranteed for programs executed by
> bpf_prog_run_array_cg.
>
> I managed to reproduce on tracing side where we have the same problem
> in bpf_perf_event_output. The reproducer [2] just creates busy uprobe
> and call bpf_perf_event_output helper a lot.

It is a little strange that I can not reproduce the same problem on my
local VM by using the same reproducer from [2]. I have already enabled
both CONFIG_PREEMPT=y and CONFIG_PREEMPT_RCU=y.
>
> v2 changes:
>   - I changed 'Fixes' commits to where I saw we switched from preempt_disable
>     to migrate_disable, but I'm not completely sure about the patch 2, because
>     it was tricky to find, would be nice if somebody could check on that

After digging into the change log, I think the fix tag is correct. In
commit 2a916f2f546c, preempt_disable() is changed to migrate_disable()
in the macro BPF_PROG_RUN_ARRAY(), and in commit 772412176fb9,
BPF_PROG_RUN_ARRAY_FLAGS() was added for
__cgroup_bpf_run_filter_sock_addr() and the implementation of
BPF_PROG_RUN_ARRAY_FLAGS() was basically copied from
BPF_PROG_RUN_ARRAY(). And afterwards, BPF_PROG_RUN_ARRAY_FLAGS was
rename to BPF_PROG_RUN_ARRAY_CG_FLAGS, then
bpf_prog_run_array_cg_flags(), and was renamed to
bpf_prog_run_array_cg() in the final.

>
> thanks,
> jirka
>
>
> [1] https://github.com/cilium/cilium/issues/26756
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/commit/?h=bpf_output_fix_reproducer&id=8054dcc634121b884c7c331329d61d93351d03b5
> [3] slack:
>     [66194.378161] BUG: kernel NULL pointer dereference, address: 0000000000000001
>     [66194.378324] #PF: supervisor instruction fetch in kernel mode
>     [66194.378447] #PF: error_code(0x0010) - not-present page
>     ...
>     [66194.378692] Oops: 0010 [#1] PREEMPT SMP NOPTI
>     ...
>     [66194.380666]  <TASK>
>     [66194.380775]  ? perf_output_sample+0x12a/0x9a0
>     [66194.380902]  ? finish_task_switch.isra.0+0x81/0x280
>     [66194.381024]  ? perf_event_output+0x66/0xa0
>     [66194.381148]  ? bpf_event_output+0x13a/0x190
>     [66194.381270]  ? bpf_event_output_data+0x22/0x40
>     [66194.381391]  ? bpf_prog_dfc84bbde731b257_cil_sock4_connect+0x40a/0xacb
>     [66194.381519]  ? xa_load+0x87/0xe0
>     [66194.381635]  ? __cgroup_bpf_run_filter_sock_addr+0xc1/0x1a0
>     [66194.381759]  ? release_sock+0x3e/0x90
>     [66194.381876]  ? sk_setsockopt+0x1a1/0x12f0
>     [66194.381996]  ? udp_pre_connect+0x36/0x50
>     [66194.382114]  ? inet_dgram_connect+0x93/0xa0
>     [66194.382233]  ? __sys_connect+0xb4/0xe0
>     [66194.382353]  ? udp_setsockopt+0x27/0x40
>     [66194.382470]  ? __pfx_udp_push_pending_frames+0x10/0x10
>     [66194.382593]  ? __sys_setsockopt+0xdf/0x1a0
>     [66194.382713]  ? __x64_sys_connect+0xf/0x20
>     [66194.382832]  ? do_syscall_64+0x3a/0x90
>     [66194.382949]  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
>     [66194.383077]  </TASK>
>
>
> ---
> Jiri Olsa (2):
>       bpf: Disable preemption in bpf_perf_event_output
>       bpf: Disable preemption in bpf_event_output
>
>  kernel/trace/bpf_trace.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> .





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux