On 2024-10-11 7:41 a.m., Peter Zijlstra wrote: > On Wed, Aug 21, 2024 at 01:27:17PM +0800, Mi, Dapeng wrote: >> On 8/1/2024 12:58 PM, Mingwei Zhang wrote: > >>> +static inline u64 __perf_event_time_ctx_now(struct perf_event *event, >>> + struct perf_time_ctx *time, >>> + struct perf_time_ctx *timeguest, >>> + u64 now) >>> +{ >>> + /* >>> + * The exclude_guest event time should be calculated from >>> + * the ctx time - the guest time. >>> + * The ctx time is now + READ_ONCE(time->offset). >>> + * The guest time is now + READ_ONCE(timeguest->offset). >>> + * So the exclude_guest time is >>> + * READ_ONCE(time->offset) - READ_ONCE(timeguest->offset). >>> + */ >>> + if (event->attr.exclude_guest && __this_cpu_read(perf_in_guest)) >> >> Hi Kan, >> >> we see the following the warning when run perf record command after >> enabling "CONFIG_DEBUG_PREEMPT" config item. >> >> [ 166.779208] BUG: using __this_cpu_read() in preemptible [00000000] code: >> perf/9494 >> [ 166.779234] caller is __this_cpu_preempt_check+0x13/0x20 >> [ 166.779241] CPU: 56 UID: 0 PID: 9494 Comm: perf Not tainted >> 6.11.0-rc4-perf-next-mediated-vpmu-v3+ #80 >> [ 166.779245] Hardware name: Quanta Cloud Technology Inc. QuantaGrid >> D54Q-2U/S6Q-MB-MPS, BIOS 3A11.uh 12/02/2022 >> [ 166.779248] Call Trace: >> [ 166.779250] <TASK> >> [ 166.779252] dump_stack_lvl+0x76/0xa0 >> [ 166.779260] dump_stack+0x10/0x20 >> [ 166.779267] check_preemption_disabled+0xd7/0xf0 >> [ 166.779273] __this_cpu_preempt_check+0x13/0x20 >> [ 166.779279] calc_timer_values+0x193/0x200 >> [ 166.779287] perf_event_update_userpage+0x4b/0x170 >> [ 166.779294] ? ring_buffer_attach+0x14c/0x200 >> [ 166.779301] perf_mmap+0x533/0x5d0 >> [ 166.779309] mmap_region+0x243/0xaa0 >> [ 166.779322] do_mmap+0x35b/0x640 >> [ 166.779333] vm_mmap_pgoff+0xf0/0x1c0 >> [ 166.779345] ksys_mmap_pgoff+0x17a/0x250 >> [ 166.779354] __x64_sys_mmap+0x33/0x70 >> [ 166.779362] x64_sys_call+0x1fa4/0x25f0 >> [ 166.779369] do_syscall_64+0x70/0x130 >> >> The season that kernel complains this is __perf_event_time_ctx_now() calls >> __this_cpu_read() in preemption enabled context. >> >> To eliminate the warning, we may need to use this_cpu_read() to replace >> __this_cpu_read(). >> >> diff --git a/kernel/events/core.c b/kernel/events/core.c >> index ccd61fd06e8d..1eb628f8b3a0 100644 >> --- a/kernel/events/core.c >> +++ b/kernel/events/core.c >> @@ -1581,7 +1581,7 @@ static inline u64 __perf_event_time_ctx_now(struct >> perf_event *event, >> * So the exclude_guest time is >> * READ_ONCE(time->offset) - READ_ONCE(timeguest->offset). >> */ >> - if (event->attr.exclude_guest && __this_cpu_read(perf_in_guest)) >> + if (event->attr.exclude_guest && this_cpu_read(perf_in_guest)) >> return READ_ONCE(time->offset) - READ_ONCE(timeguest->offset); >> else >> return now + READ_ONCE(time->offset); > > The saner fix is moving the preempt_disable() in > perf_event_update_userpage() up a few lines, no? Yes, the preempt_disable() is to guarantee consistent time stamps. It makes sense to include the time calculation part in the preempt_disable(). Since the time-related code was updated recently, I will fold all the suggestions into the newly re-based code. Thanks, Kan