Re: [tip: perf/core] perf/core: Fix hardlockup failure caused by perf throttle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jihong and perf experts,

Greetings!

There was "BUG: soft lockup in asm_sysvec_apic_timer_interrupt" in v6.11-rc4
mainline kernel by local syzkaller Intel internal kernel testing.

Bisected and found first bad commit:
"
15def34e2635 perf/core: Fix hardlockup failure caused by perf throttle
"
After reverted above commit on top of v6.11-rc4 mainline kernel, this
issue was gone.

And this issue could be reproduced in 1200s.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240823_212601_asm_sysvec_apic_timer_interrupt
Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.c
Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.prog
Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.report
Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/bisect_info.log
bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240823_212601_asm_sysvec_apic_timer_interrupt/bzImage_47ac09b91befbb6a235ab620c32af719f8208399.tar.gz
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/47ac09b91befbb6a235ab620c32af719f8208399_dmesg.log

"
[   22.518698] hrtimer: interrupt took 13103 ns
[   30.382700] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2079936720 wd_nsec: 2079936828
[  378.038693] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 7719948786 wd_nsec: 7719948793
[  736.508865] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:193160]
[  736.509218] Modules linked in:
[  736.509369] irq event stamp: 15405229
[  736.509539] hardirqs last  enabled at (15405228): [<ffffffff8579c9de>] irqentry_exit+0x3e/0xa0
[  736.509947] hardirqs last disabled at (15405229): [<ffffffff8579ae14>] sysvec_apic_timer_interrupt+0x14/0xc0
[  736.510383] softirqs last  enabled at (9218742): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120
[  736.510776] softirqs last disabled at (9218745): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120
[  736.511167] CPU: 0 UID: 0 PID: 193160 Comm: repro Not tainted 6.11.0-rc4-47ac09b91bef+ #1
[  736.511529] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[  736.512039] RIP: 0010:__rcu_read_unlock+0x284/0x560
[  736.512272] Code: 8f 00 00 00 84 d2 0f 84 87 00 00 00 bf 09 00 00 00 e8 d0 0b dc ff 4d 85 f6 0f 84 68 fe ff ff e8 f2 83 26 00 fb 0f 1f 44 00 00 <e9> 58 fe ff ff 0f 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3
[  736.513091] RSP: 0018:ffff88806c609938 EFLAGS: 00000212
[  736.513330] RAX: 0000000000e956e0 RBX: ffff88806c649600 RCX: 1ffffffff14ae71b
[  736.513648] RDX: 0000000000000000 RSI: 0000000000000101 RDI: 0000000000000000
[  736.513974] RBP: ffff88806c609968 R08: 0000000000000001 R09: fffffbfff14a92b5
[  736.514289] R10: ffffffff8a5495af R11: 0000000000000001 R12: ffff88801379ca00
[  736.514606] R13: ffff88801379ca00 R14: 0000000000000200 R15: ffffffff86e619c0
[  736.514935] FS:  00007f095e148740(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
[  736.515293] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  736.515555] CR2: 0000000020000000 CR3: 0000000021944006 CR4: 0000000000770ef0
[  736.515888] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  736.516215] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[  736.516539] PKRU: 55555554
[  736.516672] Call Trace:
[  736.516798]  <IRQ>
[  736.516907]  ? show_regs+0xa8/0xc0
[  736.517080]  ? watchdog_timer_fn+0x52f/0x6b0
[  736.517284]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  736.517503]  ? __hrtimer_run_queues+0x5d6/0xb10
[  736.517733]  ? __pfx___hrtimer_run_queues+0x10/0x10
[  736.517975]  ? hrtimer_interrupt+0x324/0x7a0
[  736.518193]  ? __sysvec_apic_timer_interrupt+0x10b/0x410
[  736.518443]  ? debug_smp_processor_id+0x20/0x30
[  736.518663]  ? sysvec_apic_timer_interrupt+0x4b/0xc0
[  736.518901]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[  736.519162]  ? __rcu_read_unlock+0x284/0x560
[  736.519370]  ? __rcu_read_unlock+0x27e/0x560
[  736.519579]  __is_insn_slot_addr+0x14c/0x2a0
[  736.519791]  kernel_text_address+0x64/0xe0
[  736.519991]  __kernel_text_address+0x16/0x50
[  736.520199]  unwind_get_return_address+0x8c/0x100
[  736.520424]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[  736.520671]  arch_stack_walk+0xa7/0x170
[  736.520878]  stack_trace_save+0x97/0xd0
[  736.521064]  ? __pfx_stack_trace_save+0x10/0x10
[  736.521276]  ? __pfx_mark_lock.part.0+0x10/0x10
[  736.521499]  kasan_save_stack+0x2c/0x60
[  736.521681]  ? kasan_save_stack+0x2c/0x60
[  736.521870]  ? kasan_save_track+0x18/0x40
[  736.522057]  ? kasan_save_free_info+0x3f/0x60
[  736.522267]  ? __kasan_slab_free+0x115/0x1a0
[  736.522467]  ? kfree+0xfe/0x330
[  736.522622]  ? free_ctx+0x22/0x30
[  736.522788]  ? rcu_core+0x877/0x18f0
[  736.522967]  ? rcu_core_si+0x12/0x20
[  736.523142]  ? handle_softirqs+0x1c7/0x870
[  736.523334]  ? __irq_exit_rcu+0xa9/0x120
[  736.523519]  ? irq_exit_rcu+0x12/0x30
[  736.523693]  ? sysvec_apic_timer_interrupt+0xa5/0xc0
[  736.523922]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[  736.524165]  ? lock_acquire+0x1ff/0x580
[  736.524352]  ? _raw_spin_lock+0x38/0x50
[  736.524534]  ? do_fcntl+0xa95/0x1400
[  736.524709]  ? __x64_sys_fcntl+0x179/0x210
[  736.524906]  ? x64_sys_call+0x5b9/0x20d0
[  736.525094]  ? do_syscall_64+0x6d/0x140
[  736.525276]  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  736.525529]  ? __pfx___lock_acquire+0x10/0x10
[  736.525736]  ? do_raw_spin_trylock+0xbf/0x190
[  736.525949]  ? free_unref_page_commit+0x3c0/0xfb0
[  736.526176]  ? __this_cpu_preempt_check+0x21/0x30
[  736.526399]  ? lock_acquire+0x1de/0x580
[  736.526587]  ? free_ctx+0x22/0x30
[  736.526753]  kasan_save_track+0x18/0x40
[  736.526944]  kasan_save_free_info+0x3f/0x60
[  736.527146]  __kasan_slab_free+0x115/0x1a0
[  736.527341]  ? free_ctx+0x22/0x30
[  736.527503]  kfree+0xfe/0x330
[  736.527655]  ? rcu_core+0x875/0x18f0
[  736.527833]  free_ctx+0x22/0x30
[  736.527991]  rcu_core+0x877/0x18f0
[  736.528165]  ? __pfx_rcu_core+0x10/0x10
[  736.528365]  rcu_core_si+0x12/0x20
[  736.528535]  handle_softirqs+0x1c7/0x870
[  736.528734]  __irq_exit_rcu+0xa9/0x120
[  736.528915]  irq_exit_rcu+0x12/0x30
[  736.529085]  sysvec_apic_timer_interrupt+0xa5/0xc0
[  736.529309]  </IRQ>
[  736.529414]  <TASK>
[  736.529522]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
[  736.529762] RIP: 0010:lock_acquire+0x1ff/0x580
[  736.529974] Code: 48 83 c4 28 e8 72 73 37 04 b8 ff ff ff ff 65 0f c1 05 fd b6 c0 7e 83 f8 01 0f 85 d9 02 00 00 4d 85 ff 74 06 fb 0f 1f 44 00 00 <48> b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00 00 00 00 48 c7
[  736.530786] RSP: 0018:ffff88801aa0fcd0 EFLAGS: 00000206
[  736.531031] RAX: 0000000000000001 RBX: 1ffff11003541f9e RCX: 1ffff11003541f82
[  736.531348] RDX: 1ffff110026f3b07 RSI: 0000000000000001 RDI: 0000000000000000
[  736.531668] RBP: ffff88801aa0fdb8 R08: 0000000000000000 R09: fffffbfff14a92b5
[  736.531994] R10: ffffffff8a5495af R11: 0000000000000001 R12: 0000000000000001
[  736.532318] R13: 0000000000000000 R14: ffff88800f65a028 R15: 0000000000000200
[  736.532660]  ? __pfx_lock_acquire+0x10/0x10
[  736.532860]  ? _raw_spin_unlock+0x31/0x60
[  736.533063]  ? up_write+0x1c0/0x550
[  736.533234]  ? fasync_helper+0x77/0xc0
[  736.533420]  _raw_spin_lock+0x38/0x50
[  736.533595]  ? do_fcntl+0xa95/0x1400
[  736.533775]  do_fcntl+0xa95/0x1400
[  736.533948]  ? __pfx_do_fcntl+0x10/0x10
[  736.534137]  ? trace_hardirqs_on+0x51/0x60
[  736.534336]  ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
[  736.534619]  ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[  736.534853]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[  736.535104]  ? security_file_fcntl+0x9d/0xd0
[  736.535315]  __x64_sys_fcntl+0x179/0x210
[  736.535508]  x64_sys_call+0x5b9/0x20d0
[  736.535687]  do_syscall_64+0x6d/0x140
[  736.535866]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  736.536099] RIP: 0033:0x7f095de3ee5d
[  736.536273] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[  736.537075] RSP: 002b:00007ffea8175048 EFLAGS: 00000217 ORIG_RAX: 0000000000000048
[  736.537411] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f095de3ee5d
[  736.537736] RDX: 0000000000002400 RSI: 0000000000000004 RDI: 0000000000000003
[  736.538053] RBP: 00007ffea8175060 R08: 00007ffea8175060 R09: 00007ffea8175060
[  736.538371] R10: 00007ffea8175060 R11: 0000000000000217 R12: 00007ffea81751d8
[  736.538689] R13: 000000000040547a R14: 0000000000407df8 R15: 00007f095e193000
[  736.539023]  </TASK>
[  736.539132] Kernel panic - not syncing: softlockup: hung tasks
"

Hope it's helpful.

Thanks!

---

If you don't need the following environment to reproduce the problem or if you
already have one reproduced environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
  // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
  // You could change the bzImage_xxx as you want
  // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install

Best Regards,
Thanks!


On 2023-04-17 at 10:46:22 -0000, tip-bot2 for Yang Jihong wrote:
> The following commit has been merged into the perf/core branch of tip:
> 
> Commit-ID:     15def34e2635ab7e0e96f1bc32e1b69609f14942
> Gitweb:        https://git.kernel.org/tip/15def34e2635ab7e0e96f1bc32e1b69609f14942
> Author:        Yang Jihong <yangjihong1@xxxxxxxxxx>
> AuthorDate:    Mon, 27 Feb 2023 10:35:08 +08:00
> Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CommitterDate: Fri, 14 Apr 2023 16:08:22 +02:00
> 
> perf/core: Fix hardlockup failure caused by perf throttle
> 
> commit e050e3f0a71bf ("perf: Fix broken interrupt rate throttling")
> introduces a change in throttling threshold judgment. Before this,
> compare hwc->interrupts and max_samples_per_tick, then increase
> hwc->interrupts by 1, but this commit reverses order of these two
> behaviors, causing the semantics of max_samples_per_tick to change.
> In literal sense of "max_samples_per_tick", if hwc->interrupts ==
> max_samples_per_tick, it should not be throttled, therefore, the judgment
> condition should be changed to "hwc->interrupts > max_samples_per_tick".
> 
> In fact, this may cause the hardlockup to fail, The minimum value of
> max_samples_per_tick may be 1, in this case, the return value of
> __perf_event_account_interrupt function is 1.
> As a result, nmi_watchdog gets throttled, which would stop PMU (Use x86
> architecture as an example, see x86_pmu_handle_irq).
> 
> Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
> Signed-off-by: Yang Jihong <yangjihong1@xxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: https://lkml.kernel.org/r/20230227023508.102230-1-yangjihong1@xxxxxxxxxx
> ---
>  kernel/events/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index fb3e436..82b95b8 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9433,8 +9433,8 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>  		hwc->interrupts = 1;
>  	} else {
>  		hwc->interrupts++;
> -		if (unlikely(throttle
> -			     && hwc->interrupts >= max_samples_per_tick)) {
> +		if (unlikely(throttle &&
> +			     hwc->interrupts > max_samples_per_tick)) {
>  			__this_cpu_inc(perf_throttled_count);
>  			tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
>  			hwc->interrupts = MAX_INTERRUPTS;




[Index of Archives]     [Linux Stable Commits]     [Linux Stable Kernel]     [Linux Kernel]     [Linux USB Devel]     [Linux Video &Media]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux