[Bug 218980] New: [VM boot] Guest Kernel hit BUG: kernel NULL pointer dereference, address: 0000000000000010 and WARNING: CPU: 0 PID: 218 at arch/x86/kernel/fpu/core.c:57 x86_task_fpu+0x17/0x20

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



https://bugzilla.kernel.org/show_bug.cgi?id=218980

            Bug ID: 218980
           Summary: [VM boot] Guest Kernel hit BUG: kernel NULL pointer
                    dereference, address: 0000000000000010 and WARNING:
                    CPU: 0 PID: 218 at arch/x86/kernel/fpu/core.c:57
                    x86_task_fpu+0x17/0x20
           Product: Virtualization
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
          Reporter: hongyu.ning@xxxxxxxxx
        Regression: No

Created attachment 306485
  --> https://bugzilla.kernel.org/attachment.cgi?id=306485&action=edit
WARNING/BUG and Call Trace info in dmesg

in an regular linux-next guest kernel regression test setup, recently hit
following BUG and WARNING, likely related to x86/fpu.

--Test Setup--
KVM+QEMU environment to boot normal VM with latest linux-next guest kernel

--Kconfig of Guest Kernel Compile--
refer to attachment

--Error Kernel Log--
(full log refer to attachment)
[    1.958885] ------------[ cut here ]------------
[    1.958928] WARNING: CPU: 0 PID: 218 at arch/x86/kernel/fpu/core.c:57
x86_task_fpu+0x17/0x20
[    1.959004] Modules linked in:
[    1.959034] CPU: 0 PID: 218 Comm: rpcbind Not tainted
6.10.0-rc4-00234-g859e6ded5e41 #1
[    1.959092] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown
2/2/2022
[    1.959147] RIP: 0010:x86_task_fpu+0x17/0x20
[    1.959185] Code: 40 01 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
0f 1f 44 00 00 f6 47 2e 20 48 8d 87 00 25 00 00 75 05 c3 cc cc cc cc <0f> 0b 31
c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90
[    1.959303] RSP: 0000:ffa00000009cfe08 EFLAGS: 00010202
[    1.959340] RAX: ff1100000bcad480 RBX: 0000000000000008 RCX:
0000000000244000
[    1.959394] RDX: 0000000000242000 RSI: ffffffff811368bd RDI:
ff1100000bcaaf80
[    1.959447] RBP: ff1100000bcaaf80 R08: 0000000000000000 R09:
0000000000000000
[    1.959501] R10: ff11000001835a00 R11: 0000000000000300 R12:
ff1100003d231240
[    1.959555] R13: 0000000000000007 R14: 0000000000000000 R15:
ff1100003d2312b8
[    1.959611] FS:  00007fa0908c7dc0(0000) GS:ff1100003d200000(0000)
knlGS:0000000000000000
[    1.959666] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.959715] CR2: 00007f938c6c0a56 CR3: 000000000c926006 CR4:
0000000000771ef0
[    1.959772] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    1.959826] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7:
0000000000000400
[    1.959886] PKRU: 55555554
[    1.959908] Call Trace:
[    1.959931]  <TASK>
[    1.959953]  ? __warn+0x80/0x120
[    1.959993]  ? x86_task_fpu+0x17/0x20
[    1.960024]  ? report_bug+0x1c3/0x1d0
[    1.960058]  ? handle_bug+0x3c/0x70
[    1.960093]  ? exc_invalid_op+0x14/0x70
[    1.960123]  ? asm_exc_invalid_op+0x16/0x20
[    1.960163]  ? free_task+0x2d/0x70
[    1.960197]  ? x86_task_fpu+0x17/0x20
[    1.960228]  arch_release_task_struct+0x27/0x30
[    1.960272]  free_task+0x35/0x70
[    1.960303]  rcu_do_batch+0x1a5/0x460
[    1.960343]  ? rcu_do_batch+0x13b/0x460
[    1.960374]  ? timerqueue_add+0x9b/0xc0
[    1.960409]  rcu_core+0x148/0x300
[    1.960441]  handle_softirqs+0xfa/0x2f0
[    1.960481]  irq_exit_rcu+0x7a/0xc0
[    1.960513]  sysvec_apic_timer_interrupt+0x53/0xd0
[    1.960552]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[    1.960591] RIP: 0033:0x7fa090d7c2c1
[    1.960628] Code: 83 c4 01 48 89 c2 0f b7 04 41 f6 c4 20 75 ea 84 d2 74 32
4c 89 e2 eb 12 90 48 0f be 42 01 48 83 c2 01 84 c0 74 73 0f b7 04 41 <f6> c4 20
74 ea 49 39 d4 72 55 0f b6 02 49 89 d4 84 c0 74 07 c6 02
[    1.960745] RSP: 002b:00007ffe6875df10 EFLAGS: 00000246
[    1.960783] RAX: 000000000000c608 RBX: 00007ffe6875e0f0 RCX:
00007fa090d9f3c0
[    1.960837] RDX: 00007ffe6875e5da RSI: 00007ffe6875e600 RDI:
1999999999999999
[    1.960893] RBP: 00007ffe6875e5d0 R08: ffffffffffffff00 R09:
0000000000000000
[    1.960945] R10: 00007fa090d9eac0 R11: 00007fa090d9f3c0 R12:
00007ffe6875e5da
[    1.960999] R13: 00007ffe6875e9c0 R14: 00007ffe6875e600 R15:
00007ffe6875e5c0
[    1.961054]  </TASK>
[    1.961075] ---[ end trace 0000000000000000 ]---
[    1.961114] BUG: kernel NULL pointer dereference, address: 0000000000000010
[    1.961158] #PF: supervisor read access in kernel mode
[    1.961196] #PF: error_code(0x0000) - not-present page
[    1.961235] PGD bff8067 P4D 0
[    1.961266] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[    1.961304] CPU: 0 PID: 218 Comm: rpcbind Tainted: G        W         
6.10.0-rc4-00234-g859e6ded5e41 #1
[    1.961366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown
2/2/2022
[    1.961412] RIP: 0010:fpstate_free+0x5/0x30
[    1.961441] Code: 41 5c 41 5d 41 5e c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00
00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <4c> 8b 47
10 4d 85 c0 74 11 48 83 c7 40 49 39 f8 74 08 4c 89 c7 e9
[    1.961552] RSP: 0000:ffa00000009cfe10 EFLAGS: 00010246
[    1.961589] RAX: 0000000000000000 RBX: 0000000000000008 RCX:
0000000000244000
[    1.961646] RDX: 0000000000242000 RSI: ffffffff811368bd RDI:
0000000000000000
[    1.961700] RBP: ff1100000bcaaf80 R08: 0000000000000000 R09:
0000000000000000
[    1.961755] R10: ff11000001835a00 R11: 0000000000000300 R12:
ff1100003d231240
[    1.961802] R13: 0000000000000007 R14: 0000000000000000 R15:
ff1100003d2312b8
[    1.961849] FS:  00007fa0908c7dc0(0000) GS:ff1100003d200000(0000)
knlGS:0000000000000000
[    1.961911] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.961957] CR2: 0000000000000010 CR3: 000000000c926006 CR4:
0000000000771ef0
[    1.962005] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    1.962056] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7:
0000000000000400
[    1.962102] PKRU: 55555554
[    1.962118] Call Trace:
[    1.962135]  <TASK>
[    1.962155]  ? __die+0x20/0x70
[    1.962186]  ? page_fault_oops+0x80/0x150
[    1.962216]  ? do_user_addr_fault+0x5f/0x680
[    1.962254]  ? kvm_read_and_reset_apf_flags+0x45/0x60
[    1.962292]  ? exc_page_fault+0x64/0x140
[    1.962322]  ? asm_exc_page_fault+0x22/0x30
[    1.962352]  ? free_task+0x2d/0x70
[    1.962383]  ? fpstate_free+0x5/0x30
[    1.962415]  free_task+0x35/0x70
[    1.962446]  rcu_do_batch+0x1a5/0x460
[    1.962478]  ? rcu_do_batch+0x13b/0x460
[    1.962510]  ? timerqueue_add+0x9b/0xc0
[    1.962540]  rcu_core+0x148/0x300
[    1.962568]  handle_softirqs+0xfa/0x2f0
[    1.962598]  irq_exit_rcu+0x7a/0xc0
[    1.962632]  sysvec_apic_timer_interrupt+0x53/0xd0
[    1.962670]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[    1.962709] RIP: 0033:0x7fa090d7c2c1
[    1.962740] Code: 83 c4 01 48 89 c2 0f b7 04 41 f6 c4 20 75 ea 84 d2 74 32
4c 89 e2 eb 12 90 48 0f be 42 01 48 83 c2 01 84 c0 74 73 0f b7 04 41 <f6> c4 20
74 ea 49 39 d4 72 55 0f b6 02 49 89 d4 84 c0 74 07 c6 02
[    1.962862] RSP: 002b:00007ffe6875df10 EFLAGS: 00000246
[    1.962899] RAX: 000000000000c608 RBX: 00007ffe6875e0f0 RCX:
00007fa090d9f3c0
[    1.962946] RDX: 00007ffe6875e5da RSI: 00007ffe6875e600 RDI:
1999999999999999
[    1.962992] RBP: 00007ffe6875e5d0 R08: ffffffffffffff00 R09:
0000000000000000
[    1.963045] R10: 00007fa090d9eac0 R11: 00007fa090d9f3c0 R12:
00007ffe6875e5da
[    1.963090] R13: 00007ffe6875e9c0 R14: 00007ffe6875e600 R15:
00007ffe6875e5c0
[    1.963137]  </TASK>
[    1.963158] Modules linked in:
[    1.963189] CR2: 0000000000000010
[    1.963220] ---[ end trace 0000000000000000 ]---
[    1.967997] RIP: 0010:fpstate_free+0x5/0x30
[    1.968033] Code: 41 5c 41 5d 41 5e c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00
00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <4c> 8b 47
10 4d 85 c0 74 11 48 83 c7 40 49 39 f8 74 08 4c 89 c7 e9
[    1.968146] RSP: 0000:ffa00000009cfe10 EFLAGS: 00010246
[    1.968183] RAX: 0000000000000000 RBX: 0000000000000008 RCX:
0000000000244000
[    1.968232] RDX: 0000000000242000 RSI: ffffffff811368bd RDI:
0000000000000000
[    1.968287] RBP: ff1100000bcaaf80 R08: 0000000000000000 R09:
0000000000000000
[    1.968335] R10: ff11000001835a00 R11: 0000000000000300 R12:
ff1100003d231240
[    1.968389] R13: 0000000000000007 R14: 0000000000000000 R15:
ff1100003d2312b8
[    1.968436] FS:  00007fa0908c7dc0(0000) GS:ff1100003d200000(0000)
knlGS:0000000000000000
[    1.968490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.968535] CR2: 0000000000000010 CR3: 000000000c926006 CR4:
0000000000771ef0
[    1.968587] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    1.968641] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7:
0000000000000400
[    1.968688] PKRU: 55555554
[    1.968705] Kernel panic - not syncing: Fatal exception in interrupt
[    1.968769] Kernel Offset: disabled
[    1.973595] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---

--Notes on Issue Observed--
a. issue seen since linux-next-6.10-rc3-240611
b. issue reproduced on
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master branch
c. issue hit around 90% of VM booting cycles, not 100% reproducible
d. git bisect points to
https://lore.kernel.org/all/20240605083557.2051480-4-mingo@xxxxxxxxxx as bad
commit
e. attachment info:
e1: kconfig.config -> guest kernel kconfig
e2: vm_boot_null_pointer_panic_and_fpu_warning.log -> full guest kernel booting
log when issue hit
e3: vm_boot_pass.log -> full guest kernel booting log when no issue hit

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux