[Bug 196717] CPU: 0 PID: 5405 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0xe7/0x100 [kvm]

bugzilla-daemon@xxxxxxxxxxxxxxxxxxx · Tue, 22 Aug 2017 15:03:54 +0000

https://bugzilla.kernel.org/show_bug.cgi?id=196717

Jeff Cook (jeff@xxxxxxxxxxx) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeff@xxxxxxxxxxx

--- Comment #2 from Jeff Cook (jeff@xxxxxxxxxxx) ---
I've seen this on 4.12 and now on 4.13-rc6. On 4.12, the system-wide impact is
severe: the guests and the host slow down or stop responding altogether. On
4.13-rc6, only one guest crashes and the rest of the system seems to continue
to operate as expected.

I initially get:

[68470.767034] ------------[ cut here ]------------
[68470.767064] WARNING: CPU: 30 PID: 239 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0xf0/0x100 [kvm]
[...]
[68470.767237] CPU: 30 PID: 239 Comm: khugepaged Tainted: P           O    4.13.0-rc6-g14ccee78fc82 #5
[68470.767246] Hardware name: Supermicro SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016
[68470.767255] task: ffff88085b6bc9c0 task.stack: ffffc90006b80000
[68470.767267] RIP: 0010:mmu_spte_clear_track_bits+0xf0/0x100 [kvm]
[68470.767271] RSP: 0018:ffffc90006b83bd8 EFLAGS: 00010246
[68470.767275] RAX: 0000000000000000 RBX: 0000000117f13f77 RCX: dead0000000000ff
[68470.767279] RDX: 0000000000000000 RSI: ffff8802caff9140 RDI: ffffea00045fc4c0
[68470.767283] RBP: ffffc90006b83bf0 R08: 0000000000000001 R09: 0000000000000000
[68470.767287] R10: ffff8803ed9d0008 R11: ffff8803ed9d0000 R12: 0000000000117f13
[68470.767291] R13: ffff880401f10000 R14: ffffffffa07a91f0 R15: ffff8803ed9d0008
[68470.767296] FS:  0000000000000000(0000) GS:ffff88105d580000(0000) knlGS:0000000000000000
[68470.767302] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[68470.767306] CR2: 000000000d070004 CR3: 0000000001a09000 CR4: 00000000003426e0
[68470.767310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[68470.767314] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[68470.767318] Call Trace:
[68470.767332]  drop_spte+0x1a/0xb0 [kvm]
[68470.767342]  kvm_zap_rmapp+0x3b/0x70 [kvm]
[68470.767352]  kvm_unmap_rmapp+0xe/0x20 [kvm]
[68470.767361]  kvm_handle_hva_range+0x139/0x1b0 [kvm]
[68470.767373]  kvm_unmap_hva_range+0x17/0x20 [kvm]
[68470.767382]  kvm_mmu_notifier_invalidate_range_start+0x52/0x90 [kvm]
[68470.767389]  __mmu_notifier_invalidate_range_start+0x55/0x80
[68470.767395]  khugepaged+0x1eb7/0x1ee0
[68470.767403]  ? wait_woken+0x80/0x80
[68470.767408]  kthread+0x125/0x140
[68470.767413]  ? khugepaged_scan_abort.part.6+0x60/0x60
[68470.767417]  ? kthread_create_on_node+0x70/0x70
[68470.767423]  ret_from_fork+0x25/0x30
[68470.767427] Code: 5f 04 00 48 85 c0 75 1c 4c 89 e7 e8 9b 2d fe ff 48 8b 05 d4 5f 04 00 48 85 c0 74 be 48 85 c3 0f 95 c3 eb bc 48 85 c3 74 e7 eb dd <0f> ff eb 9b 4c 89 e7 e8 74 2d fe ff eb a1 66 90 0f 1f 44 00 00
[68470.767463] ---[ end trace 249e3dbfe7765567 ]---
[68470.767478] ------------[ cut here ]------------

Followed by many messages like this:

[68627.864783] BUG: Bad page state in process khugepaged  pfn:126aa9
[68627.864795] page:ffffea00049aaa40 count:0 mapcount:0 mapping:          (null) index:0x1
[68627.864802] flags: 0x17fff0000000014(referenced|dirty)
[68627.864808] raw: 017fff0000000014 0000000000000000 0000000000000001 00000000ffffffff
[68627.864814] raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
[68627.864819] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
[68627.864823] bad because of flags: 0x14(referenced|dirty)
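
For anyone reading the "bad because of flags" line, here is a minimal sketch of
how the 0x14 value maps to the names the kernel prints. The bit positions are an
assumption taken from the first entries of enum pageflags in
include/linux/page-flags.h for kernels of this era; the table and the
decode_page_flags helper are illustrative only and are not part of any kernel
tool. Higher bits (node/zone fields and other flags) are deliberately ignored.

```python
# Low-order page flag bits, assumed from the start of enum pageflags
# in include/linux/page-flags.h (4.13-era layout). Hypothetical table,
# covering only the first five flags.
PAGE_FLAG_BITS = {
    0: "locked",
    1: "error",
    2: "referenced",
    3: "uptodate",
    4: "dirty",
}


def decode_page_flags(flags):
    """Return the names of the known low-order bits that are set in flags."""
    return [
        name
        for bit, name in sorted(PAGE_FLAG_BITS.items())
        if flags & (1 << bit)
    ]


# Value from the "bad because of flags" line in the report above.
print("|".join(decode_page_flags(0x14)))
```

Under that assumed layout, 0x14 has bits 2 and 4 set, which matches the
"referenced|dirty" annotation in the log.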

Full dmesg on rc6 forthcoming.
