On Mon, Aug 21, 2017 at 09:26:57AM +0800, Wanpeng Li wrote:
> 2017-08-21 7:13 GMT+08:00 Adam Borowski <kilobyte@xxxxxxxxxx>:
> > Hi!
> > I'm afraid I keep getting a quite reliable, but random, splat when running
> > KVM:
>
> I reported something similar before. https://lkml.org/lkml/2017/6/29/64

Your problem seems to require OOM; I don't have any memory pressure at all:
running a single 2GB guest while there's nothing big on the host (bloatfox,
xfce, xorg, terminals + some minor junk); 8GB + (untouched) swap. There's no
memory pressure inside the guest either -- none was Linux (I wanted to test
something on hurd, kfreebsd) and I doubt they even got to use all of their
frames.

Also, it doesn't reproduce for me on 4.12.

> > ------------[ cut here ]------------
> > WARNING: CPU: 5 PID: 5826 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x123/0x170
> > Modules linked in: tun nbd arc4 rtl8xxxu mac80211 cfg80211 rfkill nouveau video ttm
> > CPU: 5 PID: 5826 Comm: qemu-system-x86 Not tainted 4.13.0-rc5-vanilla-ubsan-00211-g7f680d7ec315 #1
> > Hardware name: System manufacturer System Product Name/M4A77T, BIOS 2401 05/18/2011
> > task: ffff880207ef0400 task.stack: ffffc900035e4000
> > RIP: 0010:mmu_spte_clear_track_bits+0x123/0x170
> > RSP: 0018:ffffc900035e7ab0 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: 000000010501cc67 RCX: 0000000000000001
> > RDX: dead0000000000ff RSI: ffff88020e501df8 RDI: 0000000004140700
> > RBP: ffffc900035e7ad8 R08: 0000000000000100 R09: 0000000000000003
> > R10: 0000000000000003 R11: 0000000000000005 R12: 000000000010501c
> > R13: ffffea0004140700 R14: ffff88020e1d0000 R15: 0000000000000000
> > FS: 00007f0213fbd700(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000000 CR3: 000000022187f000 CR4: 00000000000006e0
> > Call Trace:
> > drop_spte+0x26/0x130
> > mmu_page_zap_pte+0xc4/0x160
> > kvm_mmu_prepare_zap_page+0x65/0x660
> > kvm_mmu_invalidate_zap_all_pages+0xc5/0x1f0
> > kvm_mmu_invalidate_zap_pages_in_memslot+0x9/0x10
> > kvm_page_track_flush_slot+0x86/0xd0
> > kvm_arch_flush_shadow_memslot+0x9/0x10
> > __kvm_set_memory_region+0x8fb/0x14f0
> > kvm_set_memory_region+0x2f/0x50
> > kvm_vm_ioctl+0x559/0xcc0
> > ? kvm_vcpu_ioctl+0x171/0x620
> > ? __switch_to+0x30b/0x740
> > do_vfs_ioctl+0xbb/0x8d0
> > ? find_vma+0x23/0x100
> > ? __fget_light+0x94/0x110
> > SyS_ioctl+0x86/0xa0
> > entry_SYSCALL_64_fastpath+0x17/0x98
> > RIP: 0033:0x7f021c80ddc7
> > RSP: 002b:00007f0213fbc518 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f021c80ddc7
> > RDX: 00007f0213fbc5b0 RSI: 000000004020ae46 RDI: 000000000000000a
> > RBP: 0000000000000000 R08: 00007f020c1698a0 R09: 0000000000000000
> > R10: 00007f020c1698a0 R11: 0000000000000246 R12: 0000000000000006
> > R13: 00007f022201c000 R14: 0000000000000002 R15: 0000558c3899e550
> > Code: ae fc 01 48 85 c0 75 1c 4c 89 e7 e8 98 de fd ff 48 8b 05 81 ae fc 01 48 85 c0 74 ba 48 85 c3 0f 95 c3 eb b8 48 85 c3 74 e7 eb dd <0f> ff eb 97 4c 89 e7 66 0f 1f 44 00 00 e8 6b de fd ff eb 97 31
> > ---[ end trace 16c196134f0dd0a9 ]---
> >
> > After this, there are hundreds of repeats and lots of secondary damage which
> > kills the host quickly.
> >
> > Usually this happens within a few minutes, but sometimes it takes ~half an
> > hour to reproduce. Because of this, it'd be unpleasant to bisect -- is this
> > problem already known?

-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀                                 -- Genghis Ht'rok'din
⠈⠳⣄⠀⠀⠀⠀