On 23/08/2017 22:39, Jeff Cook wrote:
> Hi all,
>
> I've seen several posts in the list that seem to have only experienced
> this warning on 4.13:
>
> [94415.028749] WARNING: CPU: 30 PID: 18142 at arch/x86/kvm/mmu.c:717
> mmu_spte_clear_track_bits+0xf0/0x100 [kvm]
> [94415.028753] Modules linked in: rpcsec_gss_krb5 auth_rpcgss
> oid_registry nfsv4 dns_resolver nfs lockd grace sunrpc fscache vhost_net
> vhost tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT
> nf_reject_ipv4 xt_tcpudp tun ebtable_filter ebtables ip6table_filter
> ip6_tables iptable_filter msr ipt_MASQUERADE nf_nat_masquerade_ipv4
> iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
> nf_conntrack zfs(PO) zunicode(PO) uvcvideo intel_rapl zavl(PO)
> nls_iso8859_1 nls_cp437 icp(PO) videobuf2_vmalloc videobuf2_memops
> input_leds led_class videobuf2_v4l2 mousedev joydev videobuf2_core
> x86_pkg_temp_thermal intel_powerclamp videodev hid_logitech_hidpp
> coretemp snd_usb_audio snd_usbmidi_lib crct10dif_pclmul media uas
> zcommon(PO) znvpair(PO) crc32_pclmul hid_logitech_dj drm_kms_helper
> hid_generic ghash_clmulni_intel
> [94415.028791] pcbc aesni_intel drm aes_x86_64 crypto_simd
> snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support snd_hda_codec_generic
> glue_helper mac_hid mxm_wmi evdev cryptd snd_hda_intel syscopyarea
> snd_virtuoso snd_hda_codec sysfillrect snd_oxygen_lib sysimgblt
> intel_cstate snd_mpu401_uart fb_sys_fops snd_hda_core intel_rapl_perf
> snd_rawmidi snd_seq_device igb snd_hwdep snd_pcm snd_timer ptp snd
> pps_core soundcore i2c_algo_bit pcspkr ioatdma i2c_i801 shpchp dca
> lpc_ich bridge tpm_tis tpm_tis_core tpm stp llc acpi_power_meter wmi
> button sch_fq_codel kvm_intel kvm sg ip_tables x_tables usbhid hid
> usb_storage sr_mod cdrom spl(O) vfio_pci irqbypass vfio_virqfd
> vfio_iommu_type1 vfio vfat fat ext4 crc16 mbcache sd_mod jbd2 fscrypto
> dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio dm_raid
> [94415.028832] raid456 libcrc32c crc32c_generic async_raid6_recov
> async_memcpy async_pq async_xor xor async_tx crc32c_intel ahci xhci_pci
> libahci xhci_hcd libata usbcore scsi_mod usb_common raid6_pq dm_mod dax
> raid1 md_mod
> [94415.028844] CPU: 30 PID: 18142 Comm: qemu-system-x86 Tainted: P B
> W O 4.13.0-rc5-g58d4e450a490 #1
> [94415.028848] Hardware name: Supermicro SYS-7038A-I/X10DAI, BIOS 2.0a
> 11/09/2016
> [94415.028852] task: ffff8804946a1d80 task.stack: ffffc9000eedc000
> [94415.028858] RIP: 0010:mmu_spte_clear_track_bits+0xf0/0x100 [kvm]
> [94415.028860] RSP: 0018:ffffc9000eedfb18 EFLAGS: 00010246
> [94415.028863] RAX: 0000000000000000 RBX: 00000002428a6f77 RCX:
> dead0000000000ff
> [94415.028866] RDX: 0000000000000000 RSI: ffff88015d34c670 RDI:
> ffffea00090a2980
> [94415.028868] RBP: ffffc9000eedfb30 R08: 0000000000000101 R09:
> ffff8802eda50218
> [94415.028871] R10: ffff8802eda50008 R11: ffff8802eda50000 R12:
> 00000000002428a6
> [94415.028874] R13: ffff8803f6280000 R14: ffff88015d2dac78 R15:
> 0000000000000000
> [94415.028892] FS: 00007f43a6d7a380(0000) GS:ffff88105d580000(0000)
> knlGS:0000000000000000
> [94415.028896] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [94415.028899] CR2: 0000000000000000 CR3: 0000000495c69000 CR4:
> 00000000003426e0
> [94415.028902] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [94415.028904] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [94415.028907] Call Trace:
> [94415.028913] drop_spte+0x1a/0xb0 [kvm]
> [94415.028920] mmu_page_zap_pte+0x9d/0xe0 [kvm]
> [94415.028927] kvm_mmu_prepare_zap_page+0x65/0x320 [kvm]
> [94415.028934] kvm_mmu_invalidate_zap_all_pages+0x10d/0x160 [kvm]
> [94415.028941] kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10 [kvm]
> [94415.028949] kvm_page_track_flush_slot+0x59/0x80 [kvm]
> [94415.028957] kvm_arch_flush_shadow_memslot+0xe/0x10 [kvm]
> [94415.028982] __kvm_set_memory_region+0x807/0x8d0 [kvm]
> [94415.029027] kvm_set_memory_region+0x2b/0x40 [kvm]
> [94415.029039] kvm_vm_ioctl+0x496/0x860 [kvm]
> [94415.029047] do_vfs_ioctl+0xa5/0x600
> [94415.029053] ? handle_mm_fault+0xde/0x1e0
> [94415.029058] ? __fget+0x6e/0x90
> [94415.029061] SyS_ioctl+0x79/0x90
> [94415.029066] entry_SYSCALL_64_fastpath+0x1a/0xa5
> [94415.029069] RIP: 0033:0x7f439fb678b7
> [94415.029072] RSP: 002b:00007fff950bc298 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [94415.029077] RAX: ffffffffffffffda RBX: 0000000000000500 RCX:
> 00007f439fb678b7
> [94415.029080] RDX: 00007fff950bc330 RSI: 000000004020ae46 RDI:
> 000000000000000e
> [94415.029084] RBP: 00007f439459c190 R08: 0000000000000078 R09:
> 000000000000000c
> [94415.029087] R10: 00007f43a0dfa188 R11: 0000000000000246 R12:
> 00007fff950bc150
> [94415.029090] R13: 0000000000000014 R14: 0000000000000000 R15:
> 0000000000000015
> [94415.029094] Code: 5f 04 00 48 85 c0 75 1c 4c 89 e7 e8 9b 2d fe ff 48
> 8b 05 d4 5f 04 00 48 85 c0 74 be 48 85 c3 0f 95 c3 eb bc 48 85 c3 74 e7
> eb dd <0f> ff eb 9b 4c 89 e7 e8 74 2d fe ff eb a1 66 90 0f 1f 44 00 00
> [94415.029126] ---[ end trace b22f89e13b4d9fc8 ]---
>
> I've received this issue on both 4.12 and 4.13. It appears more
> frequently on 4.13, but also has a smaller overall negative system
> impact. I've put additional information on the kernel bugzilla at
> https://bugzilla.kernel.org/show_bug.cgi?id=196717. I suspect it
> involves the access_dirty flag that was first enabled in 4.12.

Are you using nested virtualization? If you are, it's possible, but if
you are not, it cannot be that patch. You can also check this easily by
loading the kvm_intel module with eptad=0.

Others mentioned transparent huge pages; can you reproduce it without
THP in the host?

Paolo
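
When running the two experiments above, it may help to record which settings were actually in effect for each run. The following is only a minimal sketch, not part of the original thread: it assumes the usual sysfs locations (/sys/module/kvm_intel/parameters/eptad and /sys/kernel/mm/transparent_hugepage/enabled), and the helper names thp_selected, read_or_none and report are hypothetical. Changing the settings is normally done by reloading kvm_intel with eptad=0 (with no guests running) and by writing "never" to the THP "enabled" file as root; this script only reads them back.

```python
from pathlib import Path

# Assumed standard sysfs locations on an x86 host with kvm_intel loaded.
EPTAD_PARAM = Path("/sys/module/kvm_intel/parameters/eptad")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def thp_selected(text):
    """Return the bracketed choice from e.g. 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_or_none(path):
    """Read a sysfs file, or return None if it is absent or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report():
    """Collect the current eptad and THP settings (None if unavailable)."""
    eptad = read_or_none(EPTAD_PARAM)
    thp = read_or_none(THP_ENABLED)
    return {
        "eptad": eptad,
        "thp": thp_selected(thp) if thp is not None else None,
    }


if __name__ == "__main__":
    print(report())
```

Saving this output next to each dmesg capture makes it easier to tell whether the warning still appears with EPT A/D bits off or with THP disabled.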