On Wed, Jan 6, 2021 at 1:26 AM Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx> wrote: > > Thanks for looking at it Ben. > > On 06.01.2021 00:38, Ben Gardon wrote: > (..) > > > > +Sean Christopherson, for whom I used a stale email address. > > . > > I tested this series by running kvm-unit-tests on an Intel Skylake > > machine. It did not introduce any new failures. I also ran the > > set_memory_region_test > > It's "memslot_move_test" that is crashing the kernel - a memslot > move test based on "set_memory_region_test". I apologize if I'm being very dense, but I can't find this test anywhere. Is this something you have in-house but haven't upstreamed or just the test_move_memory_region(); testcase from set_memory_region_test? I have a similar memslot-moving-stress-test in the pipeline I need to send out, but I didn't think such a test existed yet and my test hadn't caught this issue. > > >, but was unable to reproduce Maciej's problem. > > Maciej, if you'd be willing to confirm this series solves the problem > > you observed, or provide more details on the setup in which you > > observed it, I'd appreciate it. > > > > I've applied your patches and now are getting a slightly > different backtrace for the same test: > [ 534.768212] general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP PTI > [ 534.887969] CPU: 97 PID: 4651 Comm: memslot_move_te Not tainted 5.11.0-rc2+ #81 > [ 534.975465] Hardware name: Oracle Corporation ORACLE SERVER X7-2c/SERVER MODULE ASSY, , BIOS 46070300 12/20/2019 > [ 535.097288] RIP: 0010:kvm_tdp_mmu_zap_gfn_range+0x70/0xb0 [kvm] > [ 535.168199] Code: b8 01 00 00 00 4c 89 f1 41 89 45 50 4c 89 ee 48 89 df e8 a3 f3 ff ff 41 09 c4 41 83 6d 50 01 74 13 4d 8b 6d 00 4d 39 fd 74 1e <41> 8b 45 50 85 c0 75 c6 0f 0b 4c 89 ee 48 89 df e8 0b fc ff ff 4d > [ 535.393005] RSP: 0018:ffffbded19083b90 EFLAGS: 00010297 > [ 535.455533] RAX: 0000000000000001 RBX: ffffbded1a27d000 RCX: 000000008030000e > [ 535.540945] RDX: 000000008030000f RSI: ffffffffc0ad5453 RDI: ffff9cd72a00d300 > [ 535.626358] RBP: ffffbded19083bc0 R08: 0000000000000001 R09: ffffffffc0ad5400 > [ 535.711769] R10: ffff9d370acf31b8 R11: 0000000000000001 R12: 0000000000000001 > [ 535.797181] R13: dead000000000100 R14: 0000000400000000 R15: ffffbded1a292418 > [ 535.882590] FS: 00007ff50312e740(0000) GS:ffff9d947fb40000(0000) knlGS:0000000000000000 > [ 535.979443] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 536.048211] CR2: 0000000001e02fe0 CR3: 00000060a78e8003 CR4: 00000000007726e0 > [ 536.133628] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 536.219043] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 536.304452] PKRU: 55555554 > [ 536.336813] Call Trace: > [ 536.366057] kvm_tdp_mmu_zap_all+0x26/0x40 [kvm] > [ 536.421357] kvm_mmu_zap_all_fast+0x167/0x180 [kvm] > [ 536.479767] kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10 [kvm] > [ 536.554817] kvm_page_track_flush_slot+0x5a/0x90 [kvm] > [ 536.616344] kvm_arch_flush_shadow_memslot+0xe/0x10 [kvm] > [ 536.680986] kvm_set_memslot+0x18f/0x690 [kvm] > [ 536.734186] __kvm_set_memory_region+0x41f/0x580 [kvm] > [ 536.795705] kvm_set_memory_region+0x2b/0x40 [kvm] > [ 536.853062] kvm_vm_ioctl+0x216/0x1060 [kvm] > [ 536.904182] ? irqtime_account_irq+0x40/0xc0 > [ 536.955270] ? irq_exit_rcu+0x55/0xf0 > [ 536.999079] ? sysvec_apic_timer_interrupt+0x45/0x90 > [ 537.058485] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 > [ 537.122058] ? __audit_syscall_entry+0xdd/0x130 > [ 537.176267] __x64_sys_ioctl+0x92/0xd0 > [ 537.221114] do_syscall_64+0x37/0x50 > [ 537.263878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 537.324324] RIP: 0033:0x7ff502a27307 > [ 537.367882] Code: 44 00 00 48 8b 05 69 1b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 1b 2d 00 f7 d8 64 89 01 48 > [ 537.594221] RSP: 002b:00007fffde6b2d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > [ 537.685616] RAX: ffffffffffffffda RBX: 0000000001de8000 RCX: 00007ff502a27307 > [ 537.771797] RDX: 0000000001e02fe0 RSI: 000000004020ae46 RDI: 0000000000000004 > [ 537.857967] RBP: 00000000000001fc R08: 00007fffde74b090 R09: 000000000005af86 > [ 537.944110] R10: 000000000005af86 R11: 0000000000000246 R12: 0000000050000000 > [ 538.030236] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000000001fb > [ 538.116345] Modules linked in: kvm_intel kvm xt_comment xt_owner ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_umad iw_cxgb4 rdma_cm iw_cm ib_cm intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 drm_kms_helper iTCO_wdt bnxt_re cec iTCO_vendor_support drm ib_uverbs syscopyarea sysfillrect ib_core sg irqbypass sysimgblt pcspkr ioatdma i2c_i801 fb_sys_fops joydev lpc_ich intel_pch_thermal i2c_smbus i2c_algo_bit dca ip_tables vfat fat xfs sd_mod t10_pi be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi > [ 538.116423] libcxgb qla4xxx iscsi_boot_sysfs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper bnxt_en wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi [last unloaded: kvm] > [ 539.450863] ---[ end trace 7c17f445a2093145 ]--- > [ 539.623473] RIP: 0010:kvm_tdp_mmu_zap_gfn_range+0x70/0xb0 [kvm] > [ 539.695136] Code: b8 01 00 00 00 4c 89 f1 41 89 45 50 4c 89 ee 48 89 df e8 a3 f3 ff ff 41 09 c4 41 83 6d 50 01 74 13 4d 8b 6d 00 4d 39 fd 74 1e <41> 8b 45 50 85 c0 75 c6 0f 0b 4c 89 ee 48 89 df e8 0b fc ff ff 4d > [ 539.921479] RSP: 0018:ffffbded19083b90 EFLAGS: 00010297 > [ 539.984788] RAX: 0000000000000001 RBX: ffffbded1a27d000 RCX: 000000008030000e > [ 540.070982] RDX: 000000008030000f RSI: ffffffffc0ad5453 RDI: ffff9cd72a00d300 > [ 540.157173] RBP: ffffbded19083bc0 R08: 0000000000000001 R09: ffffffffc0ad5400 > [ 540.243372] R10: ffff9d370acf31b8 R11: 0000000000000001 R12: 0000000000000001 > [ 540.329567] R13: dead000000000100 R14: 0000000400000000 R15: ffffbded1a292418 > [ 540.415772] FS: 00007ff50312e740(0000) GS:ffff9d947fb40000(0000) knlGS:0000000000000000 > [ 540.513427] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 540.583005] CR2: 0000000001e02fe0 CR3: 00000060a78e8003 CR4: 00000000007726e0 > [ 540.669228] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 540.755448] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 540.841659] PKRU: 55555554 > [ 540.874826] Kernel panic - not syncing: Fatal exception > [ 540.938269] Kernel Offset: 0xe200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) > [ 542.097054] ---[ end Kernel panic - not syncing: Fatal exception ]--- > > The code that is crashing is: > # arch/x86/kvm/mmu/mmu_internal.h:100: BUG_ON(!sp->root_count); > movl 80(%r13), %eax # MEM[(int *)__mptr_14 + 80B], _17 <- here > testl %eax, %eax # _17 > jne .L421 #, > > So it looks like it now crashes in the same BUG_ON() but when trying to > deference the "dead" sp pointer instead. Hmm thanks for testing the patches, I'll take another try at reproducing the issue and amend the commits. > > It's bad that you can't reproduce the issue, however, as this would > probably make the root causing process much more effective. > Are you testing on bare metal like me or while running nested? I'm running on bare metal too. > > My test machine has Xeon Platinum 8167M CPU, so it's a Skylake, too. > It has 754G RAM + 8G swap, running just the test program. > > I've uploaded the kernel that I've used for testing here: > https://github.com/maciejsszmigiero/linux/tree/tdp_mmu_bug > > It is basically a 5.11.0-rc2 kernel with > "KVM: x86/mmu: Bug fixes and cleanups in get_mmio_spte()" series and > your fixes applied on top of it. > > In addition to that, I've updated > https://gist.github.com/maciejsszmigiero/890218151c242d99f63ea0825334c6c0 > with the kernel .config file that was used. > > The compiler that I've used to compile the test kernel was: > "gcc version 8.3.1 20190311 (Red Hat 8.3.1-3.2.0.1)" Thank you for the details, hopefully those can shed some light on why I wasn't able to reproduce the issue. > > Thanks, > Maciej