https://bugzilla.kernel.org/show_bug.cgi?id=219009

            Bug ID: 219009
           Summary: Random host reboots on Ryzen 7000/8000 using nested
                    VMs (vls suspected)
           Product: Virtualization
           Version: unspecified
          Hardware: AMD
                OS: Linux
            Status: NEW
          Severity: high
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
          Reporter: zaltys@xxxxxxxxx
        Regression: No

Running nested VMs on AMD Ryzen 7000/8000 (Zen 4) CPUs results in random
host reboots. There is no kernel panic, no log entries, and no relevant
output on the serial console. It is as if the platform is simply hard
reset.

The time to reproduce seems to vary from system to system and may depend
on the workload and even the specific CPU model. I can reproduce it in
under one hour with kernel 6.9.7 and QEMU 9.0 on a Ryzen 7950X3D, using
either KVM -> Windows 10/11 with Hyper-V services on, or KVM -> Windows
10/11 with 3 VBox VMs (also Win11) running. Other people have reproduced
it repeatedly on Ryzen 7700, 7600 and 8700GE, including with
KVM -> KVM -> Linux. [1] I have also seen customers of Hetzner (a company
offering Ryzen-based dedicated servers) complaining about similar random
reboots.

I tried looking up errata for Ryzen 7000/8000 but could not find any
published, so I checked the errata for EPYC 9004 [2], which is the same
Zen 4 architecture as Ryzen 7000/8000. It lists nesting-related erratum
#1495 (on page 49), which mentions that using Virtualized VMLOAD/VMSAVE
can result in an MCE and/or system reset.

Based on that erratum, I reconfigured my system with kvm_amd.vls=0, and
for me the random reboots with nested virtualization stopped. The same
was reported by several people in [1].

Somebody from AMD needs to be asked to confirm whether this really is a
Ryzen 7000/8000 hardware bug, and whether there is a better fix than
disabling VLS, since disabling it has a performance hit. If disabling it
is the only fix, then kvm_amd.vls=0 must be the default for Ryzen
7000/8000.
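
For anyone who wants to check whether a host is currently running in the
configuration described above (nested virtualization on, VLS on), here is
a minimal sketch in Python. It assumes the kvm_amd module exposes its
`nested` and `vls` parameters under /sys/module/kvm_amd/parameters/, which
is the usual sysfs layout for module parameters; the function names and
the status strings are made up for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the kvm_amd module parameters.
PARAM_DIR = Path("/sys/module/kvm_amd/parameters")


def read_param(name, param_dir=PARAM_DIR):
    """Return True/False for a boolean-like parameter, or None if unreadable."""
    try:
        raw = (Path(param_dir) / name).read_text().strip()
    except OSError:
        return None  # module not loaded, or parameter not exposed
    # Integer parameters read back as 0/1, bool parameters as N/Y.
    return raw in ("1", "Y", "y")


def vls_status(param_dir=PARAM_DIR):
    """Classify the host (hypothetical labels, not kernel terminology)."""
    nested = read_param("nested", param_dir)
    vls = read_param("vls", param_dir)
    if nested is None or vls is None:
        return "unknown"
    if nested and vls:
        return "nested+vls"  # the setup in which the reboots were observed
    if not vls:
        return "workaround"  # VLS disabled, e.g. via kvm_amd.vls=0
    return "nested-off"      # VLS on, but nested virtualization disabled


if __name__ == "__main__":
    print(vls_status())
```

Besides kvm_amd.vls=0 on the kernel command line, the same setting can be
made persistent with a line `options kvm_amd vls=0` in a file under
/etc/modprobe.d/, followed by reloading the module or rebooting.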

[1] https://www.reddit.com/r/Proxmox/comments/1cym3pl/nested_virtualization_crashing_ryzen_7000_series/
[2] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf