On 20/12/19 20:26, John Andersen wrote:
> Paravirtualized CR pinning will likely be incompatible with kexec for
> the foreseeable future. Early boot code could possibly be changed to
> not clear protected bits. However, a kernel that requests CR bits be
> pinned can't know if the kernel it's kexecing has been updated to not
> clear protected bits. This would result in the kernel being kexec'd
> almost immediately receiving a general protection fault.
>
> Security conscious kernel configurations disable kexec already, per KSPP
> guidelines. Projects such as Kata Containers, AWS Lambda, ChromeOS
> Termina, and others using KVM to virtualize Linux will benefit from
> this protection.
>
> The usage of SMM in SeaBIOS was explored as a way to communicate to KVM
> that a reboot has occurred and it should zero the pinned bits. When
> using QEMU and SeaBIOS, SMM initialization occurs on reboot. However,
> prior to SMM initialization, BIOS writes zero values to CR0, causing a
> general protection fault to be sent to the guest before SMM can signal
> that the machine has booted.

SMM is optional; I think it makes sense to leave it to userspace to reset
pinning (including for the case of triple faults), while INIT which is
handled within KVM would keep it active.

> Pinning of sensitive CR bits has already been implemented to protect
> against exploits directly calling native_write_cr*(). The current
> protection cannot stop ROP attacks which jump directly to a MOV CR
> instruction. Guests running with paravirtualized CR pinning are now
> protected against the use of ROP to disable CR bits. The same bits that
> are being pinned natively may be pinned via the CR pinned MSRs. These
> bits are WP in CR0, and SMEP, SMAP, and UMIP in CR4.
>
> Future patches could protect bits in MSRs in a similar fashion. The NXE
> bit of the EFER MSR is a prime candidate.

Please include patches for either kvm-unit-tests or
tools/testing/selftests/kvm that test the functionality.

Paolo
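
For reference, the behaviour discussed above (a CR write that clears a
pinned bit faults, and a userspace-driven reset clears the pinning) could
be sketched roughly as below. This is only an illustration under stated
assumptions, not code from the series: it assumes "pinned" means the bit
must stay set, and the names cr_pin_state, cr_write_violates_pin() and
cr_pin_reset() are hypothetical. Only the bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions of the bits named in the cover letter. */
#define X86_CR0_WP   (1ULL << 16)
#define X86_CR4_UMIP (1ULL << 11)
#define X86_CR4_SMEP (1ULL << 20)
#define X86_CR4_SMAP (1ULL << 21)

/* Hypothetical per-vCPU pinning state; not the layout used by the patches. */
struct cr_pin_state {
	uint64_t cr0_pinned;	/* CR0 bits the guest asked to keep set */
	uint64_t cr4_pinned;	/* CR4 bits the guest asked to keep set */
};

/*
 * Assumption: a pinned bit must remain set. Returns true when new_val
 * would clear at least one pinned bit, i.e. when the write should be
 * refused with a general protection fault.
 */
static bool cr_write_violates_pin(uint64_t pinned, uint64_t new_val)
{
	return (pinned & ~new_val) != 0;
}

/*
 * Hypothetical reset hook, meant to model userspace clearing the pinning
 * on a full reset (or triple fault). An INIT would simply not call it.
 */
static void cr_pin_reset(struct cr_pin_state *s)
{
	s->cr0_pinned = 0;
	s->cr4_pinned = 0;
}
```

With CR0.WP pinned, a firmware write of zero to CR0 makes
cr_write_violates_pin() return true, which is the SeaBIOS failure mode
described in the quoted text. After cr_pin_reset() the same write passes.
A kvm-unit-tests or selftests case would presumably exercise both paths.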