On Mon, 2019-12-23 at 16:48 +0200, Liran Alon wrote: > > On 20 Dec 2019, at 21:26, John Andersen <john.s.andersen@xxxxxxxxx> > > wrote: > > > > Pinning is not active when running in SMM. Entering SMM disables > > pinned > > bits, writes to control registers within SMM would therefore > > trigger > > general protection faults if pinning was enforced. > > For compatibility reasons, it’s reasonable that pinning won’t be > active when running in SMM. > However, I do think we should not allow vSMM code to change pinned > values when returning back from SMM. > This would prevent a vulnerable vSMI handler from modifying vSMM > state-area to modify CR4 when running outside of vSMM. > I believe in this case it’s legit to just forcibly restore original > CR0/CR4 pinned values. Ignoring vSMM changes. > In em_rsm could we just OR with the value of the PINNED MSRs right before the final return? > > The guest may never read pinned bits. If an attacker were to read > > the > > CR pinned MSRs, they might decide to preform another attack which > > would > > not cause a general protection fault. > > I disagree with this statement. > An attacker knows what is the system it is attacking and can deduce > by that which bits it pinned… > Therefore, protecting from guest reading these is not important at > all. > Sure, I'll make it readable. > > Should userspace expose the CR pining CPUID feature bit, it must > > zero CR > > pinned MSRs on reboot. If it does not, it runs the risk of having > > the > > guest enable pinning and subsequently cause general protection > > faults on > > next boot due to early boot code setting control registers to > > values > > which do not contain the pinned bits. > > Why reset CR pinned MSRs by userspace instead of KVM INIT handling? > > > When running with KVM guest support and paravirtualized CR pinning > > enabled, paravirtualized and existing pinning are setup at the same > > point on the boot CPU. Non-boot CPUs setup pinning upon > > identification. > > > > Guests using the kexec system call currently do not support > > paravirtualized control register pinning. This is due to early boot > > code writing known good values to control registers, these values > > do > > not contain the protected bits. This is due to CPU feature > > identification being done at a later time, when the kernel properly > > checks if it can enable protections. > > > > Most distributions enable kexec. However, kexec could be made boot > > time > > disableable. In this case if a user has disabled kexec at boot time > > the guest will request that paravirtualized control register > > pinning > > be enabled. This would expand the userbase to users of major > > distributions. > > > > Paravirtualized CR pinning will likely be incompatible with kexec > > for > > the foreseeable future. Early boot code could possibly be changed > > to > > not clear protected bits. However, a kernel that requests CR bits > > be > > pinned can't know if the kernel it's kexecing has been updated to > > not > > clear protected bits. This would result in the kernel being kexec'd > > almost immediately receiving a general protection fault. > > Instead of disabling kexec entirely, I think it makes more sense to > invent > some generic mechanism in which new kernel can describe to old kernel > a set of flags that specifies which features hand-over it supports. > One of them > being pinned CRs. > > For example, isn’t this also relevant for IOMMU DMA protection? > i.e. Doesn’t old kernel need to know if it should disable or enable > IOMMU DMAR > before kexec to new kernel? Similar to EDK2 IOMMU DMA protection > hand-over? Great idea. Making kexec work will require changes to these files and maybe more: arch/x86/boot/compressed/head_64.S arch/x86/kernel/head_64.S arch/x86/kernel/relocate_kernel_64.S Which my previous attempts showed different results when running virtualized vs. unvirtualized. Specificity different behavior with SMAP and UMIP bits. This would be a longer process though. As validating that everything still works in both the VM and on physical hosts will be required. As it stands this patchset could pick up a fairly large userbase via the virtualized container projects. Should we pursue kexec in this patchset or a later one? Thanks, John