On 09/25/2018 07:48 AM, Sean Christopherson wrote:
> One defense against L1TF in KVM is to always set the upper five bits
> of the *legal* physical address in the SPTEs for non-present and
> reserved SPTEs, e.g. MMIO SPTEs. In the MMIO case, the GFN of the
> MMIO SPTE may overlap with the upper five bits that are being usurped
> to defend against L1TF. To preserve the GFN, the bits of the GFN that
> overlap with the repurposed bits are shifted left into the reserved
> bits, i.e. the GFN in the SPTE will be split into high and low parts.
> When retrieving the GFN from the MMIO SPTE, e.g. to check for an MMIO
> access, get_mmio_spte_gfn() unshifts the affected bits and restores
> the original GFN for comparison. Unfortunately, get_mmio_spte_gfn()
> neglects to mask off the reserved bits in the SPTE that were used to
> store the upper chunk of the GFN. As a result, KVM fails to detect
> MMIO accesses whose GPA overlaps the repurposed bits, which in turn
> causes guest panics and hangs.
>
> Fix the bug by generating a mask that covers the lower chunk of the
> GFN, i.e. the bits that aren't shifted by the L1TF mitigation. The
> alternative approach would be to explicitly zero the five reserved
> bits that are used to store the upper chunk of the GFN, but that
> requires additional run-time computation and makes an already-ugly
> bit of code even more inscrutable.

Thanks for debugging and fixing this.

>
> Reported-by: Sakari Ailus <sakari.ailus@xxxxxx>
> Fixes: d9b47449c1a1 ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")
> Cc: Junaid Shahid <junaids@xxxxxxxxxx>
> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>  arch/x86/kvm/mmu.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index a282321329b5..be5cb753fb42 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -249,6 +249,17 @@ static u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
>   */
>  static const u64 shadow_nonpresent_or_rsvd_mask_len = 5;
>
> +/*
> + * In some cases, we need to preserve the GFN of a non-present or reserved
> + * SPTE when we usurp the upper five bits of the physical address space to
> + * defend against L1TF, e.g. for MMIO SPTEs.  To preserve the GFN, we'll
> + * shift bits of the GFN that overlap with shadow_nonpresent_or_rsvd_mask
> + * left into the reserved bits, i.e. the GFN in the SPTE will be split into
> + * high and low parts.  This mask covers the lower bits of the GFN.
> + */
> +static u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gpa_mask;
> +
> +
>  static void mmu_spte_set(u64 *sptep, u64 spte);
>  static union kvm_mmu_page_role
>  kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
> @@ -357,9 +368,7 @@ static bool is_mmio_spte(u64 spte)
>
>  static gfn_t get_mmio_spte_gfn(u64 spte)
>  {
> -        u64 mask = generation_mmio_spte_mask(MMIO_GEN_MASK) | shadow_mmio_mask |
> -                   shadow_nonpresent_or_rsvd_mask;
> -        u64 gpa = spte & ~mask;
> +        u64 gpa = spte & shadow_nonpresent_or_rsvd_lower_gpa_mask;
>
>          gpa |= (spte >> shadow_nonpresent_or_rsvd_mask_len)
>                 & shadow_nonpresent_or_rsvd_mask;
> @@ -423,6 +432,8 @@ EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes);
>
>  static void kvm_mmu_reset_all_pte_masks(void)
>  {
> +        u8 low_gpa_bits;
> +
>          shadow_user_mask = 0;
>          shadow_accessed_mask = 0;
>          shadow_dirty_mask = 0;
> @@ -437,12 +448,16 @@ static void kvm_mmu_reset_all_pte_masks(void)
>           * appropriate mask to guard against L1TF attacks.  Otherwise, it is
>           * assumed that the CPU is not vulnerable to L1TF.
>           */
> +        low_gpa_bits = boot_cpu_data.x86_phys_bits;
>          if (boot_cpu_data.x86_phys_bits <
> -            52 - shadow_nonpresent_or_rsvd_mask_len)
> +            52 - shadow_nonpresent_or_rsvd_mask_len) {
>                  shadow_nonpresent_or_rsvd_mask =
>                          rsvd_bits(boot_cpu_data.x86_phys_bits -
>                                    shadow_nonpresent_or_rsvd_mask_len,
>                                    boot_cpu_data.x86_phys_bits - 1);
> +                low_gpa_bits -= shadow_nonpresent_or_rsvd_mask_len;
> +        }
> +        shadow_nonpresent_or_rsvd_lower_gpa_mask = (1ULL << low_gpa_bits) - 1;

I think that it might be slightly better to do something like:

+        shadow_nonpresent_or_rsvd_lower_gpa_mask =
+                rsvd_bits(PAGE_SHIFT, low_gpa_bits - 1);

Of course, it doesn't matter for get_mmio_spte_gfn() because that already
shifts by PAGE_SHIFT, but could matter if this were to get used somewhere
else.

>  }
>
>  static int is_cpuid_PSE36(void)
>

Reviewed-by: Junaid Shahid <junaids@xxxxxxxxxx>
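P.S. For anyone who wants to see the split/restore trick in isolation, here is a
minimal standalone userspace sketch (not KVM code). It assumes x86_phys_bits == 46,
and mark_mmio() models only the GPA-stashing part of KVM's mark_mmio_spte(); the
generation, access and shadow_mmio_value bits are omitted, and all names and
constants below are illustrative.

/*
 * Illustration of splitting a GPA around the L1TF reserved bits and
 * restoring it, assuming x86_phys_bits == 46 and a 5-bit reserved region.
 * Build with: gcc -o mmio_gfn_demo mmio_gfn_demo.c
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT      12
#define PHYS_BITS       46      /* assumed boot_cpu_data.x86_phys_bits */
#define RSVD_LEN        5       /* shadow_nonpresent_or_rsvd_mask_len */

/* Bits [PHYS_BITS - RSVD_LEN, PHYS_BITS - 1], i.e. bits 41..45. */
static const uint64_t rsvd_mask =
        ((1ULL << RSVD_LEN) - 1) << (PHYS_BITS - RSVD_LEN);
/* Bits 0..40, the GPA bits that stay in place in the SPTE. */
static const uint64_t lower_gpa_mask = (1ULL << (PHYS_BITS - RSVD_LEN)) - 1;

/*
 * Stash a GPA in a "SPTE": the chunk of the GPA that collides with the
 * L1TF reserved bits is shifted left into the bits above PHYS_BITS, and
 * the reserved bits themselves are forced to 1.
 */
static uint64_t mark_mmio(uint64_t gpa)
{
        uint64_t spte = gpa | rsvd_mask;

        spte |= (gpa & rsvd_mask) << RSVD_LEN;
        return spte;
}

/* Fixed-style recovery: keep only the low chunk, then unshift the high chunk. */
static uint64_t get_gfn(uint64_t spte)
{
        uint64_t gpa = spte & lower_gpa_mask;

        gpa |= (spte >> RSVD_LEN) & rsvd_mask;
        return gpa >> PAGE_SHIFT;
}

int main(void)
{
        /* A GPA whose GFN overlaps the reserved bits, e.g. bit 42 set. */
        uint64_t gpa = (1ULL << 42) | 0x1234000ULL;
        uint64_t spte = mark_mmio(gpa);

        printf("gpa      = 0x%llx\n", (unsigned long long)gpa);
        printf("spte     = 0x%llx\n", (unsigned long long)spte);
        printf("gfn back = 0x%llx (expect 0x%llx)\n",
               (unsigned long long)get_gfn(spte),
               (unsigned long long)(gpa >> PAGE_SHIFT));
        return 0;
}

Running it prints the same GFN that was stored; if get_gfn() skipped the
"& lower_gpa_mask" step, the stashed copy of the high GPA bits and the
always-set reserved bits would survive into the result, which mirrors the
stale-high-bits failure mode the patch fixes.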