On 1/7/20 5:31 PM, Sean Christopherson wrote:
> On Tue, Jan 07, 2020 at 04:54:34PM -0600, Tom Lendacky wrote:
>> On 1/7/20 4:28 PM, Sean Christopherson wrote:
>>> On Tue, Jan 07, 2020 at 02:16:37PM -0600, Tom Lendacky wrote:
>>>> On 1/6/20 5:38 PM, Sean Christopherson wrote:
>>>>> On Mon, Jan 06, 2020 at 05:14:04PM -0600, Tom Lendacky wrote:
>>>>>> On 1/6/20 4:49 PM, Sean Christopherson wrote:
>>>>>>> This doesn't handle the case where x86_phys_bits _isn't_ reduced by
>>>>>>> SME/SEV on a future processor, i.e. x86_phys_bits==52.
>>>>>>
>>>>>> Not sure I follow. If MSR_K8_SYSCFG_MEM_ENCRYPT is set then there will
>>>>>> always be a reduction in physical addressing (so I'm told).
>>>>>
>>>>> Hmm, I'm going off APM Vol 2, which states, or at least strongly implies,
>>>>> that reducing the PA space is optional. Section 7.10.2 is especially
>>>>> clear on this:
>>>>>
>>>>>   In implementations where the physical address size of the processor is
>>>>>   reduced when memory encryption features are enabled, software must
>>>>>   ensure it is executing from addresses where these upper physical
>>>>>   address bits are 0 prior to setting SYSCFG[MemEncryptionModEn].
>>>>
>>>> It's probably not likely, but given what is stated, I can modify my patch
>>>> to check for x86_phys_bits == 52 and skip the call to set the mask, e.g.:
>>>>
>>>>   if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT &&
>>>>       boot_cpu_data.x86_phys_bits < 52) {
>>>>
>>>>>
>>>>> But, hopefully the other approach I have in mind actually works, as it's
>>>>> significantly less special-case code and would naturally handle either
>>>>> case, i.e. make this a moot point.
>>>>
>>>> I'll hold off on the above and wait for your patch.
>>>
>>> Sorry for the delay, this is a bigger mess than originally thought. Or
>>> I'm completely misunderstanding the issue, which is also a distinct
>>> possibility :-)
>>>
>>> Due to KVM activating its L1TF mitigation irrespective of whether the CPU
>>> is whitelisted as not being vulnerable to L1TF, simply using x86_phys_bits
>>> to avoid colliding with the C-bit isn't sufficient, as the L1TF mitigation
>>> uses those first five reserved PA bits to store the MMIO GFN. Setting
>>> BIT(x86_phys_bits) for all MMIO sptes would cause it to be interpreted as
>>> a GFN bit when the L1TF mitigation is active and lead to bogus MMIO.
>>
>> The L1TF mitigation only gets applied when:
>>
>>   boot_cpu_data.x86_cache_bits < 52 - shadow_nonpresent_or_rsvd_mask_len
>>
>> and with shadow_nonpresent_or_rsvd_mask_len = 5, that means
>> boot_cpu_data.x86_cache_bits < 47.
>>
>> On AMD processors that support memory encryption, the x86_cache_bits value
>> is not adjusted, just the x86_phys_bits. So for AMD processors that have
>> memory encryption support, this value will be at least 48 and therefore
>> will not activate the L1TF mitigation.
>
> Ah. Hrm. I'd prefer to clean that code up to make the interactions more
> explicit, but maybe we can separate that out.
>
>>> The only sane approach I can think of is to activate the L1TF mitigation
>>> based on whether the CPU is vulnerable to L1TF, as opposed to activating
>>> the mitigation purely based on the max PA of the CPU. Since all CPUs that
>>> support SME/SEV are whitelisted as NO_L1TF, the L1TF mitigation and C-bit
>>> should never be active at the same time.
>>
>> There is still the issue of setting a single bit that can conflict with
>> the C-bit. As it is today, if the C-bit were to be defined as bit 51, then
>> KVM would not take a nested page fault and MMIO would be broken.
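(For reference, the gate quoted above can be sketched as a standalone
helper. This is only a paraphrase of the check in
kvm_mmu_reset_all_pte_masks(); the helper name is hypothetical and does
not exist in the kernel as such.)

  /*
   * Hypothetical helper capturing the gate quoted above: the L1TF
   * mitigation stashes the MMIO GFN in the
   * shadow_nonpresent_or_rsvd_mask_len (5) PA bits just below bit 52,
   * so it is only usable when those bits sit above the CPU's cacheable
   * PA width.
   */
  static bool l1tf_spte_mitigation_usable(void)
  {
          /*
           * SME/SEV-capable parts keep x86_cache_bits at 48 (only
           * x86_phys_bits is reduced), so with a mask length of 5 this
           * evaluates 48 < 47, i.e. false, and the mitigation never
           * coexists with the C-bit.
           */
          return boot_cpu_data.x86_cache_bits <
                 52 - shadow_nonpresent_or_rsvd_mask_len;
  }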
>
> Wouldn't Paolo's patch to use the raw "cpuid_eax(0x80000008) & 0xff" for
> shadow_phys_bits fix that particular collision by causing
> kvm_set_mmio_spte_mask() to clear the present bit? Or am I
> misunderstanding how the PA reduction interacts with the C-bit?
>
> AIUI, using phys_bits=48, the standard scenario is C-bit=47 and some
> additional bits 46:M are reserved. Applying that logic to phys_bits=52,
> then C-bit=51 and bits 50:M are reserved, so there's a collision but it's

There's no requirement that the C-bit correspond to phys_bits. So, for
example, you can have C-bit=51 and phys_bits=48, in which case 47:M are
reserved.

Thanks,
Tom

> mostly benign because shadow_phys_bits==52, which triggers this:
>
>   if (IS_ENABLED(CONFIG_X86_64) && shadow_phys_bits == 52)
>           mask &= ~1ull;
>
> In other words, Paolo's patch fixes the fatal bug, but unnecessarily
> disables optimized MMIO page faults. To remedy that, your idea is to rely
> on the (undocumented?) behavior that there are always additional reserved
> bits between the C-bit and the reduced x86_phys_bits.
>
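(For reference, a minimal sketch of the mask computation under
discussion, modeled on kvm_set_mmio_spte_mask() in arch/x86/kvm/x86.c of
that era; shadow_phys_bits is assumed to hold the raw
"cpuid_eax(0x80000008) & 0xff" value from Paolo's patch.)

  u64 mask;

  /*
   * Set bit 51 plus the present bit so that an MMIO access faults with
   * PFERR.RSVD=1 -- which is only valid if bit 51 is truly reserved.
   */
  mask = BIT_ULL(51) | 1ull;

  /*
   * If shadow_phys_bits == 52, bit 51 is a legal PA bit (e.g. a C-bit
   * at position 51 on an SME/SEV part), so clear the present bit:
   * optimized MMIO page faults are disabled rather than risk creating
   * a valid, possibly encrypted, mapping.
   */
  if (IS_ENABLED(CONFIG_X86_64) && shadow_phys_bits == 52)
          mask &= ~1ull;

  kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK);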