Re: [PATCH 5/6] KVM: VMX: Always intercept accesses to unsupported "extended" x2APIC regs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 2023-01-07 at 01:10 +0000, Sean Christopherson wrote:
> Don't clear the "read" bits for x2APIC registers above SELF_IPI (APIC regs
> 0x400 - 0xff0, MSRs 0x840 - 0x8ff).  KVM doesn't emulate registers in that
> space (there are a smattering of AMD-only extensions) and so should
> intercept reads in order to inject #GP.  When APICv is fully enabled,
> Intel hardware doesn't validate the registers on RDMSR and instead blindly
> retrieves data from the vAPIC page, i.e. it's software's responsibility to
> intercept reads to non-existent MSRs.
> 
> Fixes: 8d14695f9542 ("x86, apicv: add virtual x2apic support")
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/vmx.c | 38 ++++++++++++++++++++------------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c788aa382611..82c61c16f8f5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4018,26 +4018,17 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type)
>  		vmx_set_msr_bitmap_write(msr_bitmap, msr);
>  }
>  
> -static void vmx_reset_x2apic_msrs(struct kvm_vcpu *vcpu, u8 mode)
> -{
> -	unsigned long *msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
> -	unsigned long read_intercept;
> -	int msr;
> -
> -	read_intercept = (mode & MSR_BITMAP_MODE_X2APIC_APICV) ? 0 : ~0;
> -
> -	for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
> -		unsigned int read_idx = msr / BITS_PER_LONG;
> -		unsigned int write_idx = read_idx + (0x800 / sizeof(long));
> -
> -		msr_bitmap[read_idx] = read_intercept;
> -		msr_bitmap[write_idx] = ~0ul;
> -	}
> -}
> -
>  static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
>  {
> +	/*
> +	 * x2APIC indices for 64-bit accesses into the RDMSR and WRMSR halves
> +	 * of the MSR bitmap.  KVM emulates APIC registers up through 0x3f0,
> +	 * i.e. MSR 0x83f, and so only needs to dynamically manipulate 64 bits.
> +	 */
The above comment is better to be placed down below, near the actual write,
otherwise it is confusing.

> +	const int read_idx = APIC_BASE_MSR / BITS_PER_LONG_LONG;
> +	const int write_idx = read_idx + (0x800 / sizeof(u64));
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	u64 *msr_bitmap = (u64 *)vmx->vmcs01.msr_bitmap;
>  	u8 mode;
>  
>  	if (!cpu_has_vmx_msr_bitmap())
> @@ -4058,7 +4049,18 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
>  
>  	vmx->x2apic_msr_bitmap_mode = mode;
>  
> -	vmx_reset_x2apic_msrs(vcpu, mode);
> +	/*
> +	 * Reset the bitmap for MSRs 0x800 - 0x83f.  Leave AMD's uber-extended
> +	 * registers (0x840 and above) intercepted, KVM doesn't support them.

I don't think AMD calls them uber-extended. Just extended.

>From a quick glance, these could have beeing very useful for VFIO passthrough of INT-X interrupts,
removing the need to mask the interrupt on per PCI device basis - instead you can just leave
the IRQ pending in ISR, while using SEOI and IER to ignore this pending bit for host.

I understand that the days of INT-X are long gone (and especially days of shared IRQ lines...)
and every sane device uses MSI/-X instead, but still.


> +	 * Intercept all writes by default and poke holes as needed.  Pass
> +	 * through all reads by default in x2APIC+APICv mode, as all registers
> +	 * except the current timer count are passed through for read.
> +	 */
> +	if (mode & MSR_BITMAP_MODE_X2APIC_APICV)
> +		msr_bitmap[read_idx] = 0;
> +	else
> +		msr_bitmap[read_idx] = ~0ull;
> +	msr_bitmap[write_idx] = ~0ull;
>  
>  	/*
>  	 * TPR reads and writes can be virtualized even if virtual interrupt

Other than the note about the comment,

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>


Best regards,
	Maxim Levitsky




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux