Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Similar to comments on an earlier patch, don't try to describe the literal code
change, e.g. this does far more than just "Add new ops in kvm_x86_ops".  Something
like

  KVM: VMX: Add emulation of LASS violation checks on linear address 

On Thu, Jun 01, 2023, Zeng Guang wrote:
> Intel introduces LASS (Linear Address Separation) feature providing
> an independent mechanism to achieve the mode-based protection.
> 
> LASS partitions 64-bit linear address space into two halves, user-mode
> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> stops any code execution or conditional data access[1]
>     1. from user mode to supervisor-mode address space
>     2. from supervisor mode to user-mode address space
> and generates LASS violation fault accordingly.
> 
> [1]A supervisor mode data access causes a LASS violation only if supervisor
> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> or the access implicitly accesses a system data structure.
> 
> Following are the rules of LASS violation check on the linear address(LA).
> User access to supervisor-mode address space:
>     LA[bit 63] && (CPL == 3)
> Supervisor access to user-mode address space:
>     Instruction fetch: !LA[bit 63] && (CPL < 3)
>     Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                  CPL < 3) || Implicit supervisor access)
> 
> Add new ops in kvm_x86_ops to do LASS violation check.
> 
> Signed-off-by: Zeng Guang <guang.zeng@xxxxxxxxx>
> Tested-by: Xuelian Guo <xuelian.guo@xxxxxxxxx>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>  arch/x86/include/asm/kvm_host.h    |  2 ++
>  arch/x86/kvm/kvm_emulate.h         |  1 +
>  arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h             |  2 ++
>  5 files changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 13bc212cd4bc..8980a3bfa687 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>  KVM_X86_OP(msr_filter_changed)
>  KVM_X86_OP(complete_emulated_msr)
>  KVM_X86_OP(vcpu_deliver_sipi_vector)
> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> +KVM_X86_OP_OPTIONAL_RET0(check_lass)

Use "is_lass_violation" instead of "check_lass" for all function names.  "check"
doesn't convey that the function is a predicate, i.e. that it returns true/false.

>  #undef KVM_X86_OP
>  #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 92d8e65fe88c..98666d1e7727 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>  	 * Returns vCPU specific APICv inhibit reasons
>  	 */
>  	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>  };
>  
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 5b9ec610b2cb..f1439ab7c14b 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>  /* x86-specific emulation flags */
>  #define X86EMUL_F_FETCH			BIT(0)
>  #define X86EMUL_F_WRITE			BIT(1)
> +#define X86EMUL_F_SKIPLASS		BIT(2)

This belongs in the patch that integrates LASS into the emulator.  And rather than
make the flag a command (SKIPLASS), I think it makes sense to make the flag describe
the access.  It'll mean more flags, but those are free.  That way the originators of
the accesses, e.g. em_invlpg(), don't need to document how they interact with LASS,
e.g. this code is self-documenting, and if someone wants to understand why KVM
has a dedicated X86EMUL_F_INVLPG flag, it's easy enough to find the consumer.
And in the unlikely scenario that other things care about INVLPG, branch targets,
etc., we won't end up with X86EMUL_F_SKIPLASS, X86EMUL_F_SKIPOTHER, etc.

	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1,
			 X86EMUL_F_INVLPG, ctxt->mode, &linear);

So this?

  #define X86EMUL_F_IMPLICIT
  #define X86EMUL_F_INVLPG
  #define X86EMUL_F_BRANCH_TARGET

>  struct x86_emulate_ops {
>  	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a33205ded85c..876997e8448e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>  }
>  
> +/*
> + * Determine whether an access to the linear address causes a LASS violation.
> + * LASS protection is only effective in long mode. As a prerequisite, caller
> + * should make sure vCPU running in long mode and invoke this api to do LASS
> + * violation check.
> + */
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> +{
> +	bool user_mode, user_as, rflags_ac;
> +
> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> +		return false;
> +
> +	WARN_ON_ONCE(!is_long_mode(vcpu));

This is silly.  By making this a preqreq, 2 of the 3 callers are forced to explicitly
check for is_long_mode(), *and* it unnecessarily bleeds LASS details outside of VMX.

> +
> +	user_as = !(la >> 63);
> +
> +	/*
> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> +	 * accesses a system data structure. For implicit accesses to system
> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> +	 */
> +	if (access & PFERR_IMPLICIT_ACCESS) {

Please don't use PFERR_IMPLICIT_ACCESS, just extend the new flags.  This is
obviously not coming from a page fault.  PFERR_IMPLICIT_ACCESS really shouldn't
exist, but at least there was reasonable justification for adding it (changing
all of the paths that lead to permission_fault() would have require a massive
overhaul).

***HOWEVER***

I think the entire approach of hooking __linearize() may be a mistake, and LASS
should instead be implemented in a wrapper of ->gva_to_gpa().  The two calls to
__linearize() that are escaped with SKIPLASS are escaped *because* they don't
actually access memory (branch targets and INVLPG), and so if LASS is enforced
only when ->gva_to_gpa() is invoked, all of these new flags go away because the
cases that ignore LASS are naturally handled.

That should also make it unnecessary to add one-off checks since e.g. kvm_handle_invpcid()
will hit kvm_read_guest_virt() and thus ->gva_to_gpa(), i.e. we won't end up playing
an ongoing game of whack-a-mole.

And one question that needs to be answered is what happens when an access rolls
over from supervisor to user, e.g. if the kernel access 8 bytes at -1ull and thus
reads all Fs => 0x6, does the access get a LASS violation on the access to user
memory.  User=>supervisor can't happen because non-canonical checks have higher
priority, but supervisor=>user can.  And that matters because it will impact
whether or not KVM needs to check each *page* or just the initial address.

> +		user_mode = false;
> +		rflags_ac = false;
> +	} else {
> +		user_mode = vmx_get_cpl(vcpu) == 3;
> +		if (!user_mode)
> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +	}
> +
> +	if (user_mode == user_as)
> +		return false;
> +
> +	/*
> +	 * Supervisor-mode _data_ accesses to user address space
> +	 * cause LASS violations only if SMAP is enabled.
> +	 */
> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;

This is all more complex than it needs to be.  Using local bools just so that
the "user_mode == user_as" is common is not a good tradeoff.  User addresses have
*significantly* different behavior, lean into that instead of dancing around it.

bool is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
		       unsigned int flags)
{
	const bool is_supervisor_access = addr & BIT_ULL(63);
	const bool implicit = flags & X86EMUL_F_IMPLICIT;

	bool user_mode, user_as, rflags_ac;

	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
		return false;

	/*
	 * INVLPG isn't subject to LASS, e.g. to allow invalidating userspace
	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
	 * to LASS in order to simplifiy far control transfers (the subsequent
	 * fetch will enforce LASS as appropriate).
	 */
	if (flags & (X86EMUL_F_BRANCH_TARGET | X86EMUL_F_INVLPG))
		return false;

	if (!implicit && vmx_get_cpl(vcpu) == 3)
		return is_supervisor_address;

	/* LASS is enforced for supervisor access iff SMAP is enabled. */
	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
		return false;

	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
	if (!implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
		return false;

	return !is_supervisor_address;
}

> +	return true;
> +}



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux