Re: [PATCH -v2] kvm: Emulate MOVBE

Gleb Natapov <gleb@xxxxxxxxxx> · Sun, 14 Apr 2013 11:43:03 +0300

On Thu, Apr 11, 2013 at 02:18:15AM +0200, Borislav Petkov wrote:
> On Wed, Apr 10, 2013 at 03:16:39PM +0300, Gleb Natapov wrote:
> > Right, the question is how kernel can tell QEMU that the cpuid bit is
> > supported but should not be set unless explicitly asked by an user.
> 
> Actually, this seems to work with the patch below based on whether you
> have "+movbe" in the -cpu option or not.
> 
> Anyway, here's the second version with hopefully all comments and
> suggestions addressed.
> 
> Thanks.
> 
> --
> >From 612fc75a732ad16332f270b7c52a68c89e3565ca Mon Sep 17 00:00:00 2001
> From: Borislav Petkov <bp@xxxxxxx>
> Date: Thu, 11 Apr 2013 02:06:30 +0200
> Subject: [PATCH] kvm: Emulate MOVBE
> 
> This basically came from the need to be able to boot 32-bit Atom SMP
> guests on an AMD host, i.e. host which doesn't support MOVBE. As a
> matter of fact, qemu has since recently received MOVBE support but we
> cannot share that with kvm emulation and thus we have to do this in the
> host.
> 
> We piggyback on the #UD path and emulate the MOVBE functionality. With
> it, an 8-core SMP guest boots in under 6 seconds.
> 
> Also, requesting MOVBE emulation needs to happen explicitly to work,
> i.e. qemu -cpu n270,+movbe...
> 
> Signed-off-by: Andre Przywara <andre@xxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> ---
>  arch/x86/kvm/cpuid.c   |  2 +-
>  arch/x86/kvm/emulate.c | 39 +++++++++++++++++++++++++++++++++++++--
>  2 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index a20ecb5b6cbf..2d44fc4fd855 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -273,7 +273,7 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		cpuid_mask(&entry->ecx, 4);
>  		/* we support x2apic emulation even if host does not support
>  		 * it since we emulate x2apic in software */
> -		entry->ecx |= F(X2APIC);
> +		entry->ecx |= F(X2APIC) | F(MOVBE);
>  		break;
>  	/* function 2 entries are STATEFUL. That is, repeated cpuid commands
>  	 * may return different values. This forces us to get_cpu() before
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index a335cc6cde72..9011c7a656ad 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -152,6 +152,7 @@
>  #define Avx         ((u64)1 << 43)  /* Advanced Vector Extensions */
>  #define Fastop      ((u64)1 << 44)  /* Use opcode::u.fastop */
>  #define NoWrite     ((u64)1 << 45)  /* No writeback */
> +#define EmulateOnUD ((u64)1 << 46)  /* emulate if unsupported by the host */
>  
Just rename VendorSpecific to EmulateOnUD. The meaning is the same.

>  #define X2(x...) x, x
>  #define X3(x...) X2(x), x
> @@ -3107,6 +3108,30 @@ static int em_mov(struct x86_emulate_ctxt *ctxt)
>  	return X86EMUL_CONTINUE;
>  }
>  
> +static int em_movbe(struct x86_emulate_ctxt *ctxt)
> +{
> +	switch (ctxt->op_bytes) {
> +	case 2:
> +		*(u16 *)ctxt->dst.valptr = swab16(*(u16 *)ctxt->src.valptr);
What's wrong with: ctxt->dst.val = swab16((u16)ctxt->src.val)?

> +		break;
> +	case 4:
> +		*(u32 *)ctxt->dst.valptr = swab32(*(u32 *)ctxt->src.valptr);
> +
> +		/*
> +		 * clear upper dword for 32-bit operand size in 64-bit mode.
> +		 */
> +		if (ctxt->mode == X86EMUL_MODE_PROT64)
> +			*((u32 *)ctxt->dst.valptr + 1) = 0x0;
It should be clean without this this.

> +		break;
> +	case 8:
> +		*(u64 *)ctxt->dst.valptr = swab64(*(u64 *)ctxt->src.valptr);
> +		break;
> +	default:
> +		return X86EMUL_PROPAGATE_FAULT;
> +	}
> +	return X86EMUL_CONTINUE;
> +}
> +
>  static int em_cr_write(struct x86_emulate_ctxt *ctxt)
>  {
>  	if (ctxt->ops->set_cr(ctxt, ctxt->modrm_reg, ctxt->src.val))
> @@ -4033,6 +4058,11 @@ static const struct opcode twobyte_table[256] = {
>  	N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N
>  };
>  
> +static const struct opcode threebyte_table[] = {
> +	[0xf0] = I(DstReg | SrcMem | ModRM | Mov | EmulateOnUD, em_movbe),
> +	[0xf1] = I(DstMem | SrcReg | ModRM | Mov | EmulateOnUD, em_movbe),
> +};
> +
Populate the whole table explicitly please. You can use X16(N) to make it
short. The encoding is also wrong since 0F 38 F0/F1 instruction decoding
depend on a prefix. Same opcode is decoded as CRC32 with f2 prefix. We
have Prefix mechanism for such instructions, but it currently assumes
that 66 and f2/f3 prefixes are mutually exclusive, but in this case they
are not, ugh.

>  #undef D
>  #undef N
>  #undef G
> @@ -4320,6 +4350,9 @@ done_prefixes:
>  		ctxt->twobyte = 1;
>  		ctxt->b = insn_fetch(u8, ctxt);
>  		opcode = twobyte_table[ctxt->b];
> +
> +		if (ctxt->b == 0x38)
> +			opcode = threebyte_table[insn_fetch(u8, ctxt)];
ctxt->b = insn_fetch(u8, ctxt);
opcode = threebyte_table[ctxt->b];

Otherwise OpImplicit type of decoding will not work for three byte
instructions. Also should rename twobyte into opbytes and set it to 1,
2 or 3. I prefer that three byte instruction decoding will be sent as a
separate patch.

>  	}
>  	ctxt->d = opcode.flags;
>  
> @@ -4376,8 +4409,10 @@ done_prefixes:
>  	if (ctxt->d == 0 || (ctxt->d & Undefined))
>  		return EMULATION_FAILED;
>  
> -	if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn)
> -		return EMULATION_FAILED;
> +	if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn) {
> +		if (!(ctxt->d & EmulateOnUD))
> +			return EMULATION_FAILED;
> +	}
>  
>  	if (mode == X86EMUL_MODE_PROT64 && (ctxt->d & Stack))
>  		ctxt->op_bytes = 8;
> -- 
> 1.8.2.135.g7b592fa
> 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html