On Thu, Apr 11, 2013 at 02:18:15AM +0200, Borislav Petkov wrote: > On Wed, Apr 10, 2013 at 03:16:39PM +0300, Gleb Natapov wrote: > > Right, the question is how kernel can tell QEMU that the cpuid bit is > > supported but should not be set unless explicitly asked by an user. > > Actually, this seems to work with the patch below based on whether you > have "+movbe" in the -cpu option or not. > > Anyway, here's the second version with hopefully all comments and > suggestions addressed. > > Thanks. > > -- > >From 612fc75a732ad16332f270b7c52a68c89e3565ca Mon Sep 17 00:00:00 2001 > From: Borislav Petkov <bp@xxxxxxx> > Date: Thu, 11 Apr 2013 02:06:30 +0200 > Subject: [PATCH] kvm: Emulate MOVBE > > This basically came from the need to be able to boot 32-bit Atom SMP > guests on an AMD host, i.e. host which doesn't support MOVBE. As a > matter of fact, qemu has since recently received MOVBE support but we > cannot share that with kvm emulation and thus we have to do this in the > host. > > We piggyback on the #UD path and emulate the MOVBE functionality. With > it, an 8-core SMP guest boots in under 6 seconds. > > Also, requesting MOVBE emulation needs to happen explicitly to work, > i.e. qemu -cpu n270,+movbe... > > Signed-off-by: Andre Przywara <andre@xxxxxxxxx> > Signed-off-by: Borislav Petkov <bp@xxxxxxx> > --- > arch/x86/kvm/cpuid.c | 2 +- > arch/x86/kvm/emulate.c | 39 +++++++++++++++++++++++++++++++++++++-- > 2 files changed, 38 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > index a20ecb5b6cbf..2d44fc4fd855 100644 > --- a/arch/x86/kvm/cpuid.c > +++ b/arch/x86/kvm/cpuid.c > @@ -273,7 +273,7 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, > cpuid_mask(&entry->ecx, 4); > /* we support x2apic emulation even if host does not support > * it since we emulate x2apic in software */ > - entry->ecx |= F(X2APIC); > + entry->ecx |= F(X2APIC) | F(MOVBE); > break; > /* function 2 entries are STATEFUL. That is, repeated cpuid commands > * may return different values. This forces us to get_cpu() before > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index a335cc6cde72..9011c7a656ad 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -152,6 +152,7 @@ > #define Avx ((u64)1 << 43) /* Advanced Vector Extensions */ > #define Fastop ((u64)1 << 44) /* Use opcode::u.fastop */ > #define NoWrite ((u64)1 << 45) /* No writeback */ > +#define EmulateOnUD ((u64)1 << 46) /* emulate if unsupported by the host */ > Just rename VendorSpecific to EmulateOnUD. The meaning is the same. > #define X2(x...) x, x > #define X3(x...) X2(x), x > @@ -3107,6 +3108,30 @@ static int em_mov(struct x86_emulate_ctxt *ctxt) > return X86EMUL_CONTINUE; > } > > +static int em_movbe(struct x86_emulate_ctxt *ctxt) > +{ > + switch (ctxt->op_bytes) { > + case 2: > + *(u16 *)ctxt->dst.valptr = swab16(*(u16 *)ctxt->src.valptr); What's wrong with: ctxt->dst.val = swab16((u16)ctxt->src.val)? > + break; > + case 4: > + *(u32 *)ctxt->dst.valptr = swab32(*(u32 *)ctxt->src.valptr); > + > + /* > + * clear upper dword for 32-bit operand size in 64-bit mode. > + */ > + if (ctxt->mode == X86EMUL_MODE_PROT64) > + *((u32 *)ctxt->dst.valptr + 1) = 0x0; It should be clean without this this. > + break; > + case 8: > + *(u64 *)ctxt->dst.valptr = swab64(*(u64 *)ctxt->src.valptr); > + break; > + default: > + return X86EMUL_PROPAGATE_FAULT; > + } > + return X86EMUL_CONTINUE; > +} > + > static int em_cr_write(struct x86_emulate_ctxt *ctxt) > { > if (ctxt->ops->set_cr(ctxt, ctxt->modrm_reg, ctxt->src.val)) > @@ -4033,6 +4058,11 @@ static const struct opcode twobyte_table[256] = { > N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N > }; > > +static const struct opcode threebyte_table[] = { > + [0xf0] = I(DstReg | SrcMem | ModRM | Mov | EmulateOnUD, em_movbe), > + [0xf1] = I(DstMem | SrcReg | ModRM | Mov | EmulateOnUD, em_movbe), > +}; > + Populate the whole table explicitly please. You can use X16(N) to make it short. The encoding is also wrong since 0F 38 F0/F1 instruction decoding depend on a prefix. Same opcode is decoded as CRC32 with f2 prefix. We have Prefix mechanism for such instructions, but it currently assumes that 66 and f2/f3 prefixes are mutually exclusive, but in this case they are not, ugh. > #undef D > #undef N > #undef G > @@ -4320,6 +4350,9 @@ done_prefixes: > ctxt->twobyte = 1; > ctxt->b = insn_fetch(u8, ctxt); > opcode = twobyte_table[ctxt->b]; > + > + if (ctxt->b == 0x38) > + opcode = threebyte_table[insn_fetch(u8, ctxt)]; ctxt->b = insn_fetch(u8, ctxt); opcode = threebyte_table[ctxt->b]; Otherwise OpImplicit type of decoding will not work for three byte instructions. Also should rename twobyte into opbytes and set it to 1, 2 or 3. I prefer that three byte instruction decoding will be sent as a separate patch. > } > ctxt->d = opcode.flags; > > @@ -4376,8 +4409,10 @@ done_prefixes: > if (ctxt->d == 0 || (ctxt->d & Undefined)) > return EMULATION_FAILED; > > - if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn) > - return EMULATION_FAILED; > + if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn) { > + if (!(ctxt->d & EmulateOnUD)) > + return EMULATION_FAILED; > + } > > if (mode == X86EMUL_MODE_PROT64 && (ctxt->d & Stack)) > ctxt->op_bytes = 8; > -- > 1.8.2.135.g7b592fa > > > -- > Regards/Gruss, > Boris. > > Sent from a fat crate under my desk. Formatting is fine. > -- -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html