Re: [PATCH 2/2] KVM: fix missing "illegal instruction"-trap in protected modes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 11, 2012 at 11:19:50PM +0100, Stephan Bärwolf wrote:
> On 01/11/12 22:21, Marcelo Tosatti wrote:
> > On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan Bärwolf wrote:
> >> On 01/11/12 20:09, Marcelo Tosatti wrote:
> >>> On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
> >>>> >From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
> >>>> From: Stephan Baerwolf <stephan.baerwolf@xxxxxxxxxxxxx>
> >>>> Date: Sun, 8 Jan 2012 02:03:47 +0000
> >>>> Subject: [PATCH 2/2] KVM: fix missing "illegal instruction"-trap in
> >>>> protected modes
> >>>>
> >>>> On hosts without this patch, 32bit guests will crash
> >>>> (and 64bit guests may behave in a wrong way) for
> >>>> example by simply executing following nasm-demo-application:
> >>>>
> >>>>     [bits 32]
> >>>>     global _start
> >>>>     SECTION .text
> >>>>     _start: syscall
> >>>>
> >>>> (I tested it with winxp and linux - both always crashed)
> >>>>
> >>>>     Disassembly of section .text:
> >>>>
> >>>>     00000000 <_start>:
> >>>>        0:   0f 05                   syscall
> >>>>
> >>>> The reason seems a missing "invalid opcode"-trap (int6) for the
> >>>> syscall opcode "0f05", which is not available on Intel CPUs
> >>>> within non-longmodes, as also on some AMD CPUs within legacy-mode.
> >>>> (depending on CPU vendor, MSR_EFER and cpuid)
> >>>>
> >>>> Because previous mentioned OSs may not engage corresponding
> >>>> syscall target-registers (STAR, LSTAR, CSTAR), they remain
> >>>> NULL and (non trapping) syscalls are leading to multiple
> >>>> faults and finally crashs.
> >>>>
> >>>> Depending on the architecture (AMD or Intel) pretended by
> >>>> guests, various checks according to vendor's documentation
> >>>> are implemented to overcome the current issue and behave
> >>>> like the CPUs physical counterparts.
> >>>>
> >>>> (Therefore using Intel's "Intel 64 and IA-32 Architecture Software
> >>>> Developers Manual" http://www.intel.com/content/dam/doc/manual/
> >>>> 64-ia-32-architectures-software-developer-manual-325462.pdf
> >>>> and AMD's "AMD64 Architecture Programmer's Manual Volume 3:
> >>>> General-Purpose and System Instructions"
> >>>> http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )
> >>>>
> >>>> Screenshots of an i686 testing VM (CORE i5 host) before
> >>>> and after applying this patch are available under:
> >>>>
> >>>> http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
> >>>> http://matrixstorm.com/software/linux/kvm/20111229/after.jpg
> >>>>
> >>>> Signed-off-by: Stephan Baerwolf <stephan.baerwolf@xxxxxxxxxxxxx>
> >>>> ---
> >>>>  arch/x86/include/asm/kvm_emulate.h |   15 ++++++
> >>>>  arch/x86/kvm/emulate.c             |   92
> >>>> ++++++++++++++++++++++++++++++++++-
> >>>>  2 files changed, 104 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/include/asm/kvm_emulate.h
> >>>> b/arch/x86/include/asm/kvm_emulate.h
> >>>> index b172bf4..5b68c23 100644
> >>>> --- a/arch/x86/include/asm/kvm_emulate.h
> >>>> +++ b/arch/x86/include/asm/kvm_emulate.h
> >>>> @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
> >>>>  #define X86EMUL_MODE_PROT     (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
> >>>>                     X86EMUL_MODE_PROT64)
> >>>>  
> >>>> +/* CPUID vendors */
> >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
> >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
> >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
> >>>> +
> >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
> >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
> >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
> >>>> +
> >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
> >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
> >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
> >>>> +
> >>>> +
> >>>> +
> >>>>  enum x86_intercept_stage {
> >>>>      X86_ICTP_NONE = 0,   /* Allow zero-init to not match anything */
> >>>>      X86_ICPT_PRE_EXCEPT,
> >>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> >>>> index f1e3be1..3357411 100644
> >>>> --- a/arch/x86/kvm/emulate.c
> >>>> +++ b/arch/x86/kvm/emulate.c
> >>>> @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt
> >>>> *ctxt,
> >>>>      ss->p = 1;
> >>>>  }
> >>>>  
> >>>> +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
> >>>> +{
> >>>> +    struct x86_emulate_ops *ops = ctxt->ops;
> >>>> +    u64 efer = 0;
> >>>> +
> >>>> +    /* syscall is not available in real mode                            */
> >>>> +    if ((ctxt->mode == X86EMUL_MODE_REAL) ||
> >>>> +        (ctxt->mode == X86EMUL_MODE_VM86))
> >>>> +        return false;
> >>>> +
> >>>> +    ops->get_msr(ctxt, MSR_EFER, &efer);
> >>>> +    /* check - if guestOS is aware of syscall (0x0f05)                  */
> >>>> +    if ((efer & EFER_SCE) == 0) {
> >>>> +        return false;
> >>>> +    } else {
> >>>> +      /* ok, at this point it becomes vendor-specific                   */
> >>>> +      /* so first get us an cpuid                                       */
> >>>> +      bool vendor;
> >>>> +      u32 eax, ebx, ecx, edx;
> >>>> +
> >>>> +      /* getting the cpu-vendor                                         */
> >>>> +      eax = 0x00000000;
> >>>> +      ecx = 0x00000000;
> >>>> +      if (likely(ops->get_cpuid))
> >>>> +          vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
> >>>> +      else    vendor = false;
> >>>> +
> >>>> +      if (likely(vendor)) {
> >>>> +
> >>>> +        /* AMD AuthenticAMD / AMDisbetter!                              */
> >>>> +        if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
> >>>> +             (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
> >>>> +             (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
> >>>> +            ((ebx==X86EMUL_CPUID_VENDOR_AMDisbetter_ebx) &&
> >>>> +             (ecx==X86EMUL_CPUID_VENDOR_AMDisbetter_ecx) &&
> >>>> +             (edx==X86EMUL_CPUID_VENDOR_AMDisbetter_edx))) {
> >>>> +
> >>>> +              /* if cpu is not in longmode...                           */
> >>>> +              /* ...check edx bit11 of cpuid 0x80000001                 */
> >>>> +              if (ctxt->mode != X86EMUL_MODE_PROT64) {
> >>>> +                eax = 0x80000001;
> >>>> +                ecx = 0x00000000;
> >>>> +                vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
> >>>> +                if (likely(vendor)) {
> >>>> +                  if (unlikely(((edx >> 11) & 0x1) == 0))
> >>>> +                      return false;
> >>>> +                } else {
> >>>> +                    return false; /* assuming there is no bit 11        */
> >>>> +            }
> >>>> +              }
> >>>> +              goto __em_syscall_enabled_noprotest;
> >>>> +        } /* end "AMD" */
> >>>> +
> >>>> +
> >>>> +        /* Intel (GenuineIntel)                                          */
> >>>> +        /* remarks: Intel CPUs only support "syscall" in 64bit longmode  */
> >>>> +        /*          Also an 64bit guest within a 32bit compat-app running*/
> >>>> +        /*          will #UD !!                                          */
> >>>> +        /*          While this behaviour can be fixed (by emulating) into*/
> >>>> +        /*          an AMD response - CPUs of AMD can't behave like Intel*/
> >>>> +        /*          because without an hardware-raised #UD there is no   */
> >>>> +        /*          call in em.-mode (see x86_emulate_instruction(...))! */
> >>>> +        /* TODO: make AMD-behaviour configurable                         */
> >>>> +        if ((ebx==X86EMUL_CPUID_VENDOR_GenuineIntel_ebx) &&
> >>>> +            (ecx==X86EMUL_CPUID_VENDOR_GenuineIntel_ecx) &&
> >>>> +            (edx==X86EMUL_CPUID_VENDOR_GenuineIntel_edx)) {
> >>>> +              if (ctxt->mode != X86EMUL_MODE_PROT64)
> >>>> +                  return false;
> >>>> +              goto __em_syscall_enabled_noprotest;
> >>>> +        } /* end "Intel" */
> >>>> +
> >>>> +      } /* end vendor is true */
> >>>> +
> >>>> +    } /* end MSR_EFER check */
> >>>> +
> >>>> +    /* default:                               */
> >>>> +    /* this code wasn't able to process vendor */
> >>>> +    /* so apply  Intels stricter rules...      */
> >>>> +        pr_err_ratelimited("kvm: %i: unknown vcpu vendor - assuming
> >>>> intel\n",
> >>>> +                        current->tgid);
> >>>> +
> >>>> +    if (ctxt->mode == X86EMUL_MODE_PROT64) goto
> >>>> __em_syscall_enabled_noprotest;
> >>>> +    return false;
> >>>> +
> >>>> +__em_syscall_enabled_noprotest:
> >>>> +    return true;
> >>>> +}
> >>>> +
> >>>>  static int em_syscall(struct x86_emulate_ctxt *ctxt)
> >>>>  {
> >>>>      struct x86_emulate_ops *ops = ctxt->ops;
> >>>> @@ -1885,9 +1973,7 @@ static int em_syscall(struct x86_emulate_ctxt *ctxt)
> >>>>      u16 cs_sel, ss_sel;
> >>>>      u64 efer = 0;
> >>>>  
> >>>> -    /* syscall is not available in real mode */
> >>>> -    if (ctxt->mode == X86EMUL_MODE_REAL ||
> >>>> -        ctxt->mode == X86EMUL_MODE_VM86)
> >>>> +    if (!(em_syscall_isenabled(ctxt)))
> >>>>          return emulate_ud(ctxt);
> >>>>  
> >>>>      ops->get_msr(ctxt, MSR_EFER, &efer);
> >>> Stephan, 
> >>>
> >>> I think only the 2-line check to inject UD if EFER_SCE is not set is
> >>> missing in em_syscall. The guest is responsible for setting a proper
> >>> MSR_STAR for EIP only if it enables SCE_EFER.
> >>>
> >>> Can you please test & resend that patch? Thanks.
> >>>
> >>>
> >>>
> >> This wouln't work correctly on 64bit longmode guest
> >> running an app in 32bit compat.
> > The AMD pseudo-code, in volume 3 "General-Purpose and System
> > Instructions", says:
> >
> > SYSCALL_START:
> >
> > IF (MSR_EFER.SCE = 0) // Check if syscall/sysret are enabled.
> >     EXCEPTION [#UD]
> >
> > IF (LONG_MODE)
> >     SYSCALL_LONG_MODE
> > ELSE // (LEGACY_MODE)
> >     SYSCALL_LEGACY_MODE
> >
> >> (Even on AMD in some special conditions.)
> >> The reason is, in 64Bit compat the MSR_EFER_SCE is still
> >> enabled but syscall is not always available by cpu in such modes.
> >> (I also compared always with a physical computer's behaviour.)
> >> See Intel or even AMD-doku
> >> (http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf) page 376.
> >>
> >> Of course in 64bit it should NOT crash (and indeed I haven't
> >> observed it, yet) but if I have cpuid GenuineIntel it should behave
> >> like Intel cpu and not like some special AMD.
> >> (Otherwise I could overwrite the vendor and then the patch will
> >> emulate AMD...)
> >>
> >> ??What could be a good speedup: Reordering the EFER check and the
> >> real-mode checks
> >> (I would have to check doku, but EFER is 0 in realmode, isn't it?), then
> >> checking if (not)longmode and then:
> >>
> >> I think, the patch ->only<- (and only!?) becomes a little bit slower
> >> exactly
> >> in this situation (32bit compat under 64bit long) because in every other
> >> case?
> >> the first ifs (testing the MSR_EFER and the mode) should decide to #UD...
> >> (At all it should be faster at the end, then it is now ?)
> >>
> >> ...I'll prepare a patch for what I mean...
> >>
> >> I am looking forward to your response,
> >>
> >> regards Stephan
> > How about this:
> >
> > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > index 05a562b..4e8e534 100644
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -1907,6 +1907,9 @@ static int em_syscall(struct x86_emulate_ctxt *ctxt)
> >  	ops->get_msr(ctxt, MSR_EFER, &efer);
> >  	setup_syscalls_segments(ctxt, &cs, &ss);
> >  
> > +	if (!(efer & EFER_SCE))
> > +		return emulate_ud(ctxt);
> > +
> >  	ops->get_msr(ctxt, MSR_STAR, &msr_data);
> >  	msr_data >>= 32;
> >  	cs_sel = (u16)(msr_data & 0xfffc);
> >
> Hi Marcelo,
> thank you for your fast response.
> 
> Yes, this would be more elegant, ->but<- not exact at least on Intel.
> (http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-software-developer-manual-325462.pdf)
> 4-586 Vol. 2B or PDF-page 1804
> "[...] (* Not in 64-Bit Mode or SYSCALL/SYSRET not enabled in IA32_EFER
> *) [...]"
> 
> We have to check somehow cpu-vendor to get info what cpu should be
> emulated...
> ...I'll prepare a patch by tomorrow reordering things and compacting
> checks on AMD's branch.
> (obviously cpuid 0x80000001 edx bit 11 needs not to be checked, as far
> as the pseudocode tells... )

Right. Please have the EFER check as i suggested (its cleaner), and
then only this check for Intel should suffice:

+        /* TODO: make AMD-behaviour configurable */
+        if ((ebx==X86EMUL_CPUID_VENDOR_GenuineIntel_ebx) &&
+            (ecx==X86EMUL_CPUID_VENDOR_GenuineIntel_ecx) &&
+            (edx==X86EMUL_CPUID_VENDOR_GenuineIntel_edx)) {
+              if (ctxt->mode != X86EMUL_MODE_PROT64)

Please mind Documentation/CodingStyle. Thanks.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux