On 11.04.2010, at 23:07, Andre Przywara wrote: > On SVM we set the instruction length of skipped instructions > to hard-coded, well known values, which could be wrong when (bogus, > but valid) prefixes (REX, segment override) are used. > Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or > AthlonII) have an explicit NEXTRIP field in the VMCB containing the > desired information. > Since it is cheap to do so, we use this field to override the guessed > value on newer processors. > A fix for older CPUs would be rather expensive, as it would require > to fetch and partially decode the instruction. As the problem is not > a security issue and needs special, handcrafted code to trigger > (no compiler will ever generate such code), I omit a fix for older > CPUs. > If someone is interested, I have both a patch for these CPUs as well as > demo code triggering this issue: It segfaults under KVM, but runs > perfectly on native Linux. > > Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx> > --- > arch/x86/include/asm/svm.h | 4 +++- > arch/x86/kvm/svm.c | 13 ++++++++----- > 2 files changed, 11 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h > index b26a38d..1d91d05 100644 > --- a/arch/x86/include/asm/svm.h > +++ b/arch/x86/include/asm/svm.h > @@ -81,7 +81,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area { > u32 event_inj_err; > u64 nested_cr3; > u64 lbr_ctl; > - u8 reserved_5[832]; > + u64 reserved_5; > + u64 next_rip; > + u8 reserved_6[816]; > }; > > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > index d04c7ad..7fff56c 100644 > --- a/arch/x86/kvm/svm.c > +++ b/arch/x86/kvm/svm.c > @@ -43,11 +43,11 @@ MODULE_LICENSE("GPL"); > #define SEG_TYPE_LDT 2 > #define SEG_TYPE_BUSY_TSS16 3 > > -#define SVM_FEATURE_NPT (1 << 0) > -#define SVM_FEATURE_LBRV (1 << 1) > -#define SVM_FEATURE_SVML (1 << 2) > -#define SVM_FEATURE_NRIP (1 << 3) > -#define SVM_FEATURE_PAUSE_FILTER (1 << 10) > +#define SVM_FEATURE_NPT (1 << 0) > +#define SVM_FEATURE_LBRV (1 << 1) > +#define SVM_FEATURE_SVML (1 << 2) > +#define SVM_FEATURE_NRIP (1 << 3) > +#define SVM_FEATURE_PAUSE_FILTER (1 << 10) This is pure indent fixing, right? Sounds like a separate patch to me. > > #define NESTED_EXIT_HOST 0 /* Exit handled on host level */ > #define NESTED_EXIT_DONE 1 /* Exit caused nested vmexit */ > @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) > { > struct vcpu_svm *svm = to_svm(vcpu); > > + if (svm->vmcb->control.next_rip != 0) > + svm->next_rip = svm->vmcb->control.next_rip; > + Wouldn't it be more obvious to create a wrapper function to set next_rip in all the callers instead of setting it manually and then have magic override the value you just set? Let me illustrate what I mean. Currently we have: static int halt_interception(struct vcpu_svm *svm) { svm->next_rip = kvm_rip_read(&svm->vcpu) + 1; skip_emulated_instruction(&svm->vcpu); return kvm_emulate_halt(&svm->vcpu); } skip_emulated_instruction sets pc = next_rip. All fine and obvious. With your patch it would simply ignore next_rip, rendering the line before void without obvious indication of that behavior. So what I'd prefer is something like: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm->vmcb->control.next_rip != 0) return svm->vmcb->control.next_rip; return kvm_rip_read(&svm->vcpu) + add; } static int halt_interception(struct vcpu_svm *svm) { svm->next_rip = svm_next_rip(&svm->vcpu, 1); skip_emulated_instruction(&svm->vcpu); return kvm_emulate_halt(&svm->vcpu); } That is more readable IMHO and makes the code flow a lot more obvious. Alex-- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html