Am 30.01.2012 11:24, schrieb Avi Kivity: > On 01/27/2012 09:23 PM, Kevin Wolf wrote: >> Task switches can switch between Protected Mode and VM86. The current >> mode must be updated during the task switch emulation so that the new >> segment selectors are interpreted correctly and privilege checks >> succeed. >> >> VMX code calculates the CPL from the code segment selector and rflags, >> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL >> of the code segment instead, so we must be sure to give the right one >> when updating the selector. > > svm has vmcb_save_area::cpl; it's independent of CS.RPL. Right, but cpl in the VMCB is updated when you reload a segment (and I think only then), so it's closely related. >> Signed-off-by: Kevin Wolf <kwolf@xxxxxxxxxx> >> --- >> arch/x86/include/asm/kvm_emulate.h | 1 + >> arch/x86/kvm/emulate.c | 26 ++++++++++++++++++++++++++ >> arch/x86/kvm/x86.c | 6 ++++++ >> 3 files changed, 33 insertions(+), 0 deletions(-) >> >> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h >> index c8a9cf3..4a21c7d 100644 >> --- a/arch/x86/include/asm/kvm_emulate.h >> +++ b/arch/x86/include/asm/kvm_emulate.h >> @@ -176,6 +176,7 @@ struct x86_emulate_ops { >> void (*set_idt)(struct x86_emulate_ctxt *ctxt, struct desc_ptr *dt); >> ulong (*get_cr)(struct x86_emulate_ctxt *ctxt, int cr); >> int (*set_cr)(struct x86_emulate_ctxt *ctxt, int cr, ulong val); >> + void (*set_rflags)(struct x86_emulate_ctxt *ctxt, ulong val); >> int (*cpl)(struct x86_emulate_ctxt *ctxt); >> int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest); >> int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value); >> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c >> index 833969e..143ce8e 100644 >> --- a/arch/x86/kvm/emulate.c >> +++ b/arch/x86/kvm/emulate.c >> @@ -552,6 +552,14 @@ static void set_segment_selector(struct x86_emulate_ctxt *ctxt, u16 selector, >> struct desc_struct desc; >> >> ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); >> + >> + if (ctxt->mode == X86EMUL_MODE_REAL) >> + desc.dpl = 0; > > Can't happen. set_segment_selector is only called during task switches right now, so yes, this is impossible. Want me to drop the check? Or BUG()? >> + else if (ctxt->mode == X86EMUL_MODE_VM86) >> + desc.dpl = 3; >> + else >> + desc.dpl = selector & 0x3; > > I don't think this is right. The SDM explicitly says that just the > selectors are loaded, yet you're modifying the DPL here too. Unless > there's an abort, we'll be loading the descriptors later anyway (with > their DPL). If there's an abort, I think we should continue with the > mismatched DPL. > >> + >> ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); >> } >> >> @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, >> return emulate_gp(ctxt, 0); >> ctxt->_eip = tss->eip; >> ctxt->eflags = tss->eflags | 2; >> + >> + /* >> + * If we're switching between Protected Mode and VM86, we need to make >> + * sure to update the mode before loading the segment descriptors so >> + * that the selectors are interpreted correctly. >> + * >> + * Need to get it to the vcpu struct immediately because it influences >> + * the CPL which is checked at least when loading the segment >> + * descriptors and when pushing an error code to the new kernel stack. >> + */ >> + if (ctxt->eflags & X86_EFLAGS_VM) >> + ctxt->mode = X86EMUL_MODE_VM86; >> + else >> + ctxt->mode = X86EMUL_MODE_PROT32; >> + > > Shouldn't this be done after the set_segment_selector() block? My > interpretation of the SDM is that if a fault happens while loading > descriptors the fault happens with old segment cache, that is, it needs > to be interpreted according to the old mode. This is closely related to my argument with Gleb whether CPL changes when rflags is reloaded or if it only happens when cs is reloaded. I argued that it makes more sense to couple it with cs, so I guess I have to agree with you mostly. The SDM says that any faults during loading the segment descriptors happen in the context of the new task, and the task switch is completed before an exception is generated. So it shouldn't make any difference to the guest what the exact point is at which we change CPL, it's an KVM internal decision. So what about this order: 1. Reload general purpose registers and eflags without updating mode or writing back rflags to the vcpu struct. 2. Load segment selectors without changing the DPL yet. 3. Load segment descriptors, and disable the privilege checks in load_segment_descriptor using a new flag. For SVM, this updates the CPL when cs is reloaded. 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be cleaned up in a follow-up patch, so that VMX uses set_segment to update the CPL, like SVM does. Would this match your interpretation? >> + ctxt->ops->set_rflags(ctxt, ctxt->eflags); >> + >> + /* General purpose registers */ >> ctxt->regs[VCPU_REGS_RAX] = tss->eax; >> ctxt->regs[VCPU_REGS_RCX] = tss->ecx; >> ctxt->regs[VCPU_REGS_RDX] = tss->edx; >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index dc3e945..502b5c3 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -4040,6 +4040,11 @@ static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) >> return res; >> } >> >> +static void emulator_set_rflags(struct x86_emulate_ctxt *ctxt, ulong val) >> +{ >> + kvm_set_rflags(emul_to_vcpu(ctxt), val); >> +} >> + >> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) >> { >> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); >> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { >> .set_idt = emulator_set_idt, >> .get_cr = emulator_get_cr, >> .set_cr = emulator_set_cr, >> + .set_rflags = emulator_set_rflags, >> .cpl = emulator_get_cpl, >> .get_dr = emulator_get_dr, >> .set_dr = emulator_set_dr, > > It would be good to switch the entire emulator to use ->set_rflags() > instead of sometimes letting the caller do it. If we change the CPL only with a cs reload, set_rflags can be dropped completely in the end. Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html