2017-06-27 18:09 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>: > > > On 27/06/2017 11:50, Wanpeng Li wrote: >> 2017-06-27 16:20 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>: >>> >>> >>> On 27/06/2017 05:41, Wanpeng Li wrote: >>>>> KVM emulates syscall so that it can trap 32-bit syscall on Intel processors. >>>> >>>> We have a discussion to not expose syscall/sysret to Intel 32-bit >>>> guest two years ago. https://lkml.org/lkml/2015/11/19/225 The >>>> syscall/sysret just makes sense against long mode instead of >> >> s/long mode/64-bit mode >> >>>> compatibility/legacy mode of Intel CPU. We will get a #UD in 32-bit >>>> guest, and syscall emulation is introduced by commit 66bb2ccd (KVM: >>>> x86 emulator: add syscall emulation) to handle it. So why we still >>>> expose syscall/sysret to Intel 32-bit guest? >>> >>> Because you didn't post v2 of that patch, I guess. :) >>> >>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>>> index 87d3cb901935..0e846f0cb83b 100644 >>>>> --- a/arch/x86/kvm/x86.c >>>>> +++ b/arch/x86/kvm/x86.c >>>>> @@ -5313,6 +5313,8 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu) >>>>> kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l); >>>>> >>>>> ctxt->eflags = kvm_get_rflags(vcpu); >>>>> + ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0; >>>>> + >>>> >>>> I guess this is used for "the sysret is executed the #DB is taken "as >>>> if" the syscall insn just completed", however, there is no sysret >>>> emulation, so how the #DB is taken after the sysret? >>> >>> No, it's used for instructions other than syscall and sysret: >>> >>>> + if (r == EMULATE_DONE && >>>> + (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP))) >>>> + kvm_vcpu_do_singlestep(vcpu, &r); >>> >>> syscall (and sysret if it were emulated) overwrite ctxt->tf with the >>> value of TF at the end of the instruction. Other instructions don't, so >>> that singlestep depends on EFLAGS.TF before the instruction is executed. >> >> Why sysret is not emulated since SDM said that it can incur a #UD if >> not in 64-bit mode? > > "64-bit ring 0 to 32-bit ring 3" sysret ("sysretl") is supported by Intel: > > IF (operand size is 64-bit) > THEN CS.Selector ← IA32_STAR[63:48]+16; > ELSE CS.Selector ← IA32_STAR[63:48]; > FI; I see. :) Regards, Wanpeng Li