Re: Question on skip_emulated_instructions()

Gleb Natapov <gleb@xxxxxxxxxx> · Wed, 7 Apr 2010 18:43:25 +0300



On Wed, Apr 07, 2010 at 03:25:10PM +0900, Yoshiaki Tamura wrote:
> 2010/4/6 Gleb Natapov <gleb@xxxxxxxxxx>:
> > On Tue, Apr 06, 2010 at 01:11:23PM +0900, Yoshiaki Tamura wrote:
> >> Hi.
> >>
> >> When handle_io() is called, rip is currently advanced *before* the I/O is
> >> actually handled by qemu in userland.  While implementing Kemari for
> >> KVM (http://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg25141.html), mainly in
> >> userland qemu, we found that synchronizing the VCPU state before handling
> >> I/O in qemu is too late, because rip has already been advanced in KVM.
> >> Although we worked around this with a temporary hack, I would like to ask a
> >> few questions about skip_emulated_instructions.
> >>
> >> 1. Does rip need to be advanced before the I/O is handled by qemu?
> > In current kvm.git rip is advanced before I/O is handled by qemu only
> > in the case of the "out" instruction. From an architectural point of view I
> > think it's OK, since on real HW you can't guarantee that I/O will take effect
> > before the instruction pointer is advanced. It is done like that because we
> > want "out" emulation to be really fast, so we skip the x86 emulator.
> 
> Thanks for your reply.
> 
> If advancing rip later doesn't break the behavior of devices or
> introduce a slowdown, I would like that to be done.
> 
A device could not care less what value the rip register currently holds.
Why does it matter for your code?

> >> 2. If not, is it possible to split skip_emulated_instructions() into, say,
> >> rec_emulated_instructions(), which remembers next_rip, and
> >> skip_emulated_instructions(), which actually advances rip?
> > Currently only the emulator can call userspace to do I/O, so after
> > userspace returns from an I/O exit, control is handed back to the emulator
> > unconditionally.  The "out" instruction skips the emulator, but there is nothing
> > to do after userspace returns, so the regular cpu loop is executed. If we
> > want to advance rip only after userspace has executed the I/O done by "out", we
> > need to distinguish who requested the I/O (the emulator or kvm_fast_pio_out())
> > and call different code depending on who that was. It can be done by
> > having a callback that (if not null) is called on return from userspace.
> 
> Your suggestion is to introduce a callback entry and, instead of
> calling kvm_rip_write(), set that entry before calling
> kvm_fast_pio_out(), and then check the entry upon return from
> userspace, correct?
> 
Something like that, yes.
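
A minimal sketch of that callback idea, written as a standalone userspace
model rather than real KVM code. Everything named toy_* below, the
complete_userspace_io field, and complete_fast_pio_out() are hypothetical
names chosen for illustration; they are assumptions, not existing kvm.git
symbols. The fast "out" path records where rip should end up and arms a
callback instead of writing rip, and the return-from-userspace path runs the
callback only if one is set:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for vcpu state; NOT the real struct kvm_vcpu. */
struct toy_vcpu {
	unsigned long rip;       /* current instruction pointer */
	unsigned long next_rip;  /* where rip should land once the I/O is done */
	/* Hypothetical hook: non-NULL only while a fast "out" is pending. */
	void (*complete_userspace_io)(struct toy_vcpu *vcpu);
};

/* Deferred completion for the fast "out" path: advance rip now. */
void complete_fast_pio_out(struct toy_vcpu *vcpu)
{
	vcpu->rip = vcpu->next_rip;
}

/*
 * Models the fast "out" path: remember the target rip and arm the
 * callback instead of advancing rip before the exit to userspace.
 */
void toy_fast_pio_out(struct toy_vcpu *vcpu, unsigned int insn_len)
{
	assert(vcpu->complete_userspace_io == NULL); /* one request at a time */
	vcpu->next_rip = vcpu->rip + insn_len;
	vcpu->complete_userspace_io = complete_fast_pio_out;
}

/*
 * Models re-entry after userspace has handled the I/O. The pointer is
 * cleared before the call, so it cannot be left stale for a later exit.
 */
void toy_return_from_userspace(struct toy_vcpu *vcpu)
{
	if (vcpu->complete_userspace_io) {
		void (*cb)(struct toy_vcpu *) = vcpu->complete_userspace_io;

		vcpu->complete_userspace_io = NULL;
		cb(vcpu);
	}
	/* A NULL callback means nothing is pending on the fast path. */
}
```

In this model the emulator path would simply never set the callback, so the
return path needs no knowledge of who requested the I/O.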

> According to the comment in x86.c, for an "out" instruction
> vcpu->arch.pio.count is set to 0 to skip the emulator.
> To call kvm_fast_pio_out(), "!string" and "!in" must hold.
> If we check vcpu->arch.pio.count, "string" and "in" on return
> from userspace, can't we distinguish who requested the I/O, the emulator
> or kvm_fast_pio_out()?
> 
Maybe, but the callback approach is much cleaner. "string" and "in" can hold
stale data, for instance.

> >> 3. svm has next_rip, but when it is 0, a nop is emulated.  Can this be modified to
> >> continue without emulating a nop when next_rip is 0?
> >>
> > I don't see where a nop is emulated if next_rip is 0. As far as I can see, in
> > the case of next_rip==0 the instruction at rip is decoded to figure out its
> > length and then rip is advanced by that length. Anyway, next_rip
> > is an svm-only thing.
> 
> Sorry.  I didn't understand the code well enough.
> 
> static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> {
> ...
> 	if (!svm->next_rip) {
> 		if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
> 				EMULATE_DONE)
> 			printk(KERN_DEBUG "%s: NOP\n", __func__);
> 		return;
> 	}
> 
> Since the printk says NOP, I thought emulate_instruction was emulating a nop...
> 
> The reason I asked about next_rip is that I was hoping to use this
> field to advance rip only after userspace has executed the I/O done by "out":
> if next_rip is non-zero, call kvm_rip_write(). I would also introduce
> next_rip to vmx, if that is usable, because vmx currently uses a local
> variable rip.
> 
> Yoshi

--
			Gleb.