On Wed, Mar 30, 2011 at 03:29:02PM +0200, Avi Kivity wrote:
> On 03/30/2011 03:26 PM, Gleb Natapov wrote:
> >On Wed, Mar 30, 2011 at 02:48:28PM +0200, Gleb Natapov wrote:
> >> On Wed, Mar 30, 2011 at 02:17:55PM +0200, Avi Kivity wrote:
> >> > On 03/30/2011 01:43 PM, Gleb Natapov wrote:
> >> > >After reboot perf started to work. I ran modified emulator.flat unit
> >> > >test. It was modified to run test_cmps() in an endless loop.
> >> > >
> >> > >Without patch:
> >> > >1.71% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >1.51% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >1.68% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >
> >> > >With patch:
> >> > >0.84% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >0.96% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >0.89% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >
> >> >
> >> > The cause might be kvm_rip_write() using vmwrite. Can you use perf
> >> > to see where the hits are in x86_emulate_instruction?
> >> >
> >> > If that's the case, we may be able to do local optimizations to
> >> > kvm_rip_write(), kvm_set_rflags(), and toggle_interruptibility()
> >> > instead of this global change.
> >> >
> >> I can leave copying there and eliminate only kvm_rip_write and see
> >> perf data.
> >>
> >
> >1.75% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >1.60% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >1.42% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >
> >This is with copy in place, but those are under if (writeback):
> >    toggle_interruptibility(vcpu,
> >            vcpu->arch.emulate_ctxt.interruptibility);
> >    kvm_set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
> >    kvm_make_request(KVM_REQ_EVENT, vcpu);
> >    vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
> >    kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
> >
>
> It's weird. Do you get perf hits in the copying?
How can I check? The memcpy is inlined.

> Copying a couple of hot cache lines shouldn't take any measurable
> time compared to a heavyweight exit.
>
The whole function takes only 1.5% CPU. Perf measures how much faster
this function became, and the heavyweight exit is not part of the
function.

--
			Gleb.