On Thu, 2010-02-25 at 10:20 +0100, Peter Zijlstra wrote:
> On Thu, 2010-02-25 at 11:27 +0800, Zhang, Yanmin wrote:
> > Ingo,
> >
> > I did some testing with KVM virtualization. perf shows vmx_vcpu_run
> > consumes more than 50% cpu time. Actually, the info is incorrect because
> > when perf counter overflows and NMI is triggered, vm exit to function
> > vmx_vcpu_run, then vmx_vcpu_run triggers a software NMI so perf event is
> > notified. perf just checks regs which just saves the address of vmx_vcpu_run.
> >
> > I want to enhance perf to collect real guest os address.
> >
> > Below is the design.
> > KVM uses multi-thread model. Every guest os is a process of multi-thread.
> >
> > 1) Kernel:
> > Add a per_cpu var and some functions, so KVM records interrupted
> > guest os address before triggering the software NMI. perf event would check
> > the per_cpu var to use it if it's not zero, or just goes though the old path.
> >
> > 2) User space: Add a new parameter to perf-top and perf-report, such like
> > -g pid:guest_os_vmlinux_path. Command perf parses the guest os kernel image
> > to collect symbols. Change perf to summarize results based on pid.
> > Another direction is to use the new parameter -g only when old parameter
> > -p is defined. Perf just needs separate native kernel and guest os kernel.
>
I really appreciate your kind comments, and will contact you again in the future for help.

> -g is already taken :-)

We could use other flag or just -G.

>
> One thing I worry about is making sense of the guest data, it might be
> possible to sorta make sense of the main kernel image, but after that
> its going to be 'interesting' in deed.
>
> You're going to have to extend PERF_RECORD_MISC_* though, perhaps you
> can reuse CPUMODE_UNKNOWN for GUEST.
>
> The callchain stuff already has GUEST context identifiers, however
> determining KERNEL/USER context might be hard and interpreting it is
> going to be harder still since we don't have map information for the
> guest.
Right.

As for side #1 pointed out in Ingo's email, we assume the guest os is Linux.
We can't support all capabilities of perf on KVM from the host side.

1) We can't get module and process mapping info in the guest os in an easy
way, so we can't support collecting guest kernel module and user space hot
functions. A workaround is that the user gets the guest os /proc/kallsyms and
passes it to the perf tool on the host side, so we could analyze module hot
functions.

2) We can't get guest os kernel/user stack data in an easy way, so we might
not support the callchain feature of the perf tool. A workaround is that KVM
copies the kernel stack data out, so we could at least support guest os
kernel callchains.

So the host side perf support for the guest os would be:

perf kvm list
perf kvm record	# records the first running guest
perf kvm stat	# stats the first running KVM guest
perf kvm top	# shows the profile of the first running guest
perf kvm trace	# activates the KVM specific tracepoints

As for record, it doesn't support recording guest os user space stack
callchains or guest os user space hot functions.

Yanmin
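
For reference, the kernel-side idea in item 1) of the quoted design (KVM records the interrupted guest address in a per_cpu var, and perf uses it when it is not zero) could be sketched roughly as below. This is only a user-space illustration under assumptions: the plain array stands in for a real per_cpu variable, and NR_CPUS, guest_ip, kvm_record_guest_ip(), kvm_clear_guest_ip() and perf_sample_ip() are hypothetical names, not existing kernel or KVM interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical; the kernel would use a real per_cpu var */

/* Hypothetical stand-in for the per_cpu var; 0 means "no guest interrupted". */
static uint64_t guest_ip[NR_CPUS];

/* KVM side: after vm exit, before triggering the software NMI. */
void kvm_record_guest_ip(unsigned int cpu, uint64_t ip)
{
	assert(cpu < NR_CPUS);
	guest_ip[cpu] = ip;
}

/* KVM side: once the NMI has been handled. */
void kvm_clear_guest_ip(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	guest_ip[cpu] = 0;
}

/*
 * perf side: use the recorded guest address if it is not zero,
 * otherwise go through the old path and use the address from regs.
 */
uint64_t perf_sample_ip(unsigned int cpu, uint64_t regs_ip)
{
	assert(cpu < NR_CPUS);
	return guest_ip[cpu] ? guest_ip[cpu] : regs_ip;
}
```

With this shape a sample taken while a guest address is recorded is attributed to the guest instead of vmx_vcpu_run, and all other samples are unchanged. Using zero as the "not set" value follows the quoted design; it does mean a guest address of 0 can't be told apart from "no guest".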