Re: [PATCH v28 0/8] Timestamp synchronization of host - guest tracing session

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 9, 2021 at 2:24 PM Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
>
> On Tue, 2021-02-09 at 13:44 +0200, Tzvetomir Stoyanov wrote:
> > Hi Dario,
> >
> Hi again,
>
> > On Tue, Feb 9, 2021 at 9:28 AM Dario Faggioli <dfaggioli@xxxxxxxx>
> > wrote:
> > >
> > > Or at least, host and guest talk, and I see this: "Negotiated kvm
> > > time
> > > sync protocol with guest (null)" (is that "(null)" ok/intentional?
> > > I
> > > guess I can check the code... :-))
> >
> > That null looks like an error, the name of the guest VM should be
> > printed there.
> >
> Yep, in fact, as I said, I tired this before and --now that I think
> about it-- I do remember that the guest "name" was what was there.
>
> BTW, if I can pester you a little bit more, now that it works I'm
> trying to use this, and here's what I'm doing.
>
> == VM:
> # echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd agent
>
> == Host:
> # echo x86-tsc > /sys/kernel/tracing/trace_clock
> # trace-cmd record -p nop -e sched:* -e kvm:* -e timer:hrtimer* -A tumbleweed-jeos:823 -e sched:* -e syscalls:*_clock_nanosleep -e timer:hrtimer* sleep 1
>
> And I do end up with the two files: trace.dat for the host and trace-
> (null).dat (oh, well :-D) for the guest.
>
> Now, just `trace-cmd report`-ing them, I get this (just some small
> fragments, of course).
>
> == Host (trace.dat):
>            <...>-24791 [001]308219686938467: sched_switch:         prev_comm=sleep prev_pid=24791 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
>           <idle>-0     [001]308219691836595: hrtimer_cancel:       hrtimer=0xffff96694faec710
>           <idle>-0     [001]308219691837544: hrtimer_expire_entry: hrtimer=0xffff96694faec710 function=apic_timer_fn now=76856318840891
>           <idle>-0     [001]308219691840271: sched_waking:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
>           <idle>-0     [001]308219691849474: sched_wakeup:         comm=CPU 0/KVM pid=9196 prio=120 target_cpu=001
>           <idle>-0     [001]308219691850914: hrtimer_expire_exit:  hrtimer=0xffff96694faec710
>           <idle>-0     [001]308219691856548: sched_switch:         prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=CPU 0/KVM next_pid=9196 next_prio=120
>        CPU 0/KVM-9196  [001]308219691876512: kvm_vcpu_wakeup:      wait time 2352265 ns, polling valid
>        CPU 0/KVM-9196  [001]308219691880170: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
>        CPU 0/KVM-9196  [001]308219691884210: kvm_pv_tlb_flush:     vcpu 0 need_flush_tlb false
>
>
>  == Guest (trace-(null).dat):
>            <...>-12068 [000]118613084443085: sched_stat_runtime:   comm=trace-cmd pid=12068 runtime=97403 [ns] vruntime=5101562006274 [ns]
>            <...>-12068 [000]118613084451479: sched_switch:         prev_comm=trace-cmd prev_pid=12068 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12070 next_prio=120
>            <...>-12070 [000]118613084662730: sched_stat_runtime:   comm=trace-cmd pid=12070 runtime=55912 [ns] vruntime=5101564964781 [ns]
>            <...>-12070 [000]118613084665219: sched_switch:         prev_comm=trace-cmd prev_pid=12070 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>           <idle>-0     [000]118613085002791: sched_waking:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
>           <idle>-0     [000]118613085006567: sched_wakeup:         comm=kworker/0:0 pid=10458 prio=120 target_cpu=000
>           <idle>-0     [000]118613085011740: sched_switch:         prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=10458 next_prio=120
>      kworker/0:0-10458 [000]118613085023034: sched_waking:         comm=trace-cmd pid=12070 prio=120 target_cpu=000
>      kworker/0:0-10458 [000]118613085023774: sched_stat_runtime:   comm=kworker/0:0 pid=10458 runtime=4848 [ns] vruntime=5101561969629 [ns]
>      kworker/0:0-10458 [000]118613085025356: sched_wakeup:         comm=trace-cmd pid=12070 prio=120 target_cpu=000
>
> So, the timestamps seems still a bit apart to me. Is this fine, and
> hence there is some more post-processing that I should do (and, if yes,
> what?). Or should they be already in sync?

The timestamps, recorded in both files are not synchronised. That's
why you are seeing the original timestamps. The offsets between both
clocks are recorded in the metadata of the guest trace file. You can verify
this using "trace-cmd dump --options trace-guest.dat" and
"trace-cmd dump --options trace.dat"
When both files are opened together, then that metadata is used to align
guest's timestamps to the host time. Currently trace-cmd report does not
have support for opening more trace files as a session, yet. This is in our
short todo list.

>
> I've also tried to open trace.dat in kernelshark2, and then append
> trace-(null).dat but from the few that I see (as said in my other
> email, kernelshark2 is not fully functional for me) the events from the
> two traces are still completely separated.

Currently using KernelShark2-beta is the only way to display host and
guests tracing files, as a session. There is logic for synchronising timestamps,
using metadata from the files. What repo are you using for getting the
KernelShark2-beta ?
Yordan can give more details on that.

>
> I also tried with the following, but it's still the same:
> # echo kvm-clock > /sys/devices/[...]/current_clocksource in the guest
>
> Then, I've removed the last patch of the series, the one for "KVM
> clock", after which, I was correctly (I think) seeing:
>
> "Negotiated ptp time sync protocol with guest (null)"

That "null" really bothers me, as I cannot reproduce it. The logic for
getting the
guest name, cid and port uses "/proc/<qemu pid>/cmdline". Please, can you send
me that string, from /proc/... or just "ps afx | grep qemu", to see
what is wrong with
our parsing logic.
Also, if you have any ideas how to get name, cid and port of running
libvirt VMs -
please share. The current approach parsing the qemu command line is a work
around, not a final solution.

Thanks!

>
> But then again:
>
> == Host:
>        CPU 0/KVM-9196  [001]249509851092735: kvm_exit:             vcpu 0 reason PREEMPTION_TIMER rip 0x7f0da050b357 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509851096316: kvm_apic_accept_irq:  apicid 0 vec 236 (Fixed|edge)
>        CPU 0/KVM-9196  [001]249509851112525: kvm_exit:             vcpu 0 reason MSR_WRITE rip 0xffffffffac475164 info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509851114034: kvm_hv_timer_state:   vcpu_id 0 hv_timer 1
>        CPU 0/KVM-9196  [001]249509851115032: kvm_msr:              msr_write 6e0 = 0x367b512478b0
>            sleep-26457 [000]249509851765258: sched_process_exec:   filename=/usr/bin/sleep pid=26457 old_pid=26457
>        CPU 0/KVM-9196  [001]249509853247850: kvm_exit:             vcpu 0 reason EXTERNAL_INTERRUPT rip 0xffffffffad00015d info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x800000ec error_code 0x00000000
>        CPU 0/KVM-9196  [001]249509853250399: hrtimer_cancel:       hrtimer=0xffff96709fc61980
>
> == Guest:
>            <...>-12891 [000]125778006088605: sched_stat_runtime:   comm=trace-cmd pid=12891 runtime=84358 [ns] vruntime=5110265291610 [ns]
>            <...>-12891 [000]125778006095698: sched_switch:         prev_comm=trace-cmd prev_pid=12891 prev_prio=120 prev_state=S ==> next_comm=trace-cmd next_pid=12893 next_prio=120
>            <...>-12893 [000]125778006320003: sched_stat_runtime:   comm=trace-cmd pid=12893 runtime=58448 [ns] vruntime=5110268246764 [ns]
>            <...>-12893 [000]125778006322729: sched_switch:         prev_comm=trace-cmd prev_pid=12893 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>           <idle>-0     [000]125778021774505: hrtimer_cancel:       hrtimer=0xffff99d67bc21980
>           <idle>-0     [000]125778021776481: hrtimer_expire_entry: hrtimer=0xffff99d67bc21980 function=tick_sched_timer now=31370890644808
>           <idle>-0     [000]125778021782895: hrtimer_expire_exit:  hrtimer=0xffff99d67bc21980
>
> So, what am I missing? :-)
>
> And, of course, thanks a lot again!
> --
> Dario Faggioli, Ph.D
> http://about.me/dario.faggioli
> Virtualization Software Engineer
> SUSE Labs, SUSE https://www.suse.com/
> -------------------------------------------------------------------
> <<This happens because _I_ choose it to happen!>> (Raistlin Majere)



-- 
Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center



[Index of Archives]     [Linux USB Development]     [Linux USB Development]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux