On Mon, Feb 14, 2022 at 01:09:05PM +0200, Adrian Hunter wrote:
> Currently, using Intel PT to trace a VM guest is limited to kernel space
> because decoding requires side band events such as MMAP and CONTEXT_SWITCH.
> While these events can be collected for the host, there is not a way to do
> that yet for a guest. One approach, would be to collect them inside the
> guest, but that would require being able to synchronize with host
> timestamps.
>
> The motivation for this patch is to provide a clock that can be used within
> a VM guest, and that correlates to a VM host clock. In the case of TSC, if
> the hypervisor leaves rdtsc alone, the TSC value will be subject only to
> the VMCS TSC Offset and Scaling. Adjusting for that would make it possible
> to inject events from a guest perf.data file, into a host perf.data file.
>
> Thus making possible the collection of VM guest side band for Intel PT
> decoding.
>
> There are other potential benefits of TSC as a perf event clock:
> 	- ability to work directly with TSC
> 	- ability to inject non-Intel-PT-related events from a guest
>
> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> ---
>  arch/x86/events/core.c            | 16 +++++++++
>  arch/x86/include/asm/perf_event.h |  3 ++
>  include/uapi/linux/perf_event.h   | 12 ++++++-
>  kernel/events/core.c              | 57 +++++++++++++++++++------------
>  4 files changed, 65 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index e686c5e0537b..51d5345de30a 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2728,6 +2728,17 @@ void arch_perf_update_userpage(struct perf_event *event,
>  		!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
>  	userpg->pmc_width = x86_pmu.cntval_bits;
>
> +	if (event->attr.use_clockid &&
> +	    event->attr.ns_clockid &&
> +	    event->attr.clockid == CLOCK_PERF_HW_CLOCK) {
> +		userpg->cap_user_time_zero = 1;
> +		userpg->time_mult = 1;
> +		userpg->time_shift = 0;
> +		userpg->time_offset = 0;
> +		userpg->time_zero = 0;
> +		return;
> +	}
> +
>  	if (!using_native_sched_clock() || !sched_clock_stable())
>  		return;

This looks the wrong way around. If TSC is found unstable, we should
never expose it.

And I'm not at all sure about the whole virt thing. Last time I looked
at pvclock it made no sense at all.
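
To make the ordering concern concrete, here is a rough standalone sketch, not the actual kernel change. The names struct userpage_time, update_userpage_time(), cyc_to_time(), tsc_usable and want_hw_clock are hypothetical stand-ins: tsc_usable models the using_native_sched_clock() && sched_clock_stable() check, and want_hw_clock models the three attr tests from the hunk. The conversion helper follows the cap_user_time_zero formula documented in include/uapi/linux/perf_event.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the time fields of struct perf_event_mmap_page. */
struct userpage_time {
	bool     cap_user_time_zero;
	uint32_t time_mult;
	uint16_t time_shift;
	uint64_t time_offset;
	uint64_t time_zero;
};

/*
 * Sketch of the reordered logic: bail out first when the TSC is not
 * usable, and only then publish identity parameters for the HW clock.
 */
static void update_userpage_time(struct userpage_time *pg,
				 bool tsc_usable, bool want_hw_clock)
{
	/* Never expose TSC-based time if the TSC was found unstable. */
	if (!tsc_usable)
		return;

	if (want_hw_clock) {
		pg->cap_user_time_zero = true;
		pg->time_mult = 1;
		pg->time_shift = 0;
		pg->time_offset = 0;
		pg->time_zero = 0;
		return;
	}

	/* The regular sched_clock based parameters are left out of this sketch. */
}

/* Cycles to timestamp, as documented for cap_user_time_zero. */
static uint64_t cyc_to_time(const struct userpage_time *pg, uint64_t cyc)
{
	uint64_t quot = cyc >> pg->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << pg->time_shift) - 1);

	return pg->time_zero + quot * pg->time_mult +
	       ((rem * pg->time_mult) >> pg->time_shift);
}
```

With time_mult = 1, time_shift = 0 and time_zero = 0 the conversion is the identity, so userspace would see raw TSC values, which is only meaningful when the stability check has passed first.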