On Sun, 08 Oct 2023 15:48:17 +0100,
Tianyi Liu <i.pear@xxxxxxxxxxx> wrote:
>
> Hi there,
>
> This series of patches enables callchains for guests (used by perf kvm),
> which holds the top spot on the perf wiki TODO list [1]. This allows users
> to perform guest OS callchain or performance analysis from external
> using PMU events.
>
> The event processing flow is as follows (shown as backtrace):
>   #0 kvm_arch_vcpu_get_frame_pointer / kvm_arch_vcpu_read_virt (per arch)
>   #1 kvm_guest_get_frame_pointer / kvm_guest_read_virt
>      <callback function pointers in `struct perf_guest_info_callbacks`>
>   #2 perf_guest_get_frame_pointer / perf_guest_read_virt
>   #3 perf_callchain_guest
>   #4 get_perf_callchain
>   #5 perf_callchain
>
> Between #0 and #1 is the interface between KVM and the arch-specific
> impl, while between #1 and #2 is the interface between Perf and KVM.
> The 1st patch implements #0. The 2nd patch extends interfaces between #1
> and #2, while the 3rd patch implements #1. The 4th patch implements #3
> and modifies #4 #5. The last patch is for userspace utils.
>
> Since arm64 hasn't provided some foundational infrastructure (interface
> for reading from a virtual address of guest), the arm64 implementation
> is stubbed for now because it's a bit complex, and will be implemented
> later.

I hope you realise that such an "interface" would be, by definition,
fragile and very likely to break in a subtle way. The only existing
case where we walk the guest's page tables is for NV, and even that is
extremely fragile.

Given that, I really wonder why this needs to happen in the kernel.
Userspace has all the required information to interrupt a vcpu and
walk its current context, without any additional kernel support. What
are the bits here that cannot be implemented anywhere else?

>
> Tested with both 32-bit and 64-bit guest operating systems / unikernels,
> that `perf script` could correctly show the certain callchains.
> FlameGraphs can also be generated with this series of patches and [2].
>
> Any feedback will be greatly appreciated.
>
> [1] https://perf.wiki.kernel.org/index.php/Todo
> [2] https://github.com/brendangregg/FlameGraph
>
> v1:
> https://lore.kernel.org/kvm/SYYP282MB108686A73C0F896D90D246569DE5A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Changes since v1:
> - v1 only includes partial KVM modifications, while v2 is a complete
>   implementation. Also updated based on Sean's feedback.
>
> Tianyi Liu (5):
>   KVM: Add arch specific interfaces for sampling guest callchains
>   perf kvm: Introduce guest interfaces for sampling callchains
>   KVM: implement new perf interfaces
>   perf kvm: Support sampling guest callchains
>   perf tools: Support PERF_CONTEXT_GUEST_* flags
>
>  arch/arm64/kvm/arm.c | 17 +++++++++

Given that there is more to KVM than just arm64 and x86, I suggest
that you move the lack of support for this feature into the main KVM
code.

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.