[PATCH v2 0/5] perf: KVM: Enable callchains for guests

Marc Zyngier maz at kernel.org
Wed Oct 11 09:45:29 PDT 2023


On Sun, 08 Oct 2023 15:48:17 +0100,
Tianyi Liu <i.pear at outlook.com> wrote:
> 
> Hi there,
> 
> This series of patches enables callchains for guests (used by perf kvm),
> which holds the top spot on the perf wiki TODO list [1]. This allows users
> to perform guest OS callchain or performance analysis from outside the
> guest using PMU events.
> 
> The event processing flow is as follows (shown as backtrace):
>   #0 kvm_arch_vcpu_get_frame_pointer / kvm_arch_vcpu_read_virt (per arch)
>   #1 kvm_guest_get_frame_pointer / kvm_guest_read_virt
>      <callback function pointers in `struct perf_guest_info_callbacks`>
>   #2 perf_guest_get_frame_pointer / perf_guest_read_virt
>   #3 perf_callchain_guest
>   #4 get_perf_callchain
>   #5 perf_callchain
> 
> Between #0 and #1 is the interface between KVM and the arch-specific
> impl, while between #1 and #2 is the interface between Perf and KVM.
> The 1st patch implements #0. The 2nd patch extends interfaces between #1
> and #2, while the 3rd patch implements #1. The 4th patch implements #3
> and modifies #4 #5. The last patch is for userspace utils.
> 
> Since arm64 hasn't provided some foundational infrastructure (interface
> for reading from a virtual address of guest), the arm64 implementation
> is stubbed for now because it's a bit complex, and will be implemented
> later.

I hope you realise that such an "interface" would be, by definition,
fragile and very likely to break in a subtle way. The only existing
case where we walk the guest's page tables is for NV, and even that is
extremely fragile.

Given that, I really wonder why this needs to happen in the kernel.
Userspace has all the required information to interrupt a vcpu and
walk its current context, without any additional kernel support. What
are the bits here that cannot be implemented anywhere else?
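
As a rough illustration of the userspace approach, the following is a minimal
sketch of a frame-pointer walk that needs nothing from the kernel beyond the
stopped vcpu's PC and frame pointer. It assumes the guest code was built with
frame pointers and uses the usual {saved FP, return address} frame record
(the layout on both x86-64 and arm64). All names here (read_guest_fn,
walk_frame_pointers, fake_guest_read) are hypothetical and not part of the
series or of any existing API; translating guest virtual addresses is left
to whatever backs the read callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: copy len bytes from guest virtual address gva
 * into buf. Returns 0 on success, non-zero on failure. A VMM would back
 * this with its own guest memory access and address translation. */
typedef int (*read_guest_fn)(void *ctx, uint64_t gva, void *buf, size_t len);

/* Frame record as laid out with frame pointers enabled:
 * [fp] = caller's frame pointer, [fp + 8] = return address. */
struct frame_record {
	uint64_t next_fp;
	uint64_t ret_addr;
};

/* Walk the frame-pointer chain starting from (pc, fp) and store up to
 * max entries in out. Returns the number of entries written. */
static size_t walk_frame_pointers(read_guest_fn read_guest, void *ctx,
				  uint64_t pc, uint64_t fp,
				  uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(read_guest != NULL && out != NULL);
	if (max == 0)
		return 0;

	out[n++] = pc;
	while (n < max && fp != 0 && (fp & 7) == 0) {
		struct frame_record rec;

		/* Guest memory is untrusted: stop on any failed read. */
		if (read_guest(ctx, fp, &rec, sizeof(rec)) != 0)
			break;
		out[n++] = rec.ret_addr;
		/* The stack grows down, so each caller frame must sit at a
		 * higher address; this also prevents looping forever. */
		if (rec.next_fp <= fp)
			break;
		fp = rec.next_fp;
	}
	return n;
}

/* Stand-in for guest memory, for experimentation only: a flat buffer
 * that appears at guest virtual address base. */
struct fake_guest {
	uint64_t base;
	const uint8_t *mem;
	size_t size;
};

static int fake_guest_read(void *ctx, uint64_t gva, void *buf, size_t len)
{
	const struct fake_guest *g = ctx;

	if (gva < g->base || len > g->size || gva - g->base > g->size - len)
		return -1;
	memcpy(buf, g->mem + (gva - g->base), len);
	return 0;
}
```

The walk stops on a failed read, a misaligned or zero frame pointer, or a
frame pointer that does not move up the stack, since nothing the guest
provides can be trusted.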

> 
> Tested with both 32-bit and 64-bit guest operating systems / unikernels;
> `perf script` correctly shows the callchains.
> FlameGraphs can also be generated with this series of patches and [2].
> 
> Any feedback will be greatly appreciated.
> 
> [1] https://perf.wiki.kernel.org/index.php/Todo
> [2] https://github.com/brendangregg/FlameGraph
> 
> v1:
> https://lore.kernel.org/kvm/SYYP282MB108686A73C0F896D90D246569DE5A@SYYP282MB1086.AUSP282.PROD.OUTLOOK.COM/
> 
> Changes since v1:
> - v1 only includes partial KVM modifications, while v2 is a complete
> implementation. Also updated based on Sean's feedback.
> 
> Tianyi Liu (5):
>   KVM: Add arch specific interfaces for sampling guest callchains
>   perf kvm: Introduce guest interfaces for sampling callchains
>   KVM: implement new perf interfaces
>   perf kvm: Support sampling guest callchains
>   perf tools: Support PERF_CONTEXT_GUEST_* flags
> 
>  arch/arm64/kvm/arm.c                | 17 +++++++++

Given that there is more to KVM than just arm64 and x86, I suggest
that you move the lack of support for this feature into the main KVM
code.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


