On Mon, Oct 30, 2023 at 9:02 AM Heiko Carstens <hca@xxxxxxxxxxxxx> wrote: > > On Fri, Oct 27, 2023 at 11:22:42AM -0400, Neal Gompa wrote: > > On Fri, Oct 27, 2023 at 6:10 AM Heiko Carstens <hca@xxxxxxxxxxxxx> wrote: > > > > > > On Fri, Oct 27, 2023 at 10:00:53AM +0200, Daan De Meyer wrote: > > > > > > > > If the kernel gets support for s390x user space unwinding using the backchain, > > > > we'll propose to enable -mbackchain in the default compilation flags for Fedora > > > > so that s390x on Fedora will have the same profiling experience as x86-64, arm64 > > > > and ppc64. For now we'll keep the status quo since compiling with the backchain > > > > doesn't provide any benefit until the kernel unwinder can unwind user > > > > space stacks > > > > using it. > > > > > > > > Thanks for clarifying the current state of user space stack unwinding on s390x! > > > > > > We will implement the missing pieces and let you know when things are > > > supposed to work. > > > > Do you think we could have an initial patch set for implementing the > > missing pieces in time for the Linux 6.8 merge window? Then we can > > look at enabling this for s390x as a Fedora Linux 40 Change. > > This will be very likely the case. Actually the plan is to go with the > patch below. I gave it some testing with Fedora 38 and replaced (only) > glibc with a variant that was compiled with -mbackchain. > > It seems to work, including the not avoidable problems described below. So > I guess we will try to bring this upstream with the current merge window. > > From 3ba1e255c8af61f172a8b31de2b76d39c7d4c0c7 Mon Sep 17 00:00:00 2001 > From: Heiko Carstens <hca@xxxxxxxxxxxxx> > Date: Mon, 30 Oct 2023 11:34:07 +0100 > Subject: [PATCH] s390/perf: implement perf_callchain_user() > > Daan De Meyer and Neal Gompa reported that s390 does not support perf user > stack unwinding. > > This was never implemented since this requires user space to be compiled > with the -mbackchain compile option, which until now no distribution > did. However this is going to change with Fedora. Therefore provide a > perf_callchain_user() implementation. > > Note that due to the way s390 sets up stack frames the provided call chains > can contain invalid values. This is especially true for the first stack > frame, where it is not possible to tell if the return address has been > written to the stack already or not. > > Reported-by: Daan De Meyer <daan.j.demeyer@xxxxxxxxx> > Reported-by: Neal Gompa <ngompa@xxxxxxxxxxxxxxxxx> > Closes: https://lore.kernel.org/all/CAO8sHcn3+_qrnvp0580aK7jN0Wion5F7KYeBAa4MnCY4mqABPA@xxxxxxxxxxxxxx/ > Signed-off-by: Heiko Carstens <hca@xxxxxxxxxxxxx> > --- > arch/s390/kernel/perf_event.c | 47 +++++++++++++++++++++++++++++++++++ > 1 file changed, 47 insertions(+) > > diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c > index c27321cb0969..fdbe7e1d9eea 100644 > --- a/arch/s390/kernel/perf_event.c > +++ b/arch/s390/kernel/perf_event.c > @@ -212,6 +212,53 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, > } > } > > +struct stack_frame_user { > + unsigned long back_chain; > + unsigned long empty1[5]; > + unsigned long gprs[10]; > + unsigned long empty2[4]; > +}; > + > +void perf_callchain_user(struct perf_callchain_entry_ctx *entry, > + struct pt_regs *regs) > +{ > + struct stack_frame_user __user *sf; > + unsigned long ip, sp; > + bool first = true; > + > + if (perf_guest_state()) > + return; > + if (is_compat_task()) > + return; > + perf_callchain_store(entry, instruction_pointer(regs)); > + sf = (void *)user_stack_pointer(regs); > + pagefault_disable(); > + while (entry->nr < entry->max_stack) { > + if (__get_user(sp, &sf->back_chain)) > + break; > + if (__get_user(ip, &sf->gprs[8])) > + break; > + if (ip & 0x1) { > + /* > + * If the instruction address is invalid, and this > + * is the first stack frame, assume r14 has not > + * been written to the stack yet. Otherwise exit. > + */ > + if (first && !(regs->gprs[14] & 0x1)) > + ip = regs->gprs[14]; > + else > + break; > + } > + perf_callchain_store(entry, ip); > + /* Sanity check: ABI requires SP to be aligned 8 bytes. */ > + if (!sp || sp & 0x7) > + break; > + sf = (void __user *)sp; > + first = false; > + } > + pagefault_enable(); > +} > + > /* Perf definitions for PMU event attributes in sysfs */ > ssize_t cpumf_events_sysfs_show(struct device *dev, > struct device_attribute *attr, char *page) > -- > 2.39.2 > This patch LGTM. I'd love to see it land in Linux 6.7! Reviewed-by: Neal Gompa <ngompa@xxxxxxxxxxxxxxxxx> -- Neal Gompa (FAS: ngompa)