On Wed, May 08, 2019 at 05:51:49PM +0100, Sudeep Holla wrote:
> On Wed, May 08, 2019 at 05:35:51PM +0800, Hanjun Guo wrote:
> > +Cc Alexander.
> >
> > On 2019/5/8 1:58, Jeremy Linton wrote:
> > > Hi,
> > >
> > > On 5/4/19 6:06 AM, Hanjun Guo wrote:
> > >> Hi Jeremy, Mark,
> > >>
> > >> On 2019/5/4 7:24, Jeremy Linton wrote:
> > >>> This patch series enables the Arm Statistical Profiling
> > >>> Extension (SPE) on ACPI platforms.
> > >>>
> > >>> This is possible because ACPI 6.3 uses a previously
> > >>> reserved field in the MADT to store the SPE interrupt
> > >>> number, similarly to how the normal PMU is described.
> > >>> If a consistent valid interrupt exists across all the
> > >>> cores in the system, a platform device is registered.
> > >>> That then triggers the SPE module, which runs as normal.
> > >>>
> > >>> We also add the ability to parse the PPTT for IDENTICAL
> > >>> cores. We then use this to sanity check the single SPE
> > >>> device we create. This creates a bit of a problem with
> > >>> respect to the specification though. The specification
> > >>> says that its legal for multiple tree's to exist in the
> > >>> PPTT. We handle this fine, but what happens in the
> > >>> case of multiple tree's is that the lack of a common
> > >>> node with IDENTICAL set forces us to assume that there
> > >>> are multiple non-IDENTICAL cores in the machine.
> > >>
> > >> Adding this patch set on top of latest mainline kernel,
> > >> and tested on D06 which has the SPE feature, in boot message
> > >> shows it was probed successfully:
> > >>
> > >> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> > >>
> > >> but when I test it with spe events such as
> > >>
> > >> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> > >>
> > >> it fails with:
> > >> failed to mmap with 12 (Cannot allocate memory),
> > >>
> > >> Confirmed that patch [0] is merged and other perf events are working
> > >> fine.
> > >
> > > Its pretty easy to get into the weeds with this driver, does it work with examples like:
> > >
> > > https://lkml.org/lkml/2018/1/14/122
> >
> > No, not work at all.
> >
> > SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:
> >
> > 5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically
> >
>
> Indeed this patch breaks SPE. As mentioned in the patch, it uses high
> order allocations for AUX buffers and SPE PMU setup_aux explicitly
> fails with the warning "unexpected high-order page for auxbuf!" if
> it encounters one.
>
> I don't know the intention of that check in SPE. Will ?

Since SPE uses virtual addressing, we don't really care about the underlying
page layout so there's no need to use higher-order allocations. I suppose we
could theoretically map them at the pmd level in some cases, but ignoring them
should also be harmless and I suspect you can delete the check.

Does the patch below fix the problem?

Will

--->8

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..e120f933412a 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -855,16 +855,8 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
 	if (!pglist)
 		goto out_free_buf;
 
-	for (i = 0; i < nr_pages; ++i) {
-		struct page *page = virt_to_page(pages[i]);
-
-		if (PagePrivate(page)) {
-			pr_warn("unexpected high-order page for auxbuf!");
-			goto out_free_pglist;
-		}
-
+	for (i = 0; i < nr_pages; ++i)
 		pglist[i] = virt_to_page(pages[i]);
-	}
 
 	buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!buf->base)
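
For anyone following the thread without the driver source to hand, here is a
rough user-space model of the loop before and after the change. It is only a
sketch and not kernel code: struct mock_page, private_flag and the two
collect_pages_*() helpers are made-up names for illustration, and it assumes
(as the warning text and the bisected commit suggest) that the perf core tags
the first page of a high-order AUX chunk so that PagePrivate() is true for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the bit the old check tested. */
struct mock_page {
	bool private_flag;	/* models PagePrivate() on a high-order head page */
};

/* Old loop: give up on the whole buffer if any page is marked private. */
static int collect_pages_old(struct mock_page *pages, size_t nr_pages,
			     struct mock_page **pglist)
{
	assert(pages != NULL && pglist != NULL);

	for (size_t i = 0; i < nr_pages; ++i) {
		if (pages[i].private_flag)
			return -1;	/* models "goto out_free_pglist" */
		pglist[i] = &pages[i];
	}
	return 0;
}

/* Patched loop: record every page regardless of how it was allocated. */
static int collect_pages_new(struct mock_page *pages, size_t nr_pages,
			     struct mock_page **pglist)
{
	assert(pages != NULL && pglist != NULL);

	for (size_t i = 0; i < nr_pages; ++i)
		pglist[i] = &pages[i];
	return 0;
}
```

In this model a buffer whose first page carries the flag is rejected by
collect_pages_old() and accepted by collect_pages_new(), which mirrors the
reported mmap failure and the intended effect of the patch. Whether dropping
the check is safe in the real driver rests on the point above that the buffer
is only ever accessed through the vmap()ed virtual range.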