> On Mon, Jan 29, 2024 at 05:51:38PM +0800, Ji Sheng Teoh wrote:
> > This patch adds support for StarFive's StarLink PMU (Performance
> > Monitor Unit). StarLink PMU integrates one or more CPU cores with a
> > shared L3 memory system. The PMU supports overflow interrupt, up to 16
> > programmable 64bit event counters, and an independent 64bit cycle
> > counter. StarLink PMU is accessed via MMIO.
>
> Since Palmer acked this (thanks!), I queued it locally but then ran
> into a few small issues with my build testing. Comments below.
>
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 273d67ecf6d2..41278742ef88 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -86,6 +86,15 @@ config RISCV_PMU_SBI
> >  	  full perf feature support i.e. counter overflow, privilege mode
> >  	  filtering, counter configuration.
> >
> > +config STARFIVE_STARLINK_PMU
> > +	depends on ARCH_STARFIVE
>
> Please can you add "|| COMPILE_TEST" to this dependency so that you get
> build coverage from other architectures?
>

Sure, will add it in the next revision.

> > +	bool "StarFive StarLink PMU"
> > +	help
> > +	  Provide support for StarLink Performance Monitor Unit.
> > +	  StarLink Performance Monitor Unit integrates one or more cores with
> > +	  an L3 memory system. The L3 cache events are added into perf event
> > +	  subsystem, allowing monitoring of various L3 cache perf events.
> > +
> >  config ARM_PMU_ACPI
> >  	depends on ARM_PMU && ACPI
> >  	def_bool y
>
> [...]
>
> > diff --git a/drivers/perf/starfive_starlink_pmu.c b/drivers/perf/starfive_starlink_pmu.c
> > new file mode 100644
> > index 000000000000..2447ca09a471
> > --- /dev/null
> > +++ b/drivers/perf/starfive_starlink_pmu.c
> > @@ -0,0 +1,643 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * StarFive's StarLink PMU driver
> > + *
> > + * Copyright (C) 2023 StarFive Technology Co., Ltd.
> > + *
> > + * Author: Ji Sheng Teoh <jisheng.teoh@xxxxxxxxxxxxxxxx>
> > + *
> > + */
>
> [...]
>
> > +static void starlink_pmu_counter_start(struct perf_event *event,
> > +				       struct starlink_pmu *starlink_pmu) {
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	int idx = event->hw.idx;
> > +	u64 val;
> > +
> > +	/*
> > +	 * Enable counter overflow interrupt[63:0],
> > +	 * which is mapped as follow:
> > +	 *
> > +	 * event counter 0	- Bit [0]
> > +	 * event counter 1	- Bit [1]
> > +	 * ...
> > +	 * cycle counter	- Bit [63]
> > +	 */
> > +	val = readq(starlink_pmu->pmu_base + STARLINK_PMU_INTERRUPT_ENABLE);
> > +
> > +	if (hwc->config == STARLINK_CYCLES) {
> > +		/*
> > +		 * Cycle count has its dedicated register, and it starts
> > +		 * counting as soon as STARLINK_PMU_GLOBAL_ENABLE is set.
> > +		 */
> > +		val |= STARLINK_PMU_CYCLE_OVERFLOW_MASK;
> > +	} else {
> > +		writeq(event->hw.config, starlink_pmu->pmu_base +
> > +		       STARLINK_PMU_EVENT_SELECT + idx * sizeof(u64));
> > +
> > +		val |= (1 << idx);
> > +	}
>
> I think this needs to be a u64 on the right hand side, or just use the
> BIT_ULL() macro.
>

Ahh ok, will just append it with BIT_ULL() macro.

> > +
> > +	writeq(val, starlink_pmu->pmu_base + STARLINK_PMU_INTERRUPT_ENABLE);
> > +
> > +	writeq(STARLINK_PMU_GLOBAL_ENABLE, starlink_pmu->pmu_base +
> > +	       STARLINK_PMU_CONTROL);
> > +}
>
> [...]
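
For reference, here is a minimal userspace sketch of the shift-width point raised above. It is not the driver code: BIT_ULL() is redefined locally as a stand-in for the kernel macro, and enable_counter_irq() is a hypothetical helper that only mirrors the "val |= ..." step, assuming the 64-bit interrupt-enable layout described in the patch comment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's BIT_ULL() macro (assumption for
 * illustration only; the kernel ships its own definition).
 */
#define BIT_ULL(nr) (1ULL << (nr))

/*
 * Hypothetical helper: return the interrupt-enable mask with the bit for
 * counter idx set. The shift is performed on a 64-bit operand, so every
 * bit position from 0 to 63 is representable.
 *
 * A plain (1 << idx) shifts an int instead. Where int is 32 bits wide,
 * that is undefined behaviour in ISO C for idx >= 31 and can never reach
 * bits 32..63 of the register.
 */
uint64_t enable_counter_irq(uint64_t mask, unsigned int idx)
{
	assert(idx < 64);	/* the register is 64 bits wide */
	return mask | BIT_ULL(idx);
}
```

With this form, enable_counter_irq(0, 63) yields a mask with only the top bit set, which matches the "cycle counter - Bit [63]" mapping in the comment. The same widening would apply to the value written to the overflow status register.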
>
> > +static irqreturn_t starlink_pmu_handle_irq(int irq_num, void *data) {
> > +	struct starlink_pmu *starlink_pmu = data;
> > +	struct starlink_hw_events *hw_events =
> > +			this_cpu_ptr(starlink_pmu->hw_events);
> > +	bool handled = false;
> > +	int idx;
> > +	u64 overflow_status;
> > +
> > +	for (idx = 0; idx < STARLINK_PMU_MAX_COUNTERS; idx++) {
> > +		struct perf_event *event = hw_events->events[idx];
> > +
> > +		if (!event)
> > +			continue;
> > +
> > +		overflow_status = readq(starlink_pmu->pmu_base +
> > +					STARLINK_PMU_COUNTER_OVERFLOW_STATUS);
> > +		if (!(overflow_status & BIT(idx)))
> > +			continue;
> > +
> > +		writeq(1 << idx, starlink_pmu->pmu_base +
> > +		       STARLINK_PMU_COUNTER_OVERFLOW_STATUS);
>
> Same shifting problem here.
>

Got it.

> > +static int starlink_pmu_probe(struct platform_device *pdev) {
> > +	struct starlink_pmu *starlink_pmu;
> > +	struct starlink_hw_events *hw_events;
> > +	struct resource *res;
> > +	int cpuid, i, ret;
> > +
> > +	starlink_pmu = devm_kzalloc(&pdev->dev, sizeof(*starlink_pmu), GFP_KERNEL);
> > +	if (!starlink_pmu)
> > +		return -ENOMEM;
> > +
> > +	starlink_pmu->pmu_base =
> > +			devm_platform_get_and_ioremap_resource(pdev, 0, &res);
> > +	if (IS_ERR(starlink_pmu->pmu_base))
> > +		return PTR_ERR(starlink_pmu->pmu_base);
> > +
> > +	starlink_pmu->hw_events = alloc_percpu_gfp(struct starlink_hw_events,
> > +						   GFP_KERNEL);
> > +	if (!starlink_pmu->hw_events) {
> > +		dev_err(&pdev->dev, "Failed to allocate per-cpu PMU data\n");
> > +		kfree(starlink_pmu);
>
> You shouldn't call kfree() on a device-managed object (i.e. allocated
> with devm_kzalloc()).
>

You are right, I will drop it.

Thanks for the review Will.

JiSheng
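
For reference, a small userspace model of why the kfree() can simply be dropped. This is only a sketch of the ownership idea, not the kernel's devres implementation: struct model_device, model_managed_zalloc() and model_device_release() are hypothetical names invented for illustration. The point it tries to show is that memory handed out by a device-managed allocator stays on a list owned by the device and is released when the device goes away (or probe fails), so releasing it by hand as well would free it twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping node for one device-owned allocation. */
struct managed_node {
	struct managed_node *next;
	void *payload;
};

/* Hypothetical device that owns a list of allocations. */
struct model_device {
	struct managed_node *resources;
};

/* Allocate zeroed memory and record it on the device's resource list. */
void *model_managed_zalloc(struct model_device *dev, size_t size)
{
	struct managed_node *node;

	assert(dev != NULL);

	node = malloc(sizeof(*node));
	if (!node)
		return NULL;

	node->payload = calloc(1, size);
	if (!node->payload) {
		free(node);
		return NULL;
	}

	node->next = dev->resources;
	dev->resources = node;
	return node->payload;
}

/*
 * Release every recorded allocation, modelling what happens when probe
 * fails or the device is unbound. Returns the number of allocations
 * released. Callers never free the payloads themselves.
 */
int model_device_release(struct model_device *dev)
{
	int released = 0;

	assert(dev != NULL);

	while (dev->resources) {
		struct managed_node *node = dev->resources;

		dev->resources = node->next;
		free(node->payload);
		free(node);
		released++;
	}
	return released;
}
```

In this model an error path just returns its error code and leaves the cleanup to model_device_release(), which is the shape the probe function takes once the kfree() is removed.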