> -----Original Message----- > From: Will Deacon <will@xxxxxxxxxx> > Sent: Monday, October 14, 2024 8:29 AM > To: Besar Wicaksono <bwicaksono@xxxxxxxxxx> > Cc: suzuki.poulose@xxxxxxx; robin.murphy@xxxxxxx; > catalin.marinas@xxxxxxx; mark.rutland@xxxxxxx; linux-arm- > kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux- > tegra@xxxxxxxxxxxxxxx; Thierry Reding <treding@xxxxxxxxxx>; Jon Hunter > <jonathanh@xxxxxxxxxx>; Vikram Sethi <vsethi@xxxxxxxxxx>; Rich Wiley > <rwiley@xxxxxxxxxx>; Bob Knight <rknight@xxxxxxxxxx> > Subject: Re: [PATCH 3/3] perf: arm_cspmu: nvidia: enable NVLINK-C2C port > filtering > > External email: Use caution opening links or attachments > > > On Wed, Sep 18, 2024 at 09:58:46PM +0000, Besar Wicaksono wrote: > > Enable NVLINK-C2C port filtering to distinguish traffic from > > different GPUs connected to NVLINK-C2C. > > > > Signed-off-by: Besar Wicaksono <bwicaksono@xxxxxxxxxx> > > --- > > Documentation/admin-guide/perf/nvidia-pmu.rst | 32 > +++++++++++++++++++ > > drivers/perf/arm_cspmu/nvidia_cspmu.c | 7 ++-- > > 2 files changed, 36 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/admin-guide/perf/nvidia-pmu.rst > b/Documentation/admin-guide/perf/nvidia-pmu.rst > > index 2e0d47cfe7ea..6d1d3206b4ad 100644 > > --- a/Documentation/admin-guide/perf/nvidia-pmu.rst > > +++ b/Documentation/admin-guide/perf/nvidia-pmu.rst > > @@ -86,6 +86,22 @@ Example usage: > > > > perf stat -a -e nvidia_nvlink_c2c0_pmu_3/event=0x0/ > > > > +The NVLink-C2C has two ports that can be connected to one GPU > (occupying both > > +ports) or to two GPUs (one GPU per port). The user can use "port" bitmap > > +parameter to select the port(s) to monitor. Each bit represents the port > number, > > +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. > The > > +PMU will monitor both ports by default if not specified. > > + > > +Example for port filtering: > > + > > +* Count event id 0x0 from the GPU connected with socket 0 on port 0:: > > + > > + perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x1/ > > + > > +* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and > port 1:: > > + > > + perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x3/ > > + > > NVLink-C2C1 PMU > > ------------------- > > > > @@ -116,6 +132,22 @@ Example usage: > > > > perf stat -a -e nvidia_nvlink_c2c1_pmu_3/event=0x0/ > > > > +The NVLink-C2C has two ports that can be connected to one GPU > (occupying both > > +ports) or to two GPUs (one GPU per port). The user can use "port" bitmap > > +parameter to select the port(s) to monitor. Each bit represents the port > number, > > +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. > The > > +PMU will monitor both ports by default if not specified. > > + > > +Example for port filtering: > > + > > +* Count event id 0x0 from the GPU connected with socket 0 on port 0:: > > + > > + perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x1/ > > + > > +* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and > port 1:: > > + > > + perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x3/ > > + > > CNVLink PMU > > --------------- > > > > diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c > b/drivers/perf/arm_cspmu/nvidia_cspmu.c > > index d1cd9975e71a..cd51177347e5 100644 > > --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c > > +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c > > @@ -149,6 +149,7 @@ static struct attribute *pcie_pmu_format_attrs[] = { > > > > static struct attribute *nvlink_c2c_pmu_format_attrs[] = { > > ARM_CSPMU_FORMAT_EVENT_ATTR, > > + ARM_CSPMU_FORMAT_ATTR(port, "config1:0-1"), > > NULL, > > }; > > > > @@ -193,7 +194,7 @@ static u32 nv_cspmu_event_filter(const struct > perf_event *event) > > const struct nv_cspmu_ctx *ctx = > > to_nv_cspmu_ctx(to_arm_cspmu(event->pmu)); > > > > - if (ctx->filter_mask == 0) > > + if (ctx->filter_mask == 0 || event->attr.config1 == 0) > > return ctx->filter_default_val; > > Isn't this a bit too broad? It looks like this filter function is used > beyond the C2C PMU (i.e. the PCIe PMU) and you're also checking the whole > of config1 rather than just the port field. > I think the other PMUs (PCIE and CNVLINK) that have similar filters will also benefit from this change, since a filter value of 0 on these PMUs are meaningless. Should I make the intention clearer by moving this particular change into a separate patch? Thanks, Besar