On 21/11/2023 10:33, Suzuki K Poulose wrote:
> On 20/11/2023 21:31, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@xxxxxxx> wrote:
>>>
>>> Add documentation for the new Perf event open parameters and
>>> the threshold_max capability file.
>>>
>>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>>> ---
>>>  Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>>  1 file changed, 56 insertions(+)
>>>
>>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>>> index 1f87b57c2332..36b8111a710d 100644
>>> --- a/Documentation/arch/arm64/perf.rst
>>> +++ b/Documentation/arch/arm64/perf.rst
>>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>>
>>>  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>>  .. _tools/lib/perf/tests/test-evsel.c:
>>>
>>>  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>>> +
>>> +Event Counting Threshold
>>> +==========================================
>>> +
>>> +Overview
>>> +--------
>>> +
>>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>>> +events whose count meets a specified threshold condition. For example if
>>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>>> +threshold is set to 2, then the PMU counter will now only increment
>>> +when an event would have previously incremented the PMU counter by 2 or
>>> +more on a single processor cycle.
>>> +
>>> +To increment by 1 after passing the threshold condition instead of the
>>> +number of events on that cycle, add the 'threshold_count' option to the
>>> +commandline.
>>> +
>>> +How-to
>>> +------
>>> +
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> +            -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>>
>> Can you please explain this a bit more?
>>
>> I guess the first event counts the stall_slot PMU event if it's
>> greater than or equal to 2. And as threshold_count is not set,
>> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
>
> That is correct. The behavior is described in the paragraph above.
> But I agree that it would be really helpful if we explained with the
> example above.
>

Yeah I can add a description of how the example behaves.

>>
>>> +
>>> +And the following comparison values are supported:
>>> +
>>> +.. code-block::
>>> +
>>> +  0: Not-equal
>>> +  1: Equals
>>> +  2: Greater-than-or-equal
>>> +  3: Less-than
>>
>> So the above values are for threshold_compare, right?
>> It'd be nice if it's more explicit.

Yep I agree, I can label this with threshold_compare.

>>
>> Similarly, it'd be helpful to have a description for the
>> threshold and threshold_count fields.
>
> Agreed.
>
> Suzuki
>

Yeah I'll add explicit descriptions for each field. Thanks for the review.

>
>>
>> Thanks,
>> Namhyung
>>
>>> +
>>> +The maximum supported threshold value can be read from the caps of each
>>> +PMU, for example:
>>> +
>>> +.. code-block:: sh
>>> +
>>> +  cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
>>> +
>>> +  0x000000ff
>>> +
>>> +If a value higher than this is given, then it will be silently clamped
>>> +to the maximum. The highest possible maximum is 4095, as the config
>>> +field for threshold is limited to 12 bits, and the Perf tool will refuse
>>> +to parse higher values.
>>> +
>>> +If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
>>> +0, and both threshold and threshold_compare will be silently ignored.
>>> +threshold_max will also read as 0 on aarch32 guests, even if the host
>>> +is running on hardware with the feature.
>>> --
>>> 2.34.1
>>>
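
For reference while the wording gets sorted out, here is a rough Python
model of the per-cycle behaviour as the patch text and the discussion
above describe it. It is only an illustrative sketch: the function and
constant names are made up for this example and are not kernel or perf
interfaces, and it assumes the semantics are exactly as stated in the
quoted documentation.

```python
# Hypothetical model of FEAT_PMUv3_TH thresholding, for illustration only.
# None of these names exist in the kernel or in the perf tool.

# threshold_compare values, as listed in the quoted documentation.
NOT_EQUAL, EQUALS, GREATER_EQUAL, LESS_THAN = 0, 1, 2, 3


def meets_threshold(events, threshold, compare):
    """Return True if this cycle's event count passes the comparison."""
    if compare == NOT_EQUAL:
        return events != threshold
    if compare == EQUALS:
        return events == threshold
    if compare == GREATER_EQUAL:
        return events >= threshold
    if compare == LESS_THAN:
        return events < threshold
    raise ValueError("unknown threshold_compare value: %r" % compare)


def counter_increment(events, threshold, compare, count=False):
    """Amount added to the counter for one cycle.

    Without threshold_count (count=False) a passing cycle adds the number
    of events seen on that cycle; with it, a passing cycle adds 1.
    """
    if not meets_threshold(events, threshold, compare):
        return 0
    return 1 if count else events


def clamp_threshold(requested, threshold_max):
    """Model the silent clamp to the PMU's threshold_max value."""
    return min(requested, threshold_max)


# stall_slot/threshold=2,threshold_compare=2/
print(counter_increment(3, 2, GREATER_EQUAL))            # 3 events on a cycle
print(counter_increment(1, 2, GREATER_EQUAL))            # below the threshold
# dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
print(counter_increment(4, 10, LESS_THAN, count=True))   # passing cycle
print(clamp_threshold(300, 0xff))                        # above threshold_max
```

With the two events from the perf stat example, the first call mirrors
Namhyung's "it counts 3 when it sees 3" case and the third mirrors
"dtlb_walk will count 1 if it sees an event less than 10".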