Patch "perf: Add sample_flags to indicate the PMU-filled sample data" has been added to the 4.14-stable tree

This is a note to let you know that I've just added the patch titled

    perf: Add sample_flags to indicate the PMU-filled sample data

to the 4.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     perf-add-sample_flags-to-indicate-the-pmu-filled-sam.patch
and it can be found in the queue-4.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 6b1ee31694c1ae2e6ce12e4439bc20b6748e95aa
Author: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
Date:   Thu Sep 1 06:09:54 2022 -0700

    perf: Add sample_flags to indicate the PMU-filled sample data
    
    [ Upstream commit 3aac580d5cc3001ca1627725b3b61edb529f341d ]
    
    On some platforms, some data, e.g. timestamps, can be retrieved from
    the PMU driver. Data from the PMU driver is usually more accurate, so
    the perf kernel code should output the PMU-filled sample data when it
    is available.
    
    To check whether PMU-filled sample data is available, the perf kernel
    code currently initializes the related fields in
    perf_sample_data_init(). When outputting a sample, perf checks whether
    the field was updated by the PMU driver. If so, the updated value is
    output. If not, perf calculates the value in software, or just outputs
    the initialized value if no software method is available.
    
    As more and more data is provided by the PMU driver, more fields
    have to be initialized in perf_sample_data_init(). That increases
    the number of cache lines touched in perf_sample_data_init() and
    hurts performance.
    
    Add a new "sample_flags" field to indicate the PMU-filled sample data.
    The PMU driver should set the corresponding PERF_SAMPLE_ flag when a
    field is updated. Initializing the corresponding field is then no
    longer required. The following patches will make use of it and remove
    the corresponding fields from perf_sample_data_init(), which will
    further reduce the number of cache lines touched.
    
    In perf_prepare_sample(), clear the sample_type bits for fields that
    the PMU driver has already filled, and do so only for
    PERF_RECORD_SAMPLE. For the other PERF_RECORD_ event types, the sample
    data is not available.
    
    Suggested-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220901130959.1285717-2-kan.liang@xxxxxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 41a3307a971c..5efd8109ad0a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -899,6 +899,7 @@ struct perf_sample_data {
 	 * Fields set by perf_sample_data_init(), group so as to
 	 * minimize the cachelines touched.
 	 */
+	u64				sample_flags;
 	u64				addr;
 	struct perf_raw_record		*raw;
 	struct perf_branch_stack	*br_stack;
@@ -950,6 +951,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 					 u64 addr, u64 period)
 {
 	/* remaining struct members initialized in perf_prepare_sample() */
+	data->sample_flags = 0;
 	data->addr = addr;
 	data->raw  = NULL;
 	data->br_stack = NULL;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2ad8acff03db..7ad142a5327e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5767,11 +5767,10 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
 
 static void __perf_event_header__init_id(struct perf_event_header *header,
 					 struct perf_sample_data *data,
-					 struct perf_event *event)
+					 struct perf_event *event,
+					 u64 sample_type)
 {
-	u64 sample_type = event->attr.sample_type;
-
-	data->type = sample_type;
+	data->type = event->attr.sample_type;
 	header->size += event->id_header_size;
 
 	if (sample_type & PERF_SAMPLE_TID) {
@@ -5800,7 +5799,7 @@ void perf_event_header__init_id(struct perf_event_header *header,
 				struct perf_event *event)
 {
 	if (event->attr.sample_id_all)
-		__perf_event_header__init_id(header, data, event);
+		__perf_event_header__init_id(header, data, event, event->attr.sample_type);
 }
 
 static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -6148,6 +6147,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 			 struct pt_regs *regs)
 {
 	u64 sample_type = event->attr.sample_type;
+	u64 filtered_sample_type;
 
 	header->type = PERF_RECORD_SAMPLE;
 	header->size = sizeof(*header) + event->header_size;
@@ -6155,7 +6155,12 @@ void perf_prepare_sample(struct perf_event_header *header,
 	header->misc = 0;
 	header->misc |= perf_misc_flags(regs);
 
-	__perf_event_header__init_id(header, data, event);
+	/*
+	 * Clear the sample flags that have already been done by the
+	 * PMU driver.
+	 */
+	filtered_sample_type = sample_type & ~data->sample_flags;
+	__perf_event_header__init_id(header, data, event, filtered_sample_type);
 
 	if (sample_type & PERF_SAMPLE_IP)
 		data->ip = perf_instruction_pointer(regs);
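
For readers who want to see the filtering step in isolation, here is a
minimal userspace sketch of the idea. It is not kernel code: every demo_
and DEMO_ name is hypothetical and exists only for illustration, and this
patch itself does not yet make any PMU driver set a flag. The sketch only
shows how "sample_type & ~sample_flags" lets the generic code skip a field
that a driver has already filled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PERF_SAMPLE_* bits; illustration only. */
#define DEMO_SAMPLE_TID   (1ULL << 1)
#define DEMO_SAMPLE_TIME  (1ULL << 2)

/* Hypothetical, heavily reduced model of struct perf_sample_data. */
struct demo_sample_data {
	uint64_t sample_flags;	/* bits for fields already filled by the "PMU" */
	uint64_t time;		/* deliberately left uninitialized by init */
};

/* Only the flags word is reset; the fields it guards are left alone. */
static void demo_sample_data_init(struct demo_sample_data *data)
{
	data->sample_flags = 0;
}

/* A "PMU driver" fills a field and records that it did so. */
static void demo_pmu_fill_time(struct demo_sample_data *data, uint64_t hw_time)
{
	data->time = hw_time;
	data->sample_flags |= DEMO_SAMPLE_TIME;
}

/*
 * Generic code: mask out what the driver already provided, then fill
 * the remaining requested fields in software. Returns the filtered mask.
 */
static uint64_t demo_prepare_sample(struct demo_sample_data *data,
				    uint64_t sample_type, uint64_t sw_time)
{
	uint64_t filtered = sample_type & ~data->sample_flags;

	if (filtered & DEMO_SAMPLE_TIME)
		data->time = sw_time;	/* software fallback */

	return filtered;
}
```

With both DEMO_SAMPLE_TID and DEMO_SAMPLE_TIME requested and the time
already filled by demo_pmu_fill_time(), the filtered mask keeps only
DEMO_SAMPLE_TID, so the driver-provided time is not overwritten.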


