Patch "ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()" has been added to the 6.6-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Sun, 17 Dec 2023 19:55:49 -0500

This is a note to let you know that I've just added the patch titled

    ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     ring-buffer-remove-useless-update-to-write_stamp-in-.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit a1b7eb14314efcd9412bbd106e90137bef4b2584
Author: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
Date:   Fri Dec 15 08:18:10 2023 -0500

    ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()
    
    [ Upstream commit 083e9f65bd215582bf8f6a920db729fadf16704f ]
    
    When filtering is enabled, a temporary buffer is created to place the
    content of the trace event output so that the filter logic can decide
    from the trace event output if the trace event should be filtered out or
    not. If it is to be filtered out, the content in the temporary buffer is
    simply discarded, otherwise it is written into the trace buffer.
    
    But if an interrupt comes in while a previous event is using that
    temporary buffer, the event written by the interrupt goes directly
    into the ring buffer itself to avoid corrupting the data in the
    temporary buffer. If that event is to be filtered out, it is discarded
    from the ring buffer, or, if the discard fails because another event
    has already come in, it is turned into padding.
    
    The update to the write_stamp in rb_try_to_discard() happens after a
    fix was made to force the next event after the discard to use an
    absolute timestamp. That fix sets the before_stamp to zero so that it
    does not match the write_stamp, and a mismatch causes an event to use
    the absolute timestamp.
    
    rb_try_to_discard() still makes an effort to put the write_stamp back
    to what it was before the event was added. This is useless and
    wasteful, because nothing is going to use that write_stamp for
    calculations as it still will not match the before_stamp.
    
    Remove this useless update, and in doing so, we remove another
    cmpxchg64()!
    
    Also update the comments to reflect this change as well as remove some
    extra white space in another comment.
    
    Link: https://lore.kernel.org/linux-trace-kernel/20231215081810.1f4f38fe@xxxxxxxxxxxxxxxxxxxx
    
    Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
    Cc: Mark Rutland <mark.rutland@xxxxxxx>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
    Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
    Cc: Vincent Donnefort <vdonnefort@xxxxxxxxxx>
    Fixes: b2dd797543cf ("ring-buffer: Force absolute timestamp on discard of event")
    Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
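
The reasoning in the commit message can be illustrated with a small
user-space model. This is only a sketch and is not kernel code: the
struct and the toy_* helpers are hypothetical names invented for this
illustration, and the model assumes timestamps are non-zero and that a
writer may compute a delta only when before_stamp equals write_stamp.
Under those assumptions, clearing before_stamp alone already forces the
next event onto the absolute-timestamp path, so rolling write_stamp back
changes nothing the next writer can observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the two stamps; NOT the kernel's struct ring_buffer_per_cpu. */
struct toy_stamps {
	uint64_t before_stamp;
	uint64_t write_stamp;
};

/*
 * Hypothetical helper: in this model a writer may use a delta only when
 * both stamps agree. A mismatch means "use an absolute timestamp".
 */
bool toy_can_use_delta(const struct toy_stamps *s)
{
	return s->before_stamp == s->write_stamp;
}

/* Models the discard path after the patch: only before_stamp is cleared. */
void toy_discard(struct toy_stamps *s)
{
	s->before_stamp = 0;
}

/*
 * Models the removed step: also roll write_stamp back by the discarded
 * event's delta. The assert encodes the model's assumption that the
 * rolled-back stamp stays non-zero.
 */
void toy_discard_with_rollback(struct toy_stamps *s, uint64_t delta)
{
	assert(s->write_stamp > delta);
	s->before_stamp = 0;
	s->write_stamp -= delta;
}
```

In this model, toy_can_use_delta() returns false after either discard
variant, which is the commit's point: the rollback costs an extra atomic
operation in the real code but does not affect how the next event picks
its timestamp.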

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index f1ef4329343bf..27066232bc246 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2981,25 +2981,6 @@ static unsigned rb_calculate_event_length(unsigned length)
 	return length;
 }
 
-static u64 rb_time_delta(struct ring_buffer_event *event)
-{
-	switch (event->type_len) {
-	case RINGBUF_TYPE_PADDING:
-		return 0;
-
-	case RINGBUF_TYPE_TIME_EXTEND:
-		return rb_event_time_stamp(event);
-
-	case RINGBUF_TYPE_TIME_STAMP:
-		return 0;
-
-	case RINGBUF_TYPE_DATA:
-		return event->time_delta;
-	default:
-		return 0;
-	}
-}
-
 static inline bool
 rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
 		  struct ring_buffer_event *event)
@@ -3007,8 +2988,6 @@ rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
 	unsigned long new_index, old_index;
 	struct buffer_page *bpage;
 	unsigned long addr;
-	u64 write_stamp;
-	u64 delta;
 
 	new_index = rb_event_index(event);
 	old_index = new_index + rb_event_ts_length(event);
@@ -3017,14 +2996,10 @@ rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
 
 	bpage = READ_ONCE(cpu_buffer->tail_page);
 
-	delta = rb_time_delta(event);
-
-	if (!rb_time_read(&cpu_buffer->write_stamp, &write_stamp))
-		return false;
-
-	/* Make sure the write stamp is read before testing the location */
-	barrier();
-
+	/*
+	 * Make sure the tail_page is still the same and
+	 * the next write location is the end of this event
+	 */
 	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
 		unsigned long write_mask =
 			local_read(&bpage->write) & ~RB_WRITE_MASK;
@@ -3035,20 +3010,20 @@ rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
 		 * to make sure that the next event adds an absolute
 		 * value and does not rely on the saved write stamp, which
 		 * is now going to be bogus.
+		 *
+		 * By setting the before_stamp to zero, the next event
+		 * is not going to use the write_stamp and will instead
+		 * create an absolute timestamp. This means there's no
+		 * reason to update the write_stamp!
 		 */
 		rb_time_set(&cpu_buffer->before_stamp, 0);
 
-		/* Something came in, can't discard */
-		if (!rb_time_cmpxchg(&cpu_buffer->write_stamp,
-				       write_stamp, write_stamp - delta))
-			return false;
-
 		/*
 		 * If an event were to come in now, it would see that the
 		 * write_stamp and the before_stamp are different, and assume
 		 * that this event just added itself before updating
 		 * the write stamp. The interrupting event will fix the
-		 * write stamp for us, and use the before stamp as its delta.
+		 * write stamp for us, and use an absolute timestamp.
 		 */
 
 		/*
@@ -3485,7 +3460,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
 		return;
 
 	/*
-	 * If this interrupted another event, 
+	 * If this interrupted another event,
 	 */
 	if (atomic_inc_return(this_cpu_ptr(&checking)) != 1)
 		goto out;