Re: [PATCH v3 1/1] tracing: fix bug in rb_per_cpu_empty() that might cause deadloop.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 21 Jul 2021 22:12:07 +0800 Haoran Luo <www@xxxxxxxxxxxxxx> wrote:
 > The "rb_per_cpu_empty()" misinterpret the condition (as not-empty) when 
 > "head_page" and "commit_page" of "struct ring_buffer_per_cpu" points to 
 > the same buffer page, whose "buffer_data_page" is empty and "read" field 
 > is non-zero. 
 >  
 > An error scenario could be constructed as followed (kernel perspective): 
 >  
 > 1. All pages in the buffer has been accessed by reader(s) so that all of 
 > them will have non-zero "read" field. 
 >  
 > 2. Read and clear all buffer pages so that "rb_num_of_entries()" will 
 > return 0 rendering there's no more data to read. It is also required 
 > that the "read_page", "commit_page" and "tail_page" points to the same 
 > page, while "head_page" is the next page of them. 
 >  
 > 3. Invoke "ring_buffer_lock_reserve()" with large enough "length" 
 > so that it shot pass the end of current tail buffer page. Now the 
 > "head_page", "commit_page" and "tail_page" points to the same page. 
 >  
 > 4. Discard current event with "ring_buffer_discard_commit()", so that 
 > "head_page", "commit_page" and "tail_page" points to a page whose buffer 
 > data page is now empty. 
 >  
 > When the error scenario has been constructed, "tracing_read_pipe" will 
 > be trapped inside a deadloop: "trace_empty()" returns 0 since 
 > "rb_per_cpu_empty()" returns 0 when it hits the CPU containing such 
 > constructed ring buffer. Then "trace_find_next_entry_inc()" always 
 > return NULL since "rb_num_of_entries()" reports there's no more entry 
 > to read. Finally "trace_seq_to_user()" returns "-EBUSY" spanking 
 > "tracing_read_pipe" back to the start of the "waitagain" loop. 
 >  
 > I've also written a proof-of-concept script to construct the scenario 
 > and trigger the bug automatically, you can use it to trace and validate 
 > my reasoning above: 
 >  
 >  https://github.com/aegistudio/RingBufferDetonator.git 
 >  
 > Tests has been carried out on linux kernel 5.14-rc2 
 > (2734d6c1b1a089fb593ef6a23d4b70903526fe0c), my fixed version 
 > of kernel (for testing whether my update fixes the bug) and 
 > some older kernels (for range of affected kernels). Test result is 
 > also attached to the proof-of-concept repository. 
 >  
 > Signed-off-by: Haoran Luo <www@xxxxxxxxxxxxxx> 
 > --- 
 >  kernel/trace/ring_buffer.c | 28 ++++++++++++++++++++++++---- 
 >  1 file changed, 24 insertions(+), 4 deletions(-) 
 >  
 > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c 
 > index d1463eac11a3..e592d1df6f88 100644 
 > --- a/kernel/trace/ring_buffer.c 
 > +++ b/kernel/trace/ring_buffer.c 
 > @@ -3880,10 +3880,30 @@ static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer) 
 >      if (unlikely(!head)) 
 >          return true; 
 >  
 > -    return reader->read == rb_page_commit(reader) && 
 > -        (commit == reader || 
 > -         (commit == head && 
 > -          head->read == rb_page_commit(commit))); 
 > +    /* Reader should exhaust content in reader page */ 
 > +    if (reader->read != rb_page_commit(reader)) 
 > +        return false; 
 > + 
 > +    /* 
 > +     * If writers are committing on the reader page, knowing all 
 > +     * committed content has been read, the ring buffer is empty. 
 > +     */ 
 > +    if (commit == reader) 
 > +        return true; 
 > + 
 > +    /* 
 > +     * If writers are committing on a page other than reader page 
 > +     * and head page, there should always be content to read. 
 > +     */ 
 > +    if (commit != head) 
 > +        return false; 
 > + 
 > +    /* 
 > +     * Writers are committing on the head page, we just need 
 > +     * to care about there're committed data, and the reader will 
 > +     * swap reader page with head page when it is to read data. 
 > +     */ 
 > +    return rb_page_commit(commit) == 0; 
 >  } 
 >  
 >  /** 
 > -- 
 > 2.30.2 
 >  
 > 

(It's embarrassing to be Mr.Contact due to my wrong configuration in
my email hosting when I replies, and I've corrected now.)

Allow me to explain some of my comprehension and thought.

I've submitted the initial version of this bugfix and got some feedback from
Torvalds and Steven. We've done some discussion from which I've made
up my rationale for this patch:

1. Steven has pointed out that this bug is caused by not clearing "read"
field of a buffer page after releasing it back to the writers' ring buffer. This
will cause "rb_per_cpu_empty()" to read a stale value of "read" (in the writers'
buffer) and cause a bug. Steven also offers a patch clearing out the "read"
field to zero.

2. Steven also points out that "rb_page_commit(commit) == head->read"
conditional might also produces false positive result when the newly
committed event is just the same size as "head->read".

3. The "rb_get_reader_page()" is responsible for swapping reader page and
with head page. Before putting the reader page back to the ring buffer for
writers to write, the reader will clear out writer related fields, including the
commit of buffer data page.

4. The "rb_get_reader_page()" fetches a newer reader page and clears the
"read" field when more data is available.

5. Performing #1 ensures "read" field to be zeroed when a buffer page
becomes buffer page again, this should also eliminate false positive
mentioned in #2. So I think #1 should be a working fix.

6. However, setting "read" to zero is not necessary. Based on #3, atomic
"commit" field which is fetched by "rb_page_commit()" should be a working
indicator for buffer pages' availability to read. Based on #4, (I think) the reader
should only care about "read" field of an obtained reader page, not a page
inside the writer buffer.

7. Based on #6, when the reader has exhausted data in the reader page, and
writers are currently committing on head page, "rb_per_cpu_empty()" should
only care about whether there's content on head page. And comparing
"rb_page_commit(commit) == 0" should do the work. The conditional
"rb_page_commit(commit) == head->read" requiring invariable that "read" is
0 when it is placed back to writers' buffer seems to create a coupling between
"head->read" and "0" for me, and the ring buffer should perform an extra
clear (when placing back buffer page) and read (when fetching "head->read"
in "rb_per_cpu_empty()"). I think just fetching "rb_page_commit(commit)"
and compare it with immediate "0" comes more direct for me.

The core concept of mine is #7 and all other items above are for who subscribes
to the "linux-trace-devel" mail list but not so understand what is happening in
ring buffer algorithm. The bugfix shouldn't be encrypted protocol establish between
Torvalds, Steven and me, but explained to and accepted by community. (Again,
this is my personal opinion, nothing canonical and might be completely wrong,
so accept or reject with your own judgement.)

I owes Steven in the progression of the bugfix: he points out that comparing
"head->read == rb_page_commit(commit)" is unnecessary when
"rb_page_commit(commit) == 0" is present, and confirms some details of ring
buffer with me. He also gives me some hints and advise for commenting and
formatting the code. Without him, I can't write a reasonable bugfix or form
the rationale above.

Pardon me again for my poor English, and thank you for reading such verbose
explanation of mine.

Haoran



[Index of Archives]     [Linux USB Development]     [Linux USB Development]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux