On Mon, Mar 04, 2024 at 03:13:10PM -0500, Joel Fernandes wrote: > > > On 3/4/2024 2:44 PM, Paul E. McKenney wrote: > > On Mon, Mar 04, 2024 at 02:10:09PM -0500, Joel Fernandes wrote: > >> > >> > >> On 3/4/2024 12:14 PM, Paul E. McKenney wrote: > >>> On Mon, Mar 04, 2024 at 11:19:21AM -0500, Joel Fernandes wrote: > >>>> > >>>> > >>>> On 3/4/2024 5:54 AM, linke li wrote: > >>>>> Some changes are done to fix a data race in commit 202489101f2e ("rcutorture: Fix rcu_torture_one_read()/rcu_torture_writer() data race") > >>>>> > >>>>> { > >>>>> int i; > >>>>> > >>>>> - i = rp->rtort_pipe_count; > >>>>> + i = READ_ONCE(rp->rtort_pipe_count); > >>>>> if (i > RCU_TORTURE_PIPE_LEN) > >>>>> i = RCU_TORTURE_PIPE_LEN; > >>>>> atomic_inc(&rcu_torture_wcount[i]); > >>>>> - if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { > >>>>> + WRITE_ONCE(rp->rtort_pipe_count, i + 1); > >>>>> + if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { > >>>>> rp->rtort_mbtest = 0; > >>>>> return true; > >>>>> } > >>>>> > >>>>> But ++rp->rtort_pipe_count is meant to add itself by 1, not give i+1 to > >>>>> rp->rtort_pipe_count, because rp->rtort_pipe_count may write by > >>>>> rcu_torture_writer() concurrently. > >>>>> > >>>>> Also, rp->rtort_pipe_count in the next line should be read using > >>>>> READ_ONCE() because of data race. > >>>>> > >>>>> Signed-off-by: linke li <lilinke99@xxxxxx> > >>>>> --- > >>>>> kernel/rcu/rcutorture.c | 4 ++-- > >>>>> 1 file changed, 2 insertions(+), 2 deletions(-) > >>>>> > >>>>> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c > >>>>> index 7567ca8e743c..00059ace4fd5 100644 > >>>>> --- a/kernel/rcu/rcutorture.c > >>>>> +++ b/kernel/rcu/rcutorture.c > >>>>> @@ -465,8 +465,8 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp) > >>>>> if (i > RCU_TORTURE_PIPE_LEN) > >>>>> i = RCU_TORTURE_PIPE_LEN; > >>>>> atomic_inc(&rcu_torture_wcount[i]); > >>>>> - WRITE_ONCE(rp->rtort_pipe_count, i + 1); > >>>>> - if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { > >>>>> + WRITE_ONCE(rp->rtort_pipe_count, rp->rtort_pipe_count + 1); > >>>>> + if (READ_ONCE(rp->rtort_pipe_count) >= RCU_TORTURE_PIPE_LEN) { > >>>> > >>>> I want to say, I am not convinced with the patch because what's wrong with > >>>> writing to an old index? > >>>> > >>>> You win/lose the race anyway, say the CPU executed the WRITE_ONCE() a bit too > >>>> early/late and another WRITE_ONCE() lost/won, regardless of whether you wrote > >>>> the "incremented i" or "the increment from the latest value of pipe_count". > >>>> > >>>> Anyway, a slightly related/different question: > >>>> > >>>> Should that: > >>>> WRITE_ONCE(rp->rtort_pipe_count, rp->rtort_pipe_count + 1); > >>>> > >>>> Be: > >>>> WRITE_ONCE(rp->rtort_pipe_count, READ_ONCE(rp->rtort_pipe_count) + 1); > >>>> > >>>> ? > >>> > >>> Thank you both! > >>> > >>> At first glance, I would argue for something like this: > >>> > >>> ------------------------------------------------------------------------ > >>> > >>> static bool > >>> rcu_torture_pipe_update_one(struct rcu_torture *rp) > >>> { > >>> int i; > >>> struct rcu_torture_reader_check *rtrcp = READ_ONCE(rp->rtort_chkp); > >>> > >>> if (rtrcp) { > >>> WRITE_ONCE(rp->rtort_chkp, NULL); > >>> smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire(). > >>> } > >>> i = READ_ONCE(rp->rtort_pipe_count) + 1; > >>> if (i > RCU_TORTURE_PIPE_LEN) > >>> i = RCU_TORTURE_PIPE_LEN; > >>> atomic_inc(&rcu_torture_wcount[i]); > >>> WRITE_ONCE(rp->rtort_pipe_count, i); > >>> if (i >= RCU_TORTURE_PIPE_LEN) { > >>> rp->rtort_mbtest = 0; > >>> return true; > >>> } > >>> return false; > >>> } > >>> > >>> ------------------------------------------------------------------------ > >>> > >>> That is, move the increment to the read and replace the re-read with > >>> the value "i" that was just written. > >> > >> But that changes the original logic as well? It looks like with the above > >> change, you're now writing to rcu_torture_wcount[READ_ONCE(rp->rtort_pipe_count) > >> + 1] instead of rcu_torture_wcount[READ_ONCE(rp->rtort_pipe_count)]. > >> > >> I think that might break rcutorture, because there is an increment outside of > >> the first 2 entries in rcu_torture_wcount but not sure (need to look more). > > > > Good point on never incrementing the zeroth entry! Clearly I should > > have waited before replying. > > > > How about the following? > > > > ------------------------------------------------------------------------ > > > > static bool > > rcu_torture_pipe_update_one(struct rcu_torture *rp) > > { > > int i; > > struct rcu_torture_reader_check *rtrcp = READ_ONCE(rp->rtort_chkp); > > > > if (rtrcp) { > > WRITE_ONCE(rp->rtort_chkp, NULL); > > smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire(). > > } > > i = READ_ONCE(rp->rtort_pipe_count); > > if (i > RCU_TORTURE_PIPE_LEN) > > i = RCU_TORTURE_PIPE_LEN; > > atomic_inc(&rcu_torture_wcount[i]); > > WRITE_ONCE(rp->rtort_pipe_count, i + 1); > > if (i + 1 >= RCU_TORTURE_PIPE_LEN) { > > rp->rtort_mbtest = 0; > > return true; > > } > > return false; > > } > > Yes, this looks good to me. Thanks, > Reviewed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> Again, thank you. linke li, does this approach work for you? If so, would you be willing to send a new patch along these lines? If it does not work, what additional problems do you see? Thanx, Paul