On 3/19/2024 1:26 PM, Uladzislau Rezki wrote:
> On Tue, Mar 19, 2024 at 12:11:28PM -0400, Joel Fernandes wrote:
>> On Tue, Mar 19, 2024 at 12:02 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On Tue, Mar 19, 2024 at 03:48:46PM +0100, Uladzislau Rezki wrote:
>>>> On Tue, Mar 19, 2024 at 10:29:59AM -0400, Joel Fernandes wrote:
>>>>>
>>>>>
>>>>>> On Mar 19, 2024, at 5:53 AM, Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
>>>>>>
>>>>>> On Mon, Mar 18, 2024 at 05:05:31PM -0400, Joel Fernandes wrote:
>>>>>>>
>>>>>>>
>>>>>>>>> On Mar 18, 2024, at 2:58 PM, Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> Hello, Joel!
>>>>>>>>
>>>>>>>> Sorry for late checking, see below few comments:
>>>>>>>>
>>>>>>>>> In the synchronize_rcu() common case, we will have less than
>>>>>>>>> SR_MAX_USERS_WAKE_FROM_GP number of users per GP. Waking up the kworker
>>>>>>>>> is pointless just to free the last injected wait head since at that point,
>>>>>>>>> all the users have already been awakened.
>>>>>>>>>
>>>>>>>>> Introduce a new counter to track this and prevent the wakeup in the
>>>>>>>>> common case.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
>>>>>>>>> ---
>>>>>>>>> Rebased on paul/dev of today.
>>>>>>>>>
>>>>>>>>>  kernel/rcu/tree.c | 36 +++++++++++++++++++++++++++++++-----
>>>>>>>>>  kernel/rcu/tree.h |  1 +
>>>>>>>>>  2 files changed, 32 insertions(+), 5 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>>>>>>>> index 9fbb5ab57c84..bd29fe3c76bf 100644
>>>>>>>>> --- a/kernel/rcu/tree.c
>>>>>>>>> +++ b/kernel/rcu/tree.c
>>>>>>>>> @@ -96,6 +96,7 @@ static struct rcu_state rcu_state = {
>>>>>>>>>  	.ofl_lock = __ARCH_SPIN_LOCK_UNLOCKED,
>>>>>>>>>  	.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
>>>>>>>>>  		rcu_sr_normal_gp_cleanup_work),
>>>>>>>>> +	.srs_cleanups_pending = ATOMIC_INIT(0),
>>>>>>>>>  };
>>>>>>>>>
>>>>>>>>>  /* Dump rcu_node combining tree at boot to verify correct setup. */
>>>>>>>>> @@ -1642,8 +1643,11 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
>>>>>>>>>  	 * the done tail list manipulations are protected here.
>>>>>>>>>  	 */
>>>>>>>>>  	done = smp_load_acquire(&rcu_state.srs_done_tail);
>>>>>>>>> -	if (!done)
>>>>>>>>> +	if (!done) {
>>>>>>>>> +		/* See comments below. */
>>>>>>>>> +		atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
>>>>>>>>>  		return;
>>>>>>>>> +	}
>>>>>>>>>
>>>>>>>>>  	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
>>>>>>>>>  	head = done->next;
>>>>>>>>> @@ -1666,6 +1670,9 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
>>>>>>>>>
>>>>>>>>>  		rcu_sr_put_wait_head(rcu);
>>>>>>>>>  	}
>>>>>>>>> +
>>>>>>>>> +	/* Order list manipulations with atomic access. */
>>>>>>>>> +	atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  /*
>>>>>>>>> @@ -1673,7 +1680,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
>>>>>>>>>   */
>>>>>>>>>  static void rcu_sr_normal_gp_cleanup(void)
>>>>>>>>>  {
>>>>>>>>> -	struct llist_node *wait_tail, *next, *rcu;
>>>>>>>>> +	struct llist_node *wait_tail, *next = NULL, *rcu = NULL;
>>>>>>>>>  	int done = 0;
>>>>>>>>>
>>>>>>>>>  	wait_tail = rcu_state.srs_wait_tail;
>>>>>>>>> @@ -1699,16 +1706,35 @@ static void rcu_sr_normal_gp_cleanup(void)
>>>>>>>>>  			break;
>>>>>>>>>  	}
>>>>>>>>>
>>>>>>>>> -	// concurrent sr_normal_gp_cleanup work might observe this update.
>>>>>>>>> -	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
>>>>>>>>> +	/*
>>>>>>>>> +	 * Fast path, no more users to process. Remove the last wait head
>>>>>>>>> +	 * if no inflight-workers. If there are in-flight workers, let them
>>>>>>>>> +	 * remove the last wait head.
>>>>>>>>> +	 */
>>>>>>>>> +	WARN_ON_ONCE(!rcu);
>>>>>>>>>
>>>>>>>> This assumption is not correct. An "rcu" can be NULL in fact.
>>>>>>>
>>>>>>> Hmm I could never trigger that. Are you saying that is true after Neeraj recent patch or something else?
>>>>>>> Note, after Neeraj patch to handle the lack of heads availability, it could be true so I requested
>>>>>>> him to rebase his patch on top of this one.
>>>>>>>
>>>>>>> However I will revisit my patch and look for if it could occur but please let me know if you knew of a sequence of events to make it NULL.
>>>>>>>>
>>>>>> I think we should agree on your patch first otherwise it becomes a bit
>>>>>> messy or go with a Neeraj as first step and then work on yours. So, I
>>>>>> reviewed this patch based on latest Paul's dev branch. I see that Neeraj
>>>>>> needs further work.
>>>>>
>>>>> You are right. So the only change is to drop the warning and those braces. Agreed?
>>>>>
>>>> Let me check a bit. Looks like correct but just in case.
>>>>
>>>
>>> Thanks. I was also considering improving it for the rcu == NULL case, as
>>> below. I will test it more before re-sending.
>>>
>>> On top of my patch:
>>>
>>> ---8<-----------------------
>>>
>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>> index 0df659a878ee..a5ef844835d4 100644
>>> --- a/kernel/rcu/tree.c
>>> +++ b/kernel/rcu/tree.c
>>> @@ -1706,15 +1706,18 @@ static void rcu_sr_normal_gp_cleanup(void)
>>>  			break;
>>>  	}
>>>
>>> +
>>> +	/* Last head stays. No more processing to do. */
>>> +	if (!rcu)
>>> +		return;
>>> +
>>
>> Ugh, should be "if (!wait_head->next)" instead of "if (!rcu)". But
>> in any case, the original patch except the warning should hold.
>> Still, I am testing the above diff now.
>>
>> - Joel
>>
> Just in case, it is based on your patch:
>
> <snip>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index bd29fe3c76bf..98546afe7c21 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1711,29 +1711,25 @@ static void rcu_sr_normal_gp_cleanup(void)
>  	 * if no inflight-workers. If there are in-flight workers, let them
>  	 * remove the last wait head.
>  	 */
> -	WARN_ON_ONCE(!rcu);
> -	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);
> -
> -	if (rcu && rcu_sr_is_wait_head(rcu) && rcu->next == NULL &&
> -		/* Order atomic access with list manipulation. */
> -		!atomic_read_acquire(&rcu_state.srs_cleanups_pending)) {
> +	if (wait_tail->next && rcu_sr_is_wait_head(wait_tail->next) && !wait_tail->next->next &&
> +		!atomic_read_acquire(&rcu_state.srs_cleanups_pending)) {

Yes, this also works. But if wait_tail->next == NULL, you do not need
to queue the worker in that case either. I sent this as v3.

If you want to add that and resend my patch with the above diff, that would
also be fine. Or I can do that, let me know.

Thanks!

 - Joel
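
For illustration only (not part of any posted patch): a minimal user-space
sketch of the three outcomes discussed above, using C11 atomics in place of
the kernel's atomic_t. The names sr_node, sr_action, sr_cleanup_decide() and
sr_worker_done() are hypothetical stand-ins, and the increment before
queueing is an assumption, since that part of the diff is not quoted in this
message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct llist_node plus the wait-head check. */
struct sr_node {
	struct sr_node *next;
	bool is_wait_head;
};

/* Hypothetical result type: what the GP cleanup path should do next. */
enum sr_action {
	SR_NOTHING_TO_DO,	/* wait_tail->next == NULL: last head stays */
	SR_FREE_INLINE,		/* fast path: drop the last wait head here */
	SR_QUEUE_WORKER,	/* slow path: hand the rest to the worker */
};

/*
 * Decide what to do once the users behind wait_tail have been woken.
 * cleanups_pending models rcu_state.srs_cleanups_pending.
 */
static enum sr_action sr_cleanup_decide(const struct sr_node *wait_tail,
					atomic_int *cleanups_pending)
{
	const struct sr_node *next;

	assert(wait_tail != NULL);
	next = wait_tail->next;

	/* Nothing is chained behind the current wait head. */
	if (next == NULL)
		return SR_NOTHING_TO_DO;

	/*
	 * Only a single old wait head is left and no worker is in flight.
	 * The acquire load pairs with the release decrement in
	 * sr_worker_done().
	 */
	if (next->is_wait_head && next->next == NULL &&
	    atomic_load_explicit(cleanups_pending, memory_order_acquire) == 0)
		return SR_FREE_INLINE;

	/* Assumption: the pending count is bumped before queueing work. */
	atomic_fetch_add_explicit(cleanups_pending, 1, memory_order_relaxed);
	return SR_QUEUE_WORKER;
}

/* Worker side: publish list updates, then drop the pending count. */
static void sr_worker_done(atomic_int *cleanups_pending)
{
	atomic_fetch_sub_explicit(cleanups_pending, 1, memory_order_release);
}
```

The sketch only models the decision. The list manipulation and the
queue_work() call in the real code are left out.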