On Tue, Mar 19, 2024 at 12:11:28PM -0400, Joel Fernandes wrote:
> On Tue, Mar 19, 2024 at 12:02 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, Mar 19, 2024 at 03:48:46PM +0100, Uladzislau Rezki wrote:
> > > On Tue, Mar 19, 2024 at 10:29:59AM -0400, Joel Fernandes wrote:
> > > >
> > > >
> > > > > On Mar 19, 2024, at 5:53 AM, Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Mar 18, 2024 at 05:05:31PM -0400, Joel Fernandes wrote:
> > > > >>
> > > > >>
> > > > >>>> On Mar 18, 2024, at 2:58 PM, Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
> > > > >>>
> > > > >>> Hello, Joel!
> > > > >>>
> > > > >>> Sorry for the late check; see a few comments below:
> > > > >>>
> > > > >>>> In the synchronize_rcu() common case, we will have less than
> > > > >>>> SR_MAX_USERS_WAKE_FROM_GP number of users per GP. Waking up the kworker
> > > > >>>> is pointless just to free the last injected wait head since at that point,
> > > > >>>> all the users have already been awakened.
> > > > >>>>
> > > > >>>> Introduce a new counter to track this and prevent the wakeup in the
> > > > >>>> common case.
> > > > >>>>
> > > > >>>> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> > > > >>>> ---
> > > > >>>> Rebased on paul/dev of today.
> > > > >>>>
> > > > >>>>  kernel/rcu/tree.c | 36 +++++++++++++++++++++++++++++++-----
> > > > >>>>  kernel/rcu/tree.h |  1 +
> > > > >>>>  2 files changed, 32 insertions(+), 5 deletions(-)
> > > > >>>>
> > > > >>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > >>>> index 9fbb5ab57c84..bd29fe3c76bf 100644
> > > > >>>> --- a/kernel/rcu/tree.c
> > > > >>>> +++ b/kernel/rcu/tree.c
> > > > >>>> @@ -96,6 +96,7 @@ static struct rcu_state rcu_state = {
> > > > >>>>  	.ofl_lock = __ARCH_SPIN_LOCK_UNLOCKED,
> > > > >>>>  	.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
> > > > >>>>  		rcu_sr_normal_gp_cleanup_work),
> > > > >>>> +	.srs_cleanups_pending = ATOMIC_INIT(0),
> > > > >>>>  };
> > > > >>>>
> > > > >>>>  /* Dump rcu_node combining tree at boot to verify correct setup. */
> > > > >>>> @@ -1642,8 +1643,11 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> > > > >>>>  	 * the done tail list manipulations are protected here.
> > > > >>>>  	 */
> > > > >>>>  	done = smp_load_acquire(&rcu_state.srs_done_tail);
> > > > >>>> -	if (!done)
> > > > >>>> +	if (!done) {
> > > > >>>> +		/* See comments below. */
> > > > >>>> +		atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> > > > >>>>  		return;
> > > > >>>> +	}
> > > > >>>>
> > > > >>>>  	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
> > > > >>>>  	head = done->next;
> > > > >>>> @@ -1666,6 +1670,9 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> > > > >>>>
> > > > >>>>  		rcu_sr_put_wait_head(rcu);
> > > > >>>>  	}
> > > > >>>> +
> > > > >>>> +	/* Order list manipulations with atomic access. */
> > > > >>>> +	atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> > > > >>>>  }
> > > > >>>>
> > > > >>>>  /*
> > > > >>>> @@ -1673,7 +1680,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> > > > >>>>   */
> > > > >>>>  static void rcu_sr_normal_gp_cleanup(void)
> > > > >>>>  {
> > > > >>>> -	struct llist_node *wait_tail, *next, *rcu;
> > > > >>>> +	struct llist_node *wait_tail, *next = NULL, *rcu = NULL;
> > > > >>>>  	int done = 0;
> > > > >>>>
> > > > >>>>  	wait_tail = rcu_state.srs_wait_tail;
> > > > >>>> @@ -1699,16 +1706,35 @@ static void rcu_sr_normal_gp_cleanup(void)
> > > > >>>>  		break;
> > > > >>>>  	}
> > > > >>>>
> > > > >>>> -	// concurrent sr_normal_gp_cleanup work might observe this update.
> > > > >>>> -	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
> > > > >>>> +	/*
> > > > >>>> +	 * Fast path, no more users to process. Remove the last wait head
> > > > >>>> +	 * if no inflight-workers. If there are in-flight workers, let them
> > > > >>>> +	 * remove the last wait head.
> > > > >>>> +	 */
> > > > >>>> +	WARN_ON_ONCE(!rcu);
> > > > >>>>
> > > > >>> This assumption is not correct. An "rcu" can be NULL in fact.
> > > > >>
> > > > >> Hmm, I could never trigger that. Are you saying that is true after Neeraj's
> > > > >> recent patch or something else? Note, after Neeraj's patch to handle the
> > > > >> lack of head availability, it could be true, so I asked him to rebase his
> > > > >> patch on top of this one.
> > > > >>
> > > > >> However, I will revisit my patch and check whether it could occur, but
> > > > >> please let me know if you know of a sequence of events that makes it NULL.
> > > > >>>
> > > > > I think we should agree on your patch first, otherwise it becomes a bit
> > > > > messy; or go with Neeraj's as a first step and then work on yours. So, I
> > > > > reviewed this patch based on the latest Paul's dev branch. I see that
> > > > > Neeraj's patch needs further work.
> > > >
> > > > You are right. So the only change is to drop the warning and those braces. Agreed?
> > > >
> > > Let me check a bit. It looks correct, but let me make sure just in case.
> > >
> > Thanks. I was also considering improving it for the rcu == NULL case, as
> > below. I will test it more before re-sending.
> >
> > On top of my patch:
> >
> > ---8<-----------------------
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 0df659a878ee..a5ef844835d4 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1706,15 +1706,18 @@ static void rcu_sr_normal_gp_cleanup(void)
> >  		break;
> >  	}
> >
> > +
> > +	/* Last head stays. No more processing to do. */
> > +	if (!rcu)
> > +		return;
> > +
> Ugh, that should be "if (!wait_tail->next)" instead of "if (!rcu)". But
> in any case, the original patch, except for the warning, should hold.
> Still, I am testing the above diff now.
>
> - Joel
>
Just in case, it is based on your patch:

<snip>
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bd29fe3c76bf..98546afe7c21 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1711,29 +1711,25 @@ static void rcu_sr_normal_gp_cleanup(void)
 	 * if no inflight-workers. If there are in-flight workers, let them
 	 * remove the last wait head.
 	 */
-	WARN_ON_ONCE(!rcu);
-	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);
-
-	if (rcu && rcu_sr_is_wait_head(rcu) && rcu->next == NULL &&
-		/* Order atomic access with list manipulation. */
-		!atomic_read_acquire(&rcu_state.srs_cleanups_pending)) {
+	if (wait_tail->next && rcu_sr_is_wait_head(wait_tail->next) && !wait_tail->next->next &&
+			!atomic_read_acquire(&rcu_state.srs_cleanups_pending)) {
+		rcu_sr_put_wait_head(wait_tail->next);
 		wait_tail->next = NULL;
-		rcu_sr_put_wait_head(rcu);
-		smp_store_release(&rcu_state.srs_done_tail, wait_tail);
-		return;
 	}

 	/* Concurrent sr_normal_gp_cleanup work might observe this update. */
 	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);

-	/*
-	 * We schedule a work in order to perform a final processing
-	 * of outstanding users(if still left) and releasing wait-heads
-	 * added by rcu_sr_normal_gp_init() call.
-	 */
-	atomic_inc(&rcu_state.srs_cleanups_pending);
-	if (!queue_work(sync_wq, &rcu_state.srs_cleanup_work)) {
-		atomic_dec(&rcu_state.srs_cleanups_pending);
+	if (wait_tail->next) {
+		/*
+		 * We schedule a work in order to perform a final processing
+		 * of outstanding users(if still left) and releasing wait-heads
+		 * added by rcu_sr_normal_gp_init() call.
+		 */
+		atomic_inc(&rcu_state.srs_cleanups_pending);
+		if (!queue_work(sync_wq, &rcu_state.srs_cleanup_work))
+			atomic_dec(&rcu_state.srs_cleanups_pending);
 	}
 }
<snip>

--
Uladzislau Rezki
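A note on the ordering contract both versions of the patch rely on: the worker's
atomic_dec_return_release() of srs_cleanups_pending pairs with the GP kthread's
atomic_read_acquire(), so that when the fast path reads zero it is guaranteed to
see every list manipulation performed by already-finished workers before it frees
the last wait head. The sketch below models that pairing with plain C11 atomics in
userspace; it is illustrative only, not kernel code, and the names
(cleanups_pending, queue_cleanup(), cleanup_worker(), can_free_last_wait_head())
are invented for the example:

<snip>
/*
 * Illustrative userspace model of the srs_cleanups_pending protocol
 * discussed above -- a sketch, not the kernel implementation.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int cleanups_pending;	/* models rcu_state.srs_cleanups_pending */

/* GP-kthread side: account for a worker before queueing it. */
static void queue_cleanup(void)
{
	atomic_fetch_add_explicit(&cleanups_pending, 1, memory_order_relaxed);
	/* ...queue_work(); on failure the increment is undone... */
}

/* Worker side: models the tail of rcu_sr_normal_gp_cleanup_work(). */
static void cleanup_worker(void)
{
	/* ...wake waiters, unlink and free wait heads... */

	/*
	 * Release: the list manipulations above happen-before any thread
	 * that observes this decrement with an acquire load. Maps to the
	 * kernel's atomic_dec_return_release().
	 */
	atomic_fetch_sub_explicit(&cleanups_pending, 1, memory_order_release);
}

/* GP-kthread side: models the fast-path test in rcu_sr_normal_gp_cleanup(). */
static bool can_free_last_wait_head(void)
{
	/*
	 * Acquire: pairs with the release above. Reading zero means no
	 * worker is in flight and every finished worker's list updates are
	 * visible, so freeing the last wait head cannot race with a
	 * concurrent list walk. Maps to atomic_read_acquire().
	 */
	return atomic_load_explicit(&cleanups_pending, memory_order_acquire) == 0;
}

int main(void)
{
	queue_cleanup();
	cleanup_worker();
	printf("fast path may free last wait head: %d\n",
	       can_free_last_wait_head());
	return 0;
}
<snip>

The same pairing is why the early-return path in rcu_sr_normal_gp_cleanup_work()
still performs the release-decrement: the worker was accounted for when it was
queued, so even when it finds nothing to do it must drop the counter, and dropping
it with release semantics keeps a zero reading on the fast path meaningful.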
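Separately, the fast-path condition in the last diff is easier to read once the
list shape is spelled out: after the GP kthread has processed its batch, the only
thing that may legitimately remain behind wait_tail is a single, user-free wait
head. A hedged sketch of that predicate over an invented node type (in the kernel
the nodes are struct llist_node and the test is rcu_sr_is_wait_head()):

<snip>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
	bool is_wait_head;	/* stands in for rcu_sr_is_wait_head() */
};

/*
 * True when exactly one node follows wait_tail and it is a wait head:
 * no users are left to wake, so (absent in-flight workers) the GP
 * kthread may free it directly instead of queueing cleanup work.
 */
static bool only_lone_wait_head_remains(const struct node *wait_tail)
{
	const struct node *n = wait_tail->next;

	return n && n->is_wait_head && !n->next;
}
<snip>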