On Fri, Jul 08, 2022 at 06:43:21PM +0000, Joel Fernandes wrote: > On Sat, Jun 25, 2022 at 09:00:19PM -0700, Paul E. McKenney wrote: > > On Wed, Jun 22, 2022 at 10:50:55PM +0000, Joel Fernandes (Google) wrote: > > > Implement timer-based RCU lazy callback batching. The batch is flushed > > > whenever a certain amount of time has passed, or the batch on a > > > particular CPU grows too big. Also memory pressure will flush it in a > > > future patch. > > > > > > To handle several corner cases automagically (such as rcu_barrier() and > > > hotplug), we re-use bypass lists to handle lazy CBs. The bypass list > > > length has the lazy CB length included in it. A separate lazy CB length > > > counter is also introduced to keep track of the number of lazy CBs. > > > > > > Suggested-by: Paul McKenney <paulmck@xxxxxxxxxx> > > > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> > > > > Not bad, but some questions and comments below. > > Thanks a lot for these, real helpful and I replied below: > > > > diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h > > > index 659d13a7ddaa..9a992707917b 100644 > > > --- a/include/linux/rcu_segcblist.h > > > +++ b/include/linux/rcu_segcblist.h > > > @@ -22,6 +22,7 @@ struct rcu_cblist { > > > struct rcu_head *head; > > > struct rcu_head **tail; > > > long len; > > > + long lazy_len; > > > }; > > > > > > #define RCU_CBLIST_INITIALIZER(n) { .head = NULL, .tail = &n.head } > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > > > index 1a32036c918c..9191a3d88087 100644 > > > --- a/include/linux/rcupdate.h > > > +++ b/include/linux/rcupdate.h > > > @@ -82,6 +82,12 @@ static inline int rcu_preempt_depth(void) > > > > > > #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ > > > > > > +#ifdef CONFIG_RCU_LAZY > > > +void call_rcu_lazy(struct rcu_head *head, rcu_callback_t func); > > > +#else > > > +#define call_rcu_lazy(head, func) call_rcu(head, func) > > > +#endif > > > + > > > /* Internal to kernel */ > > > void rcu_init(void); > > > extern int rcu_scheduler_active; > > > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig > > > index 27aab870ae4c..0bffa992fdc4 100644 > > > --- a/kernel/rcu/Kconfig > > > +++ b/kernel/rcu/Kconfig > > > @@ -293,4 +293,12 @@ config TASKS_TRACE_RCU_READ_MB > > > Say N here if you hate read-side memory barriers. > > > Take the default if you are unsure. > > > > > > +config RCU_LAZY > > > + bool "RCU callback lazy invocation functionality" > > > + depends on RCU_NOCB_CPU > > > + default n > > > + help > > > + To save power, batch RCU callbacks and flush after delay, memory > > > + pressure or callback list growing too big. > > > > Spaces vs. tabs. > > Fixed, thanks. > > > The checkpatch warning is unhelpful ("please write a help paragraph that > > fully describes the config symbol") > > Good old checkpatch :D ;-) ;-) ;-) > > > endmenu # "RCU Subsystem" > > > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c > > > index c54ea2b6a36b..627a3218a372 100644 > > > --- a/kernel/rcu/rcu_segcblist.c > > > +++ b/kernel/rcu/rcu_segcblist.c > > > @@ -20,6 +20,7 @@ void rcu_cblist_init(struct rcu_cblist *rclp) > > > rclp->head = NULL; > > > rclp->tail = &rclp->head; > > > rclp->len = 0; > > > + rclp->lazy_len = 0; > > > } > > > > > > /* > > > @@ -32,6 +33,15 @@ void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp) > > > WRITE_ONCE(rclp->len, rclp->len + 1); > > > } > > > > > > +/* > > > + * Enqueue an rcu_head structure onto the specified callback list. > > > > Please also note the fact that it is enqueuing lazily. > > Sorry, done. > > > > + */ > > > +void rcu_cblist_enqueue_lazy(struct rcu_cblist *rclp, struct rcu_head *rhp) > > > +{ > > > + rcu_cblist_enqueue(rclp, rhp); > > > + WRITE_ONCE(rclp->lazy_len, rclp->lazy_len + 1); > > > > Except... Why not just add a "lazy" parameter to rcu_cblist_enqueue()? > > IS_ENABLED() can make it fast. > > Yeah good idea, it simplifies the code too. Thank you! > > So you mean I should add in this function so that the branch gets optimized: > if (lazy && IS_ENABLE(CONFIG_RCU_LAZY)) { > ... > } > > That makes total sense considering the compiler may otherwise not be able to > optimize the function viewing just the individual translation unit. I fixed > it. Or the other way around: if (IS_ENABLE(CONFIG_RCU_LAZY) && lazy) { Just in case the compiler is stumbling over its boolean logic. Or in case the human reader is. ;-) > The 6 month old baby and wife are calling my attention now. I will continue > to reply to the other parts of this and other emails this evening and thanks > for your help! Ah, for those who believe that SIGCHLD can be ignored in real life! ;-) Thanx, Paul