Re: [PATCH] srcu: Add notes for not-effective mutex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 18, 2022 at 09:53:13AM -0800, Paul E. McKenney wrote:
> On Wed, Dec 14, 2022 at 10:24:46PM +0800, Pingfan Liu wrote:
> > Capped by the design and implementation of workqueue, the same work can not
> > be re-entrant.
> > 
> > SRCU has only a single work ssp->work, there is no way to occur the
> > concurrent scene in its state machine. Hence the mutexes for the
> > concurrency in the state machine is unnecessary. On the other hand,
> > using two works to utilize the window from the release of srcu_gp_mutex
> > to the end of srcu_gp_end() to enhance the parallelism has limited
> > effect. [1]
> > 
> > To save any trying of improving parallelism by using two work_struct,
> > put some notes around the srcu_gp_mutex and srcu_cb_mutex. And keep them
> > instead of removing them, in case in future, the state machine is moved
> > out of the protection of workqueue non-reentrant protection.
> 
> Thank you for continuning to push on this!
> 
> But let's step back and take a wider view, plus summarize a bit.
> 
> First, as we have discussed, it is necessary to make RCU work not just
> right now, but also to keep it from becoming too fragile.  This notion of
> "fragile" isn't yet something that can be reasonably measured, but one
> could unreasonably measure it by either comparing the number of words
> of English required to explain why it works on the one hand and making
> lots of random changes and seeing what fraction of those changes break
> something on the other.  These unreasonable measures are, just like the
> name says, unreasonable.  But I hope that they give an intuitive notion
> of what the goal must be.  Again, I don't know of a reasonable scientific
> (let alone mathematical) way of expressing it.
> 

Yes, I get the point of srcu_gp_mutex, which keeps the code from
becoming too fragile. But the carefully designed srcu_cb_mutex may leave
room for people to have fantasy. My purpose just reminders someone in
future it is known and tried.

> As you noted, the current state is unsatisfying.  There are two different
> mechanisms providing mutual exclusion.  This code is nowhere near a
> fastpath, so the added overhead is absolutely *not* a concern (not yet,
> that we know of, anyway), but there are still risks, for example:
> 
> 1.	Latent bugs could be introduced that suddenly make their presence
> 	known should the code ever be relying solely on the mutexes for
> 	mutual exclusion.
> 
> 2.	People imagine other purposes for either the single workqueue
> 	or the mutexes, and inject bugs based on the fact that their
> 	imagination of the design has diverged from the actual design.
> 
> These risks are not huge, especially compared to other risks in the
> code base.  For but one example, the code that provides the memory-order
> guarantees for synchronize_rcu() are intricate and spread widely over the
> code.  As they must be in order to make important fastpaths fast, which
> is a benefit that is important enough to make taking that risk worthwhile.
> 

As said above, I understood this and srcu_gp_mutex is not my target. (I
am sorry that my comment may give wrong impression)

> So what are the options here?
> 
> 1.	Leave it as it is, with the non-huge risks as called out above.
> 

Due to the test data, it is dim for the outlook of this patch.

I am fine with option one. But if necessary, I am also OK with some
comment around srcu_cb_mutex, which may save someone's time in future.


Thanks,

	Pingfan

> 2.	Add comments, as in this commit.  One risk here is that someone
> 	adds the concurrency, but fails to remove the comments, which
> 	are now obsolete and confusing.
> 
> 3.	Add words describing the current double-provided mutual exclusion
> 	to one of the RCU design documents.
> 
> 4.	Leave mainline and -rcu as they are, but add your
> 	multiple-workqueue patch to one of the RCU design documents
> 	covering SRCU so that it doesn't get lost should it ever become
> 	necessary to speed up SRCU grace periods.
> 
> 5.	Leave mainline as it is, but add your multiple-workqueue patch to
> 	-rcu with a not-for-mainline branch marking its presence so that
> 	this change can be easily applied later, if needed.  (For another
> 	example of this, see the sysidle.2017.05.11a branch, which was
> 	actually in mainline for some time and then later removed because
> 	it was not being used.)
> 
> 6.	Make the workqueue supply concurrency, as in your earlier patch.
> 	As noted in the email threads, this adds risk for no known
> 	user-visible benefit.
> 
> 7.	Remove the mutexes, and rely on the workqueue for the needed
> 	mutual exclusion.  This is unlikely to make the code more
> 	readable, and makes it more difficult to add concurrency if it
> 	is ever needed.
> 
> 8.	Make the workqueue supply concurrency, but only when a given
> 	kernel-boot parameter or Kconfig option is supplied.  Enable
> 	this concurrency only during selected rcutorture tests.
> 	This has the advantage of actually testing the placement and
> 	usage of the mutexes, but on the other hand it increases test
> 	effort for no known end-user-visible benefit.
> 
> Are there other options?
> 
> 							Thanx, Paul
> 
> > Some test data to show how dual work_struct affects the SRCU
> > performance:
> > 
> > -1. on 144 cpus hpe-dl560gen10
> >   modprobe rcutorture   torture_type=srcud  fwd_progress=144 fwd_progress_holdoff=1 shutdown_secs=36000 stat_interval=60 verbose=1
> >   # base line
> >   [37587.632155] srcud:  End-test grace-period state: g28287244 f0x0 total-gps=28287244
> >   # two dual work_struct
> >   [36443.468017] srcud:  End-test grace-period state: g29026056 f0x0 total-gps=29026056
> > 
> > -2. on 256 cpus amd-milan
> >   modprobe rcutorture   torture_type=srcud  fwd_progress=256 fwd_progress_holdoff=1 shutdown_secs=36000 stat_interval=60 verbose=1
> >   # base line
> >   [36093.605732] srcud:  End-test grace-period state: g10850284 f0x0 total-gps=10850284
> >   # two dual work_struct
> >   [36093.856713] srcud:  End-test grace-period state: g10672632 f0x0 total-gps=10672632
> > 
> > The first test shows that it has about 2.6% improvement, while the
> > second test shows it has no significant effect.
> > 
> > [1]: https://lore.kernel.org/rcu/Y5nC2yBhaUYirByo@xxxxxxxxxxxxxxxxxxxxxxxxxx/T/#maf99cf1d92077a1b0b8e5c411c79fa4444576624
> > 
> > Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> > Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Cc: "Zhang, Qiang1" <qiang1.zhang@xxxxxxxxx>
> > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > To: rcu@xxxxxxxxxxxxxxx
> > ---
> >  include/linux/srcutree.h | 4 ++++
> >  kernel/rcu/srcutree.c    | 5 ++++-
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > index 558057b517b7..f199c31493ab 100644
> > --- a/include/linux/srcutree.h
> > +++ b/include/linux/srcutree.h
> > @@ -66,8 +66,12 @@ struct srcu_struct {
> >  						/* First node at each level. */
> >  	int srcu_size_state;			/* Small-to-big transition state. */
> >  	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
> > +						/* Not take effect since the only single sdp->work */
> > +						/* meets the workqueue non-reentrant condition */
> >  	spinlock_t __private lock;		/* Protect counters and size state. */
> >  	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
> > +						/* Not take effect since the only single sdp->work */
> > +						/* meets the workqueue non-reentrant condition */
> >  	unsigned int srcu_idx;			/* Current rdr array element. */
> >  	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
> >  	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index ce39907fa381..8ebb1a240219 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -815,7 +815,10 @@ static void srcu_gp_end(struct srcu_struct *ssp)
> >  	struct srcu_node *snp;
> >  	int ss_state;
> >  
> > -	/* Prevent more than one additional grace period. */
> > +	/*
> > +	 * Prevent more than one additional grace period. But at present, it does
> > +	 * not take effect. Refer to the note at the definition.
> > +	 * /
> >  	mutex_lock(&ssp->srcu_cb_mutex);
> >  
> >  	/* End the current grace period. */
> > -- 
> > 2.31.1
> > 



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux