Re: SRCU question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Nov 15, 2020 at 12:11:24PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 12, 2020 at 03:15:47PM -0500, Kent Overstreet wrote:
> > Hi to Paul & the rcu mailing list,
> > 
> > I've got a problem I'm trying to figure out if I can adapt SRCU for.
> > 
> > Within bcachefs, currently struct btree obects, and now also btree_cached_key,
> > aren't freed until the filesystem is torn down, because btree iterators contain
> > pointers to them and will drop and retake locks (iff a sequence number hasn't
> > changed) without holding a ref. With the btree key cache code, this is now
> > something I need to fix.
> > 
> > What I plan on doing is having struct btree_trans (container for btree
> > iterators) hold an srcu read lock while it's alive. But, I don't want to just
> > use call_srcu to free the btree/btree_cached_key objects, because btree trans
> > objects can at times be fairly long lived and the existing code can reuse these
> > objects for other btree nodes/btree cached keys immediately. Freeing them with
> > call_srcu() would break that; I could have my callback function check if the
> > object has been reused before freeing, but I'd still have a problem when the
> > object gets freed a second time before the first call_scru() has finished.
> > 
> > What I'm wondering is if the SRCU code has a well defined notion of a clock that
> > I could make use of. What I would like to do is, instead of doing call_srcu() to
> > free the object - just mark that object with the current SRCU context time, and
> > then when my shrinkers run they can free objects that haven't been reused and
> > are old enough according the the current SRCU time.
> > 
> > Thoughts?
> 
> An early prototype is available on -rcu [1].  The Tree SRCU version
> seems to be reasonably reliable, but I do not yet trust the Tiny SRCU
> implementation.  So please avoid !SMP&&!PREEMPT if you would like to
> give it a try.  Unless you -really- like helping me find bugs, in which
> case full speed ahead!!!
> 
> Here is the API:
> 
> unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
> 
> 	Returns a "cookie" that can be thought of as a snapshot of your
> 	"clock" above.	(SRCU calls it a "grace-period sequence number".)
> 	Also ensures that enough future grace periods happen to eventually
> 	make the grace-period sequence number reach the cookie.
> 
> bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
> 
> 	Given a cookie from start_poll_synchronize_srcu(), returns true if
> 	at least one full SRCU grace period has elapsed in the meantime.
> 	Given finite SRCU readers in a well-behaved kernel, the following
> 	code will complete in finite time:
> 
> 		cookie = start_poll_synchronize_srcu(&my_srcu);
> 		while (!poll_state_synchronize_srcu(&my_srcu, cookie))
> 			schedule_timeout_uninterruptible(1);
> 
> unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
> 
> 	Like start_poll_synchronize_srcu(), except that it does not start
> 	any grace periods.  This means that the following code is -not-
> 	guaranteed to complete:
> 
> 		cookie = get_state_synchronize_srcu(&my_srcu);
> 		while (!poll_state_synchronize_srcu(&my_srcu, cookie))
> 			schedule_timeout_uninterruptible(1);
> 
> 	Use this if you know that something else will be starting the
> 	needed SRCU grace periods.  This might also be useful if you
> 	had items that were likely to be reused before the SRCU grace
> 	period elapsed, so that you avoid burning CPU on SRCU grace
> 	periods that prove to be unnecessary.  Or if you don't want
> 	to have more than (say) 100 SRCU grace periods per seconds,
> 	in which case you might use a timer to start the grace periods.
> 	Or maybe you don't bother starting the SRCU grace period until
> 	some sort of emergency situation has arisen.  Or...
> 
> 	OK, maybe you don't need it, but I do need it for rcutorture,
> 	so here it is anyway.
> 
> All of these can be invoked anywhere that call_srcu() can be invoked.
> 
> Does this look like it will work for you?

Oh, and due to historical inertia, Tiny SRCU's grace-period sequence
number is only 16 bits.  I can change this easily, but I need to know
that it is a real problem for you before I can do so.

The potential problem for you is that if you let a given cookie lie
dormant for 16384 grace periods, it will take another 16385 grace
periods for get_state_synchronize_srcu() to say that a grace period
has elapsed.

In contrast, Tree SRCU's grace-period sequence number is either 32 bits
or 64 bits, depending on the size of unsized long.

							Thanx, Paul

> [1] git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux