Re: rcu pending

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Mon, 19 Aug 2024 07:03:05 -0700

On Mon, Aug 19, 2024 at 12:46:34AM -0400, Kent Overstreet wrote:
> On Sun, Aug 18, 2024 at 09:40:27PM GMT, Paul E. McKenney wrote:
> > On Mon, Aug 19, 2024 at 12:16:42AM -0400, Kent Overstreet wrote:
> > > You can't use a fixed number of callback heads if there's going to be an
> > > unbounded number of callback heads outstanding.
> > 
> > The number of rcu_head structures is your choice, based on your choice
> > of data structure.  You can for example link together data elements that
> > have the same value of get_state_synchronize_rcu() cookie, and use a
> > single rcu_head structure for that group.  You could then do whatever
> > you want for the linking.
> > 
> > But even if you do choose to have a large number of rcu_head structures,
> > perhaps one per data element, there is no law saying that each and every
> > one of them needs to be passed to call_rcu().  For example, kfree_rcu()
> > requires an rcu_head structure in the objects passed to it, but in the
> > common (non-OOM) case, those structures go unused in favor of pages
> > of pointers.
> > 
> > So what were you really trying to get across to me here?
> 
> Like I said, this needs to run in a fixed amount of memory, so your
> proposed algorithm doesn't work for having one rcu_head per seq with
> pending objects.

Please define "fixed amount of memory" in this context.  Given the
information you have provided me, you might mean any number of things,
including the following:

1.	An upper bound on the time from the end of a grace period to
	the point at which you are notified of the end of that grace
	period.

2.	An upper bound on the time from the end of a grace period to
	the point at which you are notified of the end of that grace
	period when there is a "flood" of blocks of memory that start
	waiting for their grace period.

3.	An upper bound on the memory waiting to be freed as a function
	of the maximum duration of any reader, assuming some maximum rate
	at which blocks of memory start waiting for their grace period.

4.	An upper bound on the memory waiting to be freed as a function
	of the maximum duration of any reader, independent of the rate
	at which blocks of memory start waiting for their grace period.

5.	A guarantee that any block that is not actively in use by some
	reader can be freed, albeit at some additional read-side and
	update-side expense, and some added read-side complexity.

6.	Beyond this point, as far as I know, you would need to be living
	in a different universe having rather different laws of physics.

The definition I was working with is #2 above.

#3 requires that your code provide a maximum upper bound for the execution
time of all readers.  Even given such an upper bound, I don't know of
a way of doing #4 without massive read-side memory contention, with the
accompanying lack of performance and scalability.

#5 might (or might not) be addressed by something like hazard pointers.
The read-side computational expense is a full memory barrier in the
counterpart to rcu_dereference(), and also that rcu_dereference() can
say "no", which would require the reader to restart its traversal from
the beginning.  (This happens when a reader races with an updater that
removes the element that the reader was to traverse next.)  There is an
early prototype of hazard pointers under development, but in my opinion,
it is not yet production-ready.

So exactly what are you looking for here?

							Thanx, Paul