Re: [RFC v1 01/14] rcu: Add a lock-less lazy RCU implementation

On Tue, May 31, 2022 at 09:45:34AM -0700, Paul E. McKenney wrote:
[..] 
> > Example:
> > 1. Say 5 lazy CBs queued onto bypass list (while the regular cblist is
> > empty).
> > 2. Now say 10000 non-lazy CBs are queued. As per the comments, these
> > have to go to the bypass list to keep rcu_barrier() from breaking.
> > 3. Because this causes the bypass list to overflow, all the lazy +
> > non-lazy CBs have to flushed to the main -cblist.
> > 
> > If only the non-lazy CBs are flushed, rcu_barrier() might break. If all
> > are flushed, then the lazy ones lose their laziness property as RCU will
> > be immediately kicked off to process GPs on their behalf.
> 
> Exactly why is this loss of laziness a problem?  You are doing that
> grace period for the 10,000 non-lazy callbacks anyway, so what difference
> could the five lazy callbacks possibly make?

It does not make any difference; I kind of answered my own question there. I
was thinking out loud in this thread (sorry).

> > This can be fixed by making rcu_barrier() queue both a lazy and a non-lazy
> > CB, and only flushing the non-lazy CBs to the ->cblist on a bypass list
> > overflow, I think.
> 
> I don't see anything that needs fixing.  If you are doing a grace period
> anyway, just process the lazy callbacks along with the non-lazy callbacks.
> After all, you are paying for that grace period anyway.  And handling
> the lazy callbacks with that grace period means that you don't need a
> later grace period for those five lazy callbacks.  So running the lazy
> callbacks into the grace period required by the non-lazy callbacks is
> a pure win, right?
> 
> If it is not a pure win, please explain exactly what is being lost.

Agreed. As discussed on IRC, we only need to care about increments of the lazy
length, and a flush will drop it back to 1 or 0. No need to design for partial
flushing for now, as there is no use case.
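
To make that concrete, here is a rough userspace-style sketch of the counting
I have in mind. All of the names and types below are made up for illustration
(this is not the kernel API), and I have left out the detail that a flush can
take the CB currently being queued along with it, which is why the lazy count
can end up at 1 rather than 0:

#include <stdbool.h>

/* Toy model: the bypass list only tracks *how many* queued CBs are lazy. */
struct bypass_list {
        long len;       /* total CBs sitting on the bypass list */
        long lazy_len;  /* how many of those were queued lazily */
};

/* Queueing only ever increments lazy_len (when the new CB is lazy). */
static void bypass_enqueue(struct bypass_list *bp, bool lazy)
{
        bp->len++;
        if (lazy)
                bp->lazy_len++;
}

/*
 * Flushing moves everything (lazy and non-lazy alike) to the main ->cblist,
 * so both counters drop straight to zero: no partial flush, and no need to
 * remember which individual entries were lazy.
 */
static long bypass_flush(struct bypass_list *bp)
{
        long flushed = bp->len;

        bp->len = 0;
        bp->lazy_len = 0;
        return flushed;
}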

> > Or, we flush both lazy and non-lazy CBs to the ->cblist just to keep it
> > simple. I think that should be OK since if there are a lot of CBs queued
> > in a short time, I don't think there is much opportunity for power
> > savings anyway IMHO.
> 
> I believe that it will be simpler, faster, and more energy efficient to
> do it this way, flushing everything from the bypass list to ->cblist.
> Again, leaving the lazy callbacks lying around means that there must be a
> later battery-draining grace period that might not be required otherwise.

Perfect.
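
And for the overflow scenario from the top of the thread, the agreed behaviour
would look roughly like this, building on the sketch above (again purely
illustrative: the limit is made up, and start_grace_period() is just a
stand-in for actually kicking off RCU):

#define BYPASS_OVERFLOW_LIMIT 10000     /* made-up threshold for the sketch */

struct cpu_state {
        struct bypass_list bypass;
        long cblist_len;                /* stand-in for the main ->cblist */
};

static void start_grace_period(void) { /* stand-in for starting a GP */ }

static void queue_callback(struct cpu_state *cs, bool lazy)
{
        bypass_enqueue(&cs->bypass, lazy);

        /*
         * Heavy non-lazy traffic already costs a grace period, so when the
         * bypass list overflows, the lazy CBs are simply flushed along with
         * everything else and share that same GP.
         */
        if (!lazy && cs->bypass.len > BYPASS_OVERFLOW_LIMIT) {
                cs->cblist_len += bypass_flush(&cs->bypass);
                start_grace_period();
        }
}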

> > >> Currently the struct looks like this:
> > >>
> > >> struct rcu_segcblist {
> > >>         struct rcu_head *head;
> > >>         struct rcu_head **tails[RCU_CBLIST_NSEGS];
> > >>         unsigned long gp_seq[RCU_CBLIST_NSEGS];
> > >> #ifdef CONFIG_RCU_NOCB_CPU
> > >>         atomic_long_t len;
> > >> #else
> > >>         long len;
> > >> #endif
> > >>         long seglen[RCU_CBLIST_NSEGS];
> > >>         u8 flags;
> > >> };
> > >>
> > >> So now, it would need to be like this?
> > >>
> > >> struct rcu_segcblist {
> > >>         struct rcu_head *head;
> > >>         struct rcu_head **tails[RCU_CBLIST_NSEGS];
> > >>         unsigned long gp_seq[RCU_CBLIST_NSEGS];
> > >> #ifdef CONFIG_RCU_NOCB_CPU
> > >>         struct rcu_head *lazy_head;
> > >>         struct rcu_head **lazy_tails[RCU_CBLIST_NSEGS];
> > >>         unsigned long lazy_gp_seq[RCU_CBLIST_NSEGS];
> > >>         atomic_long_t lazy_len;
> > >> #else
> > >>         long len;
> > >> #endif
> > >>         long seglen[RCU_CBLIST_NSEGS];
> > >>         u8 flags;
> > >> };
> > > 
> > > I freely confess that I am not loving this arrangement.  Large increase
> > > in state space, but little benefit that I can see.  Again, what am I
> > > missing here?
> > 
> > I somehow thought tracking GPs separately for the lazy CBs would require
> > duplicating the rcu_head pointers/double pointers in this struct. As you
> > pointed out, just tracking the lazy length may be sufficient.
> 
> Here is hoping!
> 
> After all, if you thought that taking care of applications that need
> expediting of grace periods is scary, well, now...

Haha... my fear is that I don't know all the applications requiring expedited
GPs, and I keep getting surprised by new RCU usages that pop up in the system,
or by entirely new systems.

For one, a number of tools and processes use ftrace directly on the system, and
it may not be practical to chase down every tool; some of them start tracing at
arbitrary times. Handling it in-kernel would be best if possible.
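
Going back to the struct question above: with only the lazy length tracked,
the layout would shrink back to roughly the following. This is just to write
out the direction we converged on, not a final layout or field placement:

struct rcu_segcblist {
        struct rcu_head *head;
        struct rcu_head **tails[RCU_CBLIST_NSEGS];
        unsigned long gp_seq[RCU_CBLIST_NSEGS];
#ifdef CONFIG_RCU_NOCB_CPU
        atomic_long_t len;
        atomic_long_t lazy_len;   /* the only new state: count of lazy CBs */
#else
        long len;
#endif
        long seglen[RCU_CBLIST_NSEGS];
        u8 flags;
};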

Productive email discussion indeed! On to writing the code :P
 
Thanks,

 - Joel


