Re: [PATCH] rcu: Make jiffies_till_sched_qs writable

Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> · Fri, 12 Jul 2019 09:43:11 -0400

On Fri, Jul 12, 2019 at 06:02:42AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 12, 2019 at 08:51:16AM -0400, Joel Fernandes wrote:
> > On Fri, Jul 12, 2019 at 03:32:40PM +0900, Byungchul Park wrote:
> > > On Thu, Jul 11, 2019 at 03:58:39PM -0400, Joel Fernandes wrote:
> > > > Hmm, speaking of grace period durations, it seems to me the maximum grace
> > > > period ever is recorded in rcu_state.gp_max. However it is not read from
> > > > anywhere.
> > > > 
> > > > Any idea why it was added but not used?
> > > > 
> > > > I am interested in dumping this value just for fun, and seeing what I get.
> > > > 
> > > > I wonder also it is useful to dump it in rcutorture/rcuperf to find any
> > > > issues, or even expose it in sys/proc fs to see what worst case grace periods
> > > > look like.
> > > 
> > > Hi,
> > > 
> > > 	commit ae91aa0adb14dc33114d566feca2f7cb7a96b8b7
> > > 	rcu: Remove debugfs tracing
> > > 
> > > removed all debugfs tracing, gp_max also included.
> > > 
> > > And you sounds great. And even looks not that hard to add it like,
> > > 
> > > :)
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index ad9dc86..86095ff 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1658,8 +1658,10 @@ static void rcu_gp_cleanup(void)
> > >  	raw_spin_lock_irq_rcu_node(rnp);
> > >  	rcu_state.gp_end = jiffies;
> > >  	gp_duration = rcu_state.gp_end - rcu_state.gp_start;
> > > -	if (gp_duration > rcu_state.gp_max)
> > > +	if (gp_duration > rcu_state.gp_max) {
> > >  		rcu_state.gp_max = gp_duration;
> > > +		trace_rcu_grace_period(something something);
> > > +	}
> > 
> > Yes, that makes sense. But I think it is much better off as a readable value
> > from a virtual fs. The drawback of tracing for this sort of thing are:
> >  - Tracing will only catch it if tracing is on
> >  - Tracing data can be lost if too many events, then no one has a clue what
> >    the max gp time is.
> >  - The data is already available in rcu_state::gp_max so copying it into the
> >    trace buffer seems a bit pointless IMHO
> >  - It is a lot easier on ones eyes to process a single counter than process
> >    heaps of traces.
> > 
> > I think a minimal set of RCU counters exposed to /proc or /sys should not
> > hurt and could do more good than not. The scheduler already does this for
> > scheduler statistics. I have seen Peter complain a lot about new tracepoints
> > but not much (or never) about new statistics.
> > 
> > Tracing has its strengths but may not apply well here IMO. I think a counter
> > like this could be useful for tuning of things like the jiffies_*_sched_qs,
> > the stall timeouts and also any other RCU knobs. What do you think?
> 
> Is this one of those cases where eBPF is the answer, regardless of
> the question?  ;-)

It could be. Except that people who fancy busybox still could benefit from
the counter ;-)

And also, eBPF uses RCU itself heavily, so it would help if the debug related
counter itself didn't depend on it. ;-)

thanks,

- Joel