On Sun, Apr 26, 2009 at 10:22:55PM +0200, Ingo Molnar wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
> 
> > * Ingo Molnar (mingo@xxxxxxx) wrote:
> > > 
> > > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > 
> > > > Second cut of "big hammer" expedited RCU grace periods, but only
> > > > for rcu_bh.  This creates another softirq vector, so that entering
> > > > this softirq vector will have forced an rcu_bh quiescent state (as
> > > > noted by Dave Miller).  Use smp_call_function() to invoke
> > > > raise_softirq() on all CPUs in order to cause this to happen.
> > > > Track the CPUs that have passed through a quiescent state (or gone
> > > > offline) with a cpumask.
> > > 
> > > hm, i'm still asking whether doing this would be simpler via a
> > > reschedule vector - which not only is an existing facility but also
> > > forces all RCU domains through a quiescent state - not just bh-RCU
> > > participants.
> > > 
> > > Triggering a new softirq is in no way simpler than doing an SMP
> > > cross-call - in fact softirqs are a finite resource, so using some
> > > other facility would be preferred.
> > > 
> > > Am i missing something?
> > 
> > I think the reason for this whole thread is that waiting for an RCU
> > quiescent state, when called many times, e.g. in multiple iptables
> > invocations, takes too long (5 seconds to load the netfilter
> > rules at boot). [...]
> 
> I'm aware of the problem space.
> 
> I was suggesting that to trigger the quiescent state and to wait for
> it to propagate, it would be enough to reuse the reschedule
> mechanism.
> 
> It would be relatively straightforward: first a send-reschedule, then
> do a wait_task_context_switch() on rq->curr - both are existing
> primitives.  (a task reference has to be taken, but that's pretty much
> all)

Well, one reason I didn't take this approach was that I didn't happen
to think of it.  ;-)  Also, I hadn't heard of wait_task_context_switch().

Hmmm...  Looking for wait_task_context_switch().  OK, found it.  It
looks to me as though this primitive won't return until the scheduler
actually decides to run something else.  We instead need something
that stops waiting as soon as the CPU enters the scheduler, hence the
previous thought of making rcu_qsctr_inc() do a bit of extra work.

This would be a way of implementing an expedited RCU-sched across all
RCU implementations.  As noted in the earlier email, it would not
handle RCU or RCU-bh in a -rt kernel.

> By the time wait_task_context_switch() returns from the last CPU we 
> know that the quiescent state has passed.

We would want to wait for all of the CPUs in parallel, though, wouldn't
we?  It seems that we would not want to wait for the last CPU to do
another trip through the scheduler if it had already passed through the
scheduler while we were waiting on the earlier CPUs.  So it seems like
we would still want a two-pass approach -- one pass to capture the
current state, and a second pass to wait for that state to change.

							Thanx, Paul
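
A minimal sketch of the "big hammer" scheme from the quoted patch
summary -- a dedicated softirq raised on every CPU via
smp_call_function(), with a cpumask recording which CPUs have passed
through an rcu_bh quiescent state.  This is not the actual patch: the
vector RCU_EXPEDITED_SOFTIRQ and the function names are made up for
illustration, and CPU-hotplug handling and mutual exclusion between
callers are omitted.

#include <linux/interrupt.h>
#include <linux/cpumask.h>
#include <linux/smp.h>
#include <linux/sched.h>
#include <linux/init.h>

/* CPUs that have not yet passed through an rcu_bh quiescent state. */
static struct cpumask rcu_bh_expedited_cpus;

/* Softirq handler: reaching softirq context is itself the quiescent state. */
static void rcu_bh_expedited_action(struct softirq_action *unused)
{
	cpumask_clear_cpu(smp_processor_id(), &rcu_bh_expedited_cpus);
}

/* IPI handler: force this CPU into the new softirq vector. */
static void rcu_bh_expedited_ipi(void *unused)
{
	raise_softirq(RCU_EXPEDITED_SOFTIRQ);	/* hypothetical vector number */
}

void synchronize_rcu_bh_expedited_sketch(void)
{
	cpumask_copy(&rcu_bh_expedited_cpus, cpu_online_mask);

	/* Make every other CPU raise the softirq, then this CPU as well. */
	smp_call_function(rcu_bh_expedited_ipi, NULL, 1);
	raise_softirq(RCU_EXPEDITED_SOFTIRQ);

	/* Wait for every CPU to report its quiescent state. */
	while (!cpumask_empty(&rcu_bh_expedited_cpus))
		schedule_timeout_uninterruptible(1);
}

static int __init rcu_bh_expedited_init(void)
{
	/* For real use, the vector would have to be added to the softirq enum. */
	open_softirq(RCU_EXPEDITED_SOFTIRQ, rcu_bh_expedited_action);
	return 0;
}
early_initcall(rcu_bh_expedited_init);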
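
And a sketch of the two-pass idea in the last paragraph above, assuming
a hypothetical per-CPU counter (qs_counter) that rcu_qsctr_inc() would
bump each time the CPU passes through the scheduler.  Pass one snapshots
every online CPU's counter and pokes it with the reschedule IPI Ingo
mentions; pass two waits only for counters that have not yet moved past
their snapshot, so a CPU that already went through the scheduler while
we were waiting on earlier CPUs costs no additional wait.

#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/cpu.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/sched.h>

/* Hypothetical counter, bumped from rcu_qsctr_inc() on each quiescent state. */
DEFINE_PER_CPU(unsigned long, qs_counter);

void wait_for_sched_qs_sketch(void)
{
	unsigned long *snap;
	int cpu;

	snap = kcalloc(nr_cpu_ids, sizeof(*snap), GFP_KERNEL);
	if (!snap)
		return;

	get_online_cpus();

	/* Pass 1: capture the current state of every online CPU. */
	for_each_online_cpu(cpu) {
		snap[cpu] = per_cpu(qs_counter, cpu);
		smp_send_reschedule(cpu);	/* poke it into the scheduler */
	}

	/* Pass 2: wait, in parallel, for each CPU's state to change. */
	for_each_online_cpu(cpu)
		while (per_cpu(qs_counter, cpu) == snap[cpu])
			schedule_timeout_uninterruptible(1);

	put_online_cpus();
	kfree(snap);
}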