* Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Second cut of "big hammer" expedited RCU grace periods, but only
> > for rcu_bh.  This creates another softirq vector, so that entering
> > this softirq vector will have forced an rcu_bh quiescent state (as
> > noted by Dave Miller).  Use smp_call_function() to invoke
> > raise_softirq() on all CPUs in order to cause this to happen.
> > Track the CPUs that have passed through a quiescent state (or gone
> > offline) with a cpumask.
>
> hm, i'm still asking whether doing this would be simpler via a
> reschedule vector - which not only is an existing facility but also
> forces all RCU domains through a quiescent state - not just bh-RCU
> participants.
>
> Triggering a new softirq is in no way simpler than doing an SMP
> cross-call - in fact softirqs are a finite resource so using some
> other facility would be preferred.
>
> Am i missing something?
>

I think the reason for this whole thread is that waiting for an RCU
quiescent state, when done many times, e.g. across multiple iptables
invocations, takes too long (5 seconds to load the netfilter rules at
boot).

The three solutions proposed so far are:

- bh disabling + per-cpu read-write lock

- RCU FGP (fast grace periods), where the writer directly checks the
  per-cpu variables associated with netfilter to make sure the
  quiescent state for a particular resource has been reached (derived
  from my userspace RCU implementation)

- expedited "big hammer" RCU GP, where the writer only waits for a bh
  quiescent state. This is useful if we can guarantee that all readers
  are either in bh context or disable bottom halves.

Therefore, it's really on purpose that Paul does not wait for global
RCU quiescent states, but rather just for bh: it's faster.

IMHO, the bh RCU GP shares the same problem as the global RCU GP
approach: it monitors global kernel state to ensure quiescence.
It's better in practice because bh quiescent states are much more
frequent than global QS, but it still depends on every other bh handler
and on the duration of bh-disabled sections to bound the maximum writer
delay. One might argue that if we keep those small, this should not
matter in practice.

The RCU FGP approach is interesting because it's based solely on
netfilter-specific per-cpu variables to detect QS. Therefore, even if an
unrelated piece of kernel software eventually decides to be a bad
citizen and disable bh for long periods on a 4096-CPU box, it won't slow
down the netfilter table updates.

This last positive aspect of RCU FGP is shared with the bh disabling +
per-cpu rwlock approach, where the rwlock is also local to netfilter.
However, taking an rwlock and disabling bh will make the read side much
slower than RCU FGP (which simply disables preemption and touches a
per-cpu GP/nesting count variable). But given that RCU FGP is relatively
new, it makes sense to use a known-good solution in the short term.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68