Re: [PATCH RFC tip/core/rcu 0/16] Prototype RCU usable from idle, exception, offline

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Mon, 16 Mar 2020 08:39:42 -0700

On Mon, Mar 16, 2020 at 03:45:36PM +0100, Frederic Weisbecker wrote:
> On Fri, Mar 13, 2020 at 08:42:43AM -0700, Paul E. McKenney wrote:
> > On Fri, Mar 13, 2020 at 03:41:46PM +0100, Frederic Weisbecker wrote:
> > > On Thu, Mar 12, 2020 at 11:16:18AM -0700, Paul E. McKenney wrote:
> > > > Hello!
> > > > 
> > > > This series provides two variants of Tasks RCU, a rude variant inspired
> > > > by Steven Rostedt's use of schedule_on_each_cpu(), and a tracing variant
> > > > requested by the BPF folks and perhaps also of use for other tracing
> > > > use cases.
> > > > 
> > > > The tracing variant has explicit read-side markers to permit finite grace
> > > > periods even given in-kernel loops in PREEMPT=n builds.  It also protects
> > > > code in the idle loop, on exception entry/exit paths, and on the various
> > > > CPU-hotplug online/offline code paths, thus having protection properties
> > > > similar to SRCU.  However, unlike SRCU, this variant avoids expensive
> > > > instructions in the read-side primitives, thus having read-side overhead
> > > > similar to that of preemptible RCU.
> > > > 
> > > > There are of course downsides.  The grace-period code can send IPIs to
> > > > CPUs, even when those CPUs are in the idle loop or in nohz_full userspace.
> > > > It is necessary to scan the full tasklist, much as for Tasks RCU.  There
> > > > is a single callback queue guarded by a single lock, again, much as for
> > > > Tasks RCU.  If needed, these downsides can be at least partially remedied
> > > 
> > > So what we trade to fix the issues we are having with tracing against extended
> > > grace periods, we lose in CPU isolation. That worries me a bit as tracing can
> > > be thoroughly used with nohz_full and CPU isolation.
> > 
> > First, disturbing nohz_full CPUs can be avoided by the sysadm simply
> > refusing to remove tracepoints while sensitive applications are running
> > on nohz_full CPUs.
> 
> So, in that case we'll need to modify the tools such as perf tools to avoid
> releasing the related buffers until we are ready to do so.
> 
> That's possible but it's kind of an ABI breakage. Also what if there is a
> long running service on that nohz_full CPU polling on the networking card...

In the near term, I do admit that Mathieu's point about using smp_mb()
in readers but only on nohz_full CPUs is attractive.
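
For concreteness, here is a rough userspace sketch of that idea, not
the patch itself.  It assumes a per-CPU nohz_full flag and a per-CPU
reader-nesting counter, and uses a C11 seq_cst fence as a stand-in for
smp_mb().  Every name in it (sketch_cpu, sketch_read_lock(),
sketch_gp_needs_ipi(), and so on) is hypothetical and chosen only for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical per-CPU state, not the kernel's actual data structures. */
struct sketch_cpu {
	bool nohz_full;			/* assumed fixed before readers run */
	atomic_int reader_nesting;	/* read-side nesting depth */
};

static struct sketch_cpu sketch_cpus[SKETCH_NR_CPUS];

static struct sketch_cpu *sketch_cpu_ptr(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return &sketch_cpus[cpu];
}

/* Reader entry: pay for a full fence only on nohz_full CPUs. */
static void sketch_read_lock(int cpu)
{
	struct sketch_cpu *sc = sketch_cpu_ptr(cpu);

	atomic_fetch_add_explicit(&sc->reader_nesting, 1, memory_order_relaxed);
	if (sc->nohz_full)
		atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
}

/* Reader exit: fence first so the critical section is ordered before the decrement. */
static void sketch_read_unlock(int cpu)
{
	struct sketch_cpu *sc = sketch_cpu_ptr(cpu);

	if (sc->nohz_full)
		atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	atomic_fetch_sub_explicit(&sc->reader_nesting, 1, memory_order_relaxed);
}

/*
 * Grace-period side.  Returns true if this CPU would still need an IPI.
 * For a nohz_full CPU it returns false and stores a remote sample of the
 * reader state in *in_reader, relying on the reader-side fences above.
 */
static bool sketch_gp_needs_ipi(int cpu, bool *in_reader)
{
	struct sketch_cpu *sc = sketch_cpu_ptr(cpu);

	if (!sc->nohz_full)
		return true;
	atomic_thread_fence(memory_order_seq_cst);
	*in_reader = atomic_load_explicit(&sc->reader_nesting,
					  memory_order_relaxed) != 0;
	return false;
}
```

The tradeoff in this sketch is that readers on nohz_full CPUs pay for
the fences so that the grace-period side can leave those CPUs
undisturbed, while readers everywhere else keep the cheap path.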

I have some other ideas, but simplicity has its advantages, and if no
one complains, perhaps those advantages are also good for the long term.

							Thanx, Paul