Re: [PATCH RFC tip/core/rcu 09/16] rcu-tasks: Add an RCU-tasks rude variant

Steven Rostedt <rostedt@xxxxxxxxxxx> · Mon, 16 Mar 2020 18:03:52 -0400

On Mon, 16 Mar 2020 17:45:40 -0400
Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:

> >
> > Same for the function side (if not even more so). This would require adding
> > a srcu_read_lock() to all functions that can be traced! That would be a huge
> > kill in performance. Probably to the point no one would bother even using
> > function tracer.  
> 
> Point well taken! Thanks,

Actually, it's worse than that. (We talked about this on IRC but I wanted
it documented here too).

You can't use any type of locking, unless you insert it around all the
callers of the nops (which is unreasonable).

That is, we have gcc -pg -mfentry that creates at the start of all traced
functions:

 <some_func>:
    call __fentry__
    [code for function here]

At boot up (or even by the compiler itself) we convert that to:

 <some_func>:
    nop
    [code for function here]

When we want to trace this function we use text_poke (with current kernels)
and convert it to this:

 <some_func>:
    call trace_trampoline
    [code for function here]
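
To make the three states above concrete, here is a minimal user-space sketch of the bytes involved. It is not the kernel's text_poke code. It assumes x86-64, a little-endian host, a 5-byte "call rel32" (opcode 0xe8), and one common 5-byte nop encoding. The helper names (encode_call, encode_nop, is_nop, decode_call_target) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: x86-64, where "call rel32" and the long nop are both 5 bytes. */
#define INSN_LEN 5

/* One common 5-byte nop encoding: nopl 0x0(%rax,%rax,1). */
static const uint8_t nop5[INSN_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Hypothetical helper: put the call site back into its "not traced" state. */
static void encode_nop(uint8_t insn[INSN_LEN])
{
    memcpy(insn, nop5, INSN_LEN);
}

/* Hypothetical helper: does the call site currently hold the nop? */
static int is_nop(const uint8_t insn[INSN_LEN])
{
    return memcmp(insn, nop5, INSN_LEN) == 0;
}

/*
 * Hypothetical helper: encode "call target" as it would sit at address ip.
 * The displacement is relative to the end of the 5-byte instruction.
 */
static void encode_call(uint8_t insn[INSN_LEN], uint64_t ip, uint64_t target)
{
    int64_t rel = (int64_t)(target - (ip + INSN_LEN));
    int32_t rel32;

    /* A rel32 call can only reach targets within +/- 2 GiB. */
    assert(rel >= INT32_MIN && rel <= INT32_MAX);
    rel32 = (int32_t)rel;

    insn[0] = 0xe8;                          /* call rel32 opcode */
    memcpy(&insn[1], &rel32, sizeof(rel32)); /* little-endian host assumed */
}

/* Hypothetical helper: recover the call target from the encoded bytes. */
static uint64_t decode_call_target(const uint8_t insn[INSN_LEN], uint64_t ip)
{
    int32_t rel32;

    assert(insn[0] == 0xe8);
    memcpy(&rel32, &insn[1], sizeof(rel32));
    return ip + INSN_LEN + (uint64_t)(int64_t)rel32;
}
```

The point of the sketch is only that the nop and the call occupy the same 5 bytes, so enabling or disabling tracing is a rewrite of the first instruction of the function and nothing else. There is no room there for a lock or a reference count.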

That trace_trampoline can be allocated, which means when it's no longer
needed, it must be freed. But when do we know it's safe to free it? Here's
the issue.

 <some_func>:
    call trace_trampoline  <- interrupt happens just after the jump
    [code for function here]

Now the task has just executed the call to the trace_trampoline, which
means the instruction pointer is set to the start of the trampoline. But it
has yet to execute that trampoline.

Now suppose the task is preempted, and a real-time hog keeps it from
running for minutes at a time (which is possible!). In the meantime,
we are done with that trampoline and free it. What happens when that task
is scheduled back? There's no more trampoline to execute, even though its
instruction pointer is set to execute the first instruction of the trampoline!

I used the analogy of jumping off the cliff expecting a magic carpet to be
there to catch you, and just before you land, it disappears. That would be
a very bad day indeed!

We have no way to add a grace period between the start of a function (can
be *any* function) and the start of the trampoline. Since the problem is
that the task was involuntarily preempted before it could execute the
trampoline, and trampolines are not allowed (supposedly) to call
schedule, we have our quiescent state to track (voluntary scheduling).
When all tasks have either voluntarily scheduled or entered user space
after a trampoline has been disconnected from a function, we know that it
is safe to free the trampoline.
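
The rule in the paragraph above can be written down as a toy model. This is a sketch only and not the RCU-tasks implementation: it assumes a fixed set of tasks, ignores idle tasks, task creation and exit, and uses made-up names (struct grace_period, gp_start, gp_note_quiescent, gp_note_preempted, gp_safe_to_free).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8 /* arbitrary limit for this toy model */

/* Hypothetical per-grace-period state: one holdout flag per task. */
struct grace_period {
    bool holdout[MAX_TASKS]; /* true until the task passes a quiescent state */
    size_t nr_tasks;
};

/*
 * Called after the trampoline has been disconnected from the function.
 * Any task might already have its instruction pointer on the trampoline,
 * so every task starts out as a holdout.
 */
static void gp_start(struct grace_period *gp, size_t nr_tasks)
{
    assert(nr_tasks <= MAX_TASKS);
    gp->nr_tasks = nr_tasks;
    for (size_t i = 0; i < nr_tasks; i++)
        gp->holdout[i] = true;
}

/* The task voluntarily scheduled or entered user space. */
static void gp_note_quiescent(struct grace_period *gp, size_t task)
{
    assert(task < gp->nr_tasks);
    gp->holdout[task] = false;
}

/*
 * The task was involuntarily preempted. It may be sitting on the first
 * instruction of the trampoline, so this deliberately changes nothing.
 */
static void gp_note_preempted(struct grace_period *gp, size_t task)
{
    assert(task < gp->nr_tasks);
    (void)gp;
    (void)task;
}

/* The trampoline may be freed only once no task is a holdout. */
static bool gp_safe_to_free(const struct grace_period *gp)
{
    for (size_t i = 0; i < gp->nr_tasks; i++)
        if (gp->holdout[i])
            return false;
    return true;
}
```

Note that gp_note_preempted() is a no-op on purpose: a preempted task that sits runnable for minutes keeps the trampoline alive for those minutes, which is exactly the case described above.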

-- Steve