Re: ftrace global trace_pipe_raw

Steven Rostedt <rostedt@xxxxxxxxxxx> · Mon, 8 Oct 2018 12:16:18 -0400

On Mon, 8 Oct 2018 18:07:49 +0200
Claudio <claudio.fontana@xxxxxxxxx> wrote:

> Hello Steven,
> 
> On 07/24/2018 04:25 PM, Steven Rostedt wrote:
> > On Tue, 24 Jul 2018 10:23:16 -0400
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >   
> >>>
> >>> Would work in the direction of adding a global trace_pipe_raw be considered
> >>> for inclusion?    
> >>
> >> The design of the lockless ring buffer requires that a writer not be
> >> preempted, and that the data not be written from more than one location. To
> >> do so, we make a per CPU buffer, and disable preemption when writing.
> >> This means that we have only one writer at a time. It can handle
> >> interrupts and NMIs, because they will finish before they return and
> >> this doesn't break the algorithm. But having writers from multiple CPUs
> >> would require locking or other heavy synchronization operations that
> >> will greatly reduce the speed of writing to the buffers (not to mention
> >> the cache thrashing).  
> > 
> > And why would you need a single buffer? Note, we are working on making
> > libtracecmd.so that will allow applications to read the buffers and the
> > library will take care of the interleaving of the raw data. This should
> > hopefully be ready in about three months or so.
> > 
> > -- Steve
> >   
> 
> Is this something you will showcase at the Linux Tracing Summit?
> Is there a repo / branch I should be following?

We are preparing the code in tools/lib/traceevent of the Linux kernel
to turn that into a library.

At the same time, we are looking at making libtracecmd (or perhaps we'll
call it libftrace?) to implement all the trace-cmd code as a library as
well. That work is happening in the main trace-cmd repo:

 git://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git

> 
> The reason we need to end up with a single stream of events is to be
> able to do "online" task state correlation and timing parameter calculations
> for all task-related events, independent of cores.

Well, all the events are timestamped, and you can pick different clocks
to use, and a simple merge sort gives all the information you need.
Note, having per-CPU buffers makes things much more efficient, as you
don't need to synchronize with atomics.
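
As a rough illustration of that merge (a minimal sketch, not trace-cmd or
libtracecmd code): assume each per-CPU stream has already been decoded into
records that are sorted by timestamp within that CPU and that all CPUs use
the same trace clock. The Event type and merge_per_cpu() name are
hypothetical, made up for this example.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical decoded trace record (not a real ftrace structure)."""
    timestamp: int  # e.g. nanoseconds from the selected trace clock
    cpu: int
    name: str


def merge_per_cpu(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Lazily merge per-CPU streams into one stream ordered by timestamp.

    Each input stream must already be sorted by timestamp. heapq.merge
    only looks at the head of each stream, so it also works "online" on
    generators that yield events as they are read.
    """
    return heapq.merge(*streams, key=lambda ev: ev.timestamp)


# Two small, made-up per-CPU streams.
cpu0 = [Event(100, 0, "sched_switch"), Event(250, 0, "sched_wakeup")]
cpu1 = [Event(120, 1, "sched_wakeup"), Event(300, 1, "sched_switch")]

merged = list(merge_per_cpu([cpu0, cpu1]))
```

Events with equal timestamps come out in the order the streams were passed
in, so a consumer that needs a different tie-break has to add one itself.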

-- Steve

> 
> Currently we have this on QNX, and we are trying to enable it for Linux as well.
> 
> Thank you,
> 
> Claudio