On Mon, 8 Oct 2018 18:07:49 +0200
Claudio <claudio.fontana@xxxxxxxxx> wrote:

> Hello Steven,
>
> On 07/24/2018 04:25 PM, Steven Rostedt wrote:
> > On Tue, 24 Jul 2018 10:23:16 -0400
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >>>
> >>> Would work in the direction of adding a global trace_pipe_raw be considered
> >>> for inclusion?
> >>
> >> The design of the lockless ring buffer requires not to be preempted,
> >> and that the data cannot be written to from more than one location. To
> >> do so, we make a per CPU buffer, and disable preemption when writing.
> >> This means that we have only one writer at a time. It can handle
> >> interrupts and NMIs, because they will finish before they return and
> >> this doesn't break the algorithm. But having writers from multiple CPUs
> >> would require locking or other heavy synchronization operations that
> >> will greatly reduce the speed of writing to the buffers (not to mention
> >> the cache thrashing).
> >
> > And why would you need a single buffer? Note, we are working on making
> > libtracecmd.so that will allow applications to read the buffers and the
> > library will take care of the interleaving of the raw data. This should
> > hopefully be ready in about three months or so.
> >
> > -- Steve
> >
>
> Is this something you will showcase in the linux tracing summit?
> Is there a repo / branch I should be following?

We are preparing the code in tools/lib/traceevent of the Linux kernel
to turn that into a library. At the same time, we are looking at making
libtracecmd (or perhaps we'll call it libftrace?) to implement all the
trace-cmd code as a library as well.

But that's happening in the main trace-cmd repo:

 git://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git

>
> The reason why we need to end up with a single stream of events is to be
> able to do "online" task state correlation and timing parameters calculations
> for all task-related events independent of cores.
Well, all the events are timestamped, and you can pick different clocks
to use, and a simple merge sort gives all the information you need.
Note, having per cpu buffers makes things much more efficient as you
don't need to do synchronizing with atomics.

-- Steve

>
> Currently we have this on QNX, and we are trying to enable it for Linux as well.
>
> Thank you,
>
> Claudio
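
For illustration only, here is a minimal sketch of the merge by timestamp
mentioned in the reply above. It is not trace-cmd or libtraceevent code.
It assumes each per-CPU stream has already been decoded into records that
are sorted by timestamp within that CPU, and that the chosen trace clock
yields timestamps that are comparable across CPUs. The Event type, its
fields, and merge_per_cpu() are hypothetical names made up for this example.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    # Hypothetical decoded record; real trace data carries more fields.
    ts: int    # timestamp from whichever trace clock was selected
    cpu: int   # CPU whose buffer the event was read from
    name: str  # event name


def merge_per_cpu(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Lazily merge per-CPU streams into one stream ordered by timestamp.

    Each input stream must already be sorted by ts. Only one pending
    event per stream is held at a time, so this also works "online"
    on generators that yield events as they are read.
    """
    return heapq.merge(*streams, key=lambda ev: ev.ts)


# Two small per-CPU streams with made-up timestamps.
cpu0 = [Event(100, 0, "sched_switch"), Event(250, 0, "sched_wakeup")]
cpu1 = [Event(120, 1, "sched_wakeup"), Event(300, 1, "sched_switch")]

merged = list(merge_per_cpu([cpu0, cpu1]))
for ev in merged:
    print(ev.ts, ev.cpu, ev.name)
```

The merge happens entirely on the reader side, so the writers keep their
lockless per-CPU buffers and the single ordered stream is built only where
it is consumed.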