On Tue, Jun 22, 2021 at 3:37 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Mon, 14 Jun 2021 10:50:29 +0300
> "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@xxxxxxxxx> wrote:
>
> > Updated the trace.dat man page with the changes related to file
> > version 7 and compression.
> >
> > Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@xxxxxxxxx>
> > ---
> > Documentation/trace-cmd/trace-cmd.dat.5.txt | 56 ++++++++++++++++++---
> > 1 file changed, 50 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/trace-cmd/trace-cmd.dat.5.txt b/Documentation/trace-cmd/trace-cmd.dat.5.txt
> > index 8d285353..e80d460e 100644
> > --- a/Documentation/trace-cmd/trace-cmd.dat.5.txt
> > +++ b/Documentation/trace-cmd/trace-cmd.dat.5.txt
> > @@ -52,6 +52,23 @@ INITIAL FORMAT
> > The next 4 bytes are a 32-bit word that defines what the traced
> > host machine page size was.
> >
> > + If the file version is 7 or greater, the compression header is
> > + written next:
> > + "name version\0"
>
> I wonder if we should make it: "name\0version\0"
>
> Also, I think "none" is acceptable, where none of the sections are
> compressed. If we add something special for a version 7 but don't want
> to compress, we need to support that.
>
> > + where "name" and "version" are strings, name and version of the
> > + compression algorithm used to compress the trace file.
> > +
> > +COMPRESSION FORMAT OF THE HEADER SECTIONS
> > +-----------------------------------------
> > + If the file version is 7 or greater, some header sections are compressed
> > + with the compression algorithm, specified in the compression header.
> > + The format of these compressed sections is:
> > + <4 bytes> unsigned int, size of compressed data in the next block.
> > + <4 bytes> unsigned int, size of uncompressed data.
> > + <data> binary compressed data, with the specified size.
> > + These sections must be uncompressed on reading. The described format of
> > + the sections refers to the uncomperssed data.
>
> I think each section should have a flag that states that it is
> compressed or not. That way we could have options that determine "what"
> gets compressed, and not have it be all or none.

I was thinking the same, but could not find a use case. It would mean
giving the user control over which parts should be compressed. That
would complicate the implementation, since new trace-cmd parameters
would have to be added. As I couldn't think of a use case, I decided to
go with the simpler approach. Maybe it makes sense only for the trace
data, but the metadata should always be compressed if possible.

>
> > +
> > HEADER INFO FORMAT
> > ------------------
> >
> > @@ -93,7 +110,8 @@ FTRACE EVENT FORMATS
> >
> > Directly after the header information comes the information about
> > the Ftrace specific events. These are the events used by the Ftrace plugins
> > - and are not enabled by the event tracing.
> > + and are not enabled by the event tracing. If the file version is 7 or
> > + greater, this section is compressed.
>
> Perhaps add a single byte ahead of each section, where "0" is not
> compressed, and "1" is compressed?
>
> >
> > The next 4 bytes contain a 32-bit word of the number of Ftrace event
> > format files that are stored in the file.
> > @@ -110,7 +128,8 @@ EVENT FORMATS
> > -------------
> >
> > Directly after the Ftrace formats comes the information about
> > - the event layout.
> > + the event layout. If the file version is 7 or greater, this section
> > + is compressed.
> >
> > The next 4 bytes are a 32-bit word containing the number of
> > event systems that are stored in the file. These are the
> > @@ -137,7 +156,8 @@ KALLSYMS INFORMATION
> > --------------------
> >
> > Directly after the event formats comes the information of the mapping
> > - of function addresses to the function names.
> > + of function addresses to the function names. If the file version is 7
> > + or greater, this section is compressed.
> >
> > The next 4 bytes are a 32-bit word containing the size of the
> > data holding the function mappings.
> > @@ -154,6 +174,7 @@ TRACE_PRINTK INFORMATION
> > store the format string outside the ring buffer.
> > This information can be found in:
> > debugfs/tracing/printk_formats
> > + If the file version is 7 or greater, this section is compressed.
> >
> > The next 4 bytes are a 32-bit word containing the size of the
> > data holding the printk formats.
> > @@ -166,7 +187,8 @@ PROCESS INFORMATION
> > -------------------
> >
> > Directly after the trace_printk formats comes the information mapping
> > - a PID to a process name.
> > + a PID to a process name. If the file version is 7 or greater, this
> > + section is compressed.
> >
> > The next 8 bytes contain a 64-bit word that holds the size of the
> > data mapping the PID to a process name.
> > @@ -193,10 +215,11 @@ REST OF TRACE-CMD HEADER
> >
> > "flyrecord\0"
> >
> > - If it is "options \0" then:
> > + If it is "options \0" then follows a section with trace options.
> > + If the file version is 7 or greater, this section is compressed.
> >
> > The next 2 bytes are a 16-bit word defining the current option.
> > - If the the value is zero then there are no more options.
> > + If the value is zero then there are no more options.
> >
> > Otherwise, the next 4 bytes contain a 32-bit word containing the
> > option size. If the reader does not know how to handle the option
> > @@ -206,6 +229,25 @@ REST OF TRACE-CMD HEADER
> > The next option will be directly after the previous option, and
> > the options ends with a zero in the option type field.
> >
> > +COMPRESSION FORMAT OF THE TRACE DATA
> > +------------------------------------
> > +
> > + If the file version is 7 or greater, the tarce data is compressed
>
> Typo "trace data"
>
> And this is where we definitely need to make it optional. We currently
> do not have a safe way to read this file. The "uncompress to /tmp" is
> not a reliable way to do this.
> And again, people can likely want to
> have the header compressed but not the data, due to speed in reading.
>
> -- Steve
>
> > + with the compression algorithm, specified in the compression header.
> > + The data is compressed in chunks. The size of one compression chunk
> > + is defined when the file is written. The format of compressed trace
> > + data is:
> > + <4 bytes> unsigned int, count of chunks.
> > + Follows the compressed chunks of givent count. For each chunk:
> > + <4 bytes> unsigned int, size of compressed data in this chunk.
> > + <4 bytes> unsigned int, size of uncompressed data.
> > + <data> binary compressed data, with the specified size.
> > + These chunks must be uncompressed on reading. The described format of
> > + trace data refers to the uncomperssed data.
> > +
> > +TRACE DATA
> > +----------
> > +
> > The next 10 bytes after the options are one of the following:
> >
> > "latency \0"
> > @@ -217,6 +259,7 @@ REST OF TRACE-CMD HEADER
> > If the value is "latency \0", then the rest of the file is
> > simply ASCII text that was taken from the target's:
> > debugfs/tracing/trace
> > + If the file version is 7 or greater, the latency data is compressed.
> >
> > If the value is "flyrecord\0", the following is present:
> >
> > @@ -232,6 +275,7 @@ REST OF TRACE-CMD HEADER
> > CPU DATA
> > --------
> >
> > + If the file version is 7 or greater, the CPU data is compressed.
> > The CPU data is located in the part of the file that is specified
> > in the end of the header. Padding is placed between the header and
> > the CPU data, placing the CPU data at a page aligned (target page) position
>

--
Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center
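
For reference, the two layouts described in the patch (a single compressed header section, and a chunk count followed by compressed chunks) could be read roughly as in the Python sketch below. It is only an illustration under stated assumptions: integers are taken to be little-endian, zlib stands in for whatever algorithm the compression header names, and the function names are hypothetical, not trace-cmd or libtracecmd API.

```python
import io
import struct
import zlib


def read_compressed_block(f, decompress=zlib.decompress, endian="<"):
    """Read one <compressed size><uncompressed size><data> block.

    Assumption: little-endian sizes and zlib data; a real reader would take
    both from the file header and the compression header.
    """
    header = f.read(8)
    if len(header) != 8:
        raise EOFError("truncated block header")
    csize, usize = struct.unpack(endian + "II", header)
    data = f.read(csize)
    if len(data) != csize:
        raise EOFError("truncated block data")
    out = decompress(data)
    if len(out) != usize:
        raise ValueError("uncompressed size does not match the header")
    return out


def read_compressed_trace_data(f, decompress=zlib.decompress, endian="<"):
    """Read <chunk count> followed by that many blocks; return the joined data."""
    raw = f.read(4)
    if len(raw) != 4:
        raise EOFError("truncated chunk count")
    (count,) = struct.unpack(endian + "I", raw)
    return b"".join(
        read_compressed_block(f, decompress, endian) for _ in range(count)
    )


def write_block(payload, endian="<"):
    """Hypothetical writer-side helper, used here only to build test input."""
    comp = zlib.compress(payload)
    return struct.pack(endian + "II", len(comp), len(payload)) + comp


if __name__ == "__main__":
    chunks = [b"first chunk", b"second chunk"]
    blob = struct.pack("<I", len(chunks)) + b"".join(write_block(c) for c in chunks)
    print(read_compressed_trace_data(io.BytesIO(blob)))
```

Note that with this layout a reader cannot tell whether a section is compressed without knowing the file version, which is what the per-section flag suggested in the review would address.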