On Tue, 17 Oct 2023 09:34:30 +0200 (CEST)
Julia Lawall <julia.lawall@xxxxxxxx> wrote:

> > And exited out when it finished loading and the result was:
> >
> >   real    6m14.772s
> >   user    6m0.649s
> >   sys     0m12.718s
> >
> > That's over 6 minutes to load the trace.dat file!!!
> >
> > I ran perf record on it and it showed 77% of the time was in
> > free_zpage().
> >
> > I pulled out my old algorithms book and wrote up an rbtree for
> > internal use of libtracecmd, then switched the cache over to the
> > rbtree for its lookups. The lookups were keyed on either the address
> > where the compressed page is mapped or the file offset the page
> > represents, depending on how the search was done. I found that only
> > one location actually used the memory allocation address, and luckily
> > that lookup also had access to the file offset the memory represented.
> > That allowed me to make all lookups use the file offset (thanks to
> > Masami Hiramatsu for helping me realize that).
> >
> > After converting the cache to an rbtree lookup, I ran kernelshark
> > again on that file, exited out as soon as it finished loading, and
> > the timings were:
> >
> >   real    1m22.356s
> >   user    1m10.532s
> >   sys     0m10.901s
> >
> > Still a bit long, but it dropped from over 6 minutes to under 1 1/2
> > minutes. Also, free_zpage() was no longer in the perf record output.
>
> Does it impact trace-cmd report?

Not as drastically as the above, but running it on the same trace.dat
file without the patch:

  $ time trace-cmd report trace.dat > /dev/null

  real    9m20.390s
  user    9m16.391s
  sys     0m3.529s

With the patch:

  $ time trace-cmd report trace.dat > /dev/null

  real    6m22.935s
  user    6m19.537s
  sys     0m3.139s

So it does bring it down by a third.

I need to send a v2, as I found I had left some debugging code in, and I
also found a small bug in updating the color of the deleted node when it
was not the node originally targeted for deletion.

-- Steve
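
[Editor's note: for readers following along, here is a minimal sketch of
the offset-keyed cache lookup described above. It is not the actual
libtracecmd code: the names (struct zpage_node, zpage_lookup()) are
hypothetical, and the red-black rebalancing is omitted so the tree walk
stands on its own.]

  /* Sketch only -- not the libtracecmd rbtree; names are hypothetical. */
  #include <stddef.h>
  #include <stdint.h>

  struct zpage_node {
          struct zpage_node       *left;
          struct zpage_node       *right;
          uint64_t                offset; /* file offset of the compressed page */
          uint64_t                size;   /* bytes the mapping covers */
          void                    *map;   /* where the page is mapped in memory */
  };

  /*
   * Every lookup goes by file offset. Even a caller that starts from a
   * memory address can use this, since the node that owns the mapping
   * also records the file offset it represents.
   */
  static struct zpage_node *zpage_lookup(struct zpage_node *root,
                                         uint64_t offset)
  {
          while (root) {
                  if (offset < root->offset)
                          root = root->left;
                  else if (offset >= root->offset + root->size)
                          root = root->right;
                  else
                          return root;    /* offset is inside this page */
          }
          return NULL;    /* not cached: decompress and insert */
  }

With a balanced tree this walk is O(log n) per lookup, which is what
replaces the linear scan that perf showed dominating the load time.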
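
[Editor's note: on the v2 bug mentioned above: in a red-black delete,
when the victim has two children it is the in-order successor that gets
physically unlinked, so the color that decides whether a rebalancing
fixup is needed is the successor's, not the victim's. A hypothetical
sketch of just that decision, with types and names that are ours, not
the patch's:]

  #include <stddef.h>

  enum rb_color { RB_RED, RB_BLACK };

  struct rb_node {
          struct rb_node  *left;
          struct rb_node  *right;
          enum rb_color   color;
  };

  /* In-order successor: leftmost node of the right subtree. */
  static struct rb_node *rb_leftmost(struct rb_node *n)
  {
          while (n->left)
                  n = n->left;
          return n;
  }

  /*
   * Color of the node that delete actually unlinks. If this returns
   * RB_BLACK, the tree needs the delete-fixup pass. Testing
   * victim->color here instead is exactly the kind of bug described
   * above.
   */
  static enum rb_color rb_removed_color(struct rb_node *victim)
  {
          if (victim->left && victim->right)
                  return rb_leftmost(victim->right)->color;
          return victim->color;
  }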