On Mon, May 20 2019, Jeff Hostetler wrote: > On 5/10/2019 5:57 PM, Ævar Arnfjörð Bjarmason wrote: >> >> On Fri, May 10 2019, Jeff King wrote: >> >>> On Fri, May 10, 2019 at 05:09:58PM +0200, Ævar Arnfjörð Bjarmason wrote: >>> >>>> As noted in TODOs in the script there's various stuff I'd like to do >>>> better, and this also shows how we need a lot more trace regions to get >>>> granular data. >>> >>> Hmm. My gut reaction was: doesn't "perf record -g make test" already >>> give us that granular data? I know "perf" isn't available everywhere, >>> but the idea of the FlameGraph repo is that it takes input from a lot of >>> sources (though I don't know if it supports any Windows-specific formats >>> yet, which is presumably a point of interesting to trace-2 authors). >>> >>> But having generated such a flamegraph, it's not all that helpful. It >>> mainly tells us that we spend a lot of time on fork/exec. Which is no >>> surprise, since the test suite is geared not towards heavy workloads, >>> but lots of tiny functionality tests. >>> >>> TBH, I'm not sure that flame-graphing the test suite is going to be all >>> that useful in the long run. It's going to be heavily weighted by the >>> types of things the test suite does. Flamegraphs are good for >>> understanding where your time is going for a particular workload, but >>> the workload of the test suite is not that interesting. >>> >>> And once you do have a particular workload of interest that you can >>> replay, then I think the granular "perf" results really can be helpful. >>> >>> I think the trace2 flamegraph would be most useful if you were >>> collecting across a broad spectrum of workloads done by a user. You >>> _can_ do that with perf or similar tools, but it can be a bit awkward. >>> I do wonder how painful it would be to alias "git" to "perf record git" >>> for a day or something. >> >> Yeah I should have mentioned that I'm mainly linking to the test suite >> rendering as a demo. >> >> My actual use-case for this is to see what production nodes are spending >> their time on, similar to what Microsoft is doing with their use of this >> facility. >> >> The test suite serves as a really good test-case for the output, and to >> stress-test my aggregation script, since we're pretty much guaranteed to >> run all our commands, and cover a lot of unusual cases. >> >> It also shows that we've got a long way to go in improving the trace2 >> facility, i.e. adding region enter/leave for some of the things we spend >> the most time on. >> > > Very nice! > > Yes, there is more work to do to add more regions to get more > granular data for interesting/problematic things. My primary > goal in this phase has been to get the basic machinery in place > and be vetted with some universally interesting regions, such as > reading/writing the index and the phases of status. > > Going forward, we can trivially (permanently) add new regions as we > want. I tend to use temporary "experimental" regions during my perf > investigations so that I don't clutter up the mainline source with > uninteresting noise. Indeed, a lot more regions are needed. > WRT the TODO's in your script: > > [] I don't think data events will be useful for your usage. The data > values are orthogonal to the time values. I haven't done this, so I'm not asserting that it's useful, but from some brief grepping a few datapoints are overwhelmingly common, and can be faked up into regions of sorts for the purposes of a flamegraph. E.g. "git checkout" will reliably have read/version early on, and then write/version, in that case mostly/entirely redundant to the do_{read,write}_index region, but in general I think we'll be able to loosely plot data points say as "given the median runtime, here's the median % of time into the command we first encounter this data point". > [] I would add the child_start/_exit events to the stack. This will > give you the names of non-builtin/shell commands and hooks. The > various "child_class" and "use_shell" and "hook_name" fields will help > you avoid duplicate stack frames (which you'll get for builtin > commands). Yeah that's very useful. Any reason not to do something like this: diff --git a/git.c b/git.c index 1bf9c94550..6c926ae013 100644 --- a/git.c +++ b/git.c @@ -698 +698 @@ static void execv_dashed_external(const char **argv) - trace2_cmd_name("_run_dashed_"); + trace2_cmd_name(is_builtin(argv[0]) ? argv[0] : "_run_dashed_"); I haven't tested, but we e.g. report 'git-submodule' as just '_run_dashed_', seems we could do better...