On Tue, Aug 14, 2018 at 08:51:41PM +0200, Duy Nguyen wrote:

> > But AFAICT these series are all about putting the sampling points into
> > the code base, so formatting would be orthogonal to it?
>
> It's not just sampling points. There's things like index id being
> shown in the message for example. I prefer to keep free style format
> to help me read. There's also things like indentation I do here to
> help me read. Granted you could do all that with scripts and stuff,
> but will we pass around in mail dumps of json messages to be decoded
> locally?

I think you could have both forms using the same entry points sprinkled
through the code.

At GitHub we have a similar telemetry-ish thing, where we collect some
data points and then the resulting JSON is stored for every operation
(for a few weeks for read ops, and indefinitely attached to every ref
write). And I've found that the storage and the trace-style "just show a
human-readable message to stderr" interface complement each other in
both directions:

  - you can output a human-readable message that is sent immediately to
    the trace mechanism but _also_ becomes part of the telemetry. E.g.,
    imagine that one item in the json blob is "this is the last message
    from GIT_TRACE_FOO". Now you can push tracing messages into whatever
    plan you're using to store SLOG. We do this less with TRACE, and
    much more with error() and die() messages.

  - when a structured telemetry item is updated, we can still output a
    human-readable trace message with just that item. E.g., with:

      trace_performance(n, "foo");

    we could either store a json key (perf.foo=n) or output a nicely
    formatted string like we do now, depending on what the user has
    configured (or even both, of course).

It helps if the sampling points give enough information to cover both
cases (as in the trace_performance example), but you can generally
shoe-horn unstructured data into the structured log, and pretty-print
structured data.

-Peff