On 1/14/2019 8:05 PM, Jonathan Nieder wrote:
Hi,
Jeff Hostetler wrote:
This patch series contains a new trace2 facility that hopefully addresses
the recent trace- and structured-logging-related discussions. The intent is
to eventually replace the existing trace_ routines (or to route them to the
new trace2_ routines) as time permits.
I've been running with these patches since last October. A few
thoughts:
I like the API.
Great, thanks. Hopefully you're getting some good/actionable data from
it.
The logs are a bit noisy and especially wide. For my use, the
function name is not too important since we can get that from the file
and line number. Should we have a way to omit some fields, or is that
for post-processing?
Yes, the events are a little wide and noisy, at least in this draft.
Part of this is to flesh out the trace2 API (which should be relatively
fixed) and make sure we have enough event types to emit useful
information. This is independent of some of the detail events (like
region/data events within status or index reading/writing). Some of
those detail events might be kept if they're useful; others are temporary
demonstration events, or events you could include in a private build for
a limited period of time. So some of the noise might come from those
demonstration events (stuff that you'd want for testing in a perf view,
but wouldn't need archived, for example).
Also, for the events that have a "category" field, I'd eventually
like to have a filter setting to include/omit them. This is something
like the GIT_TRACE_<name> feature we currently have, but limited to
always writing to the same file. I had this in an earlier version,
but haven't brought it over yet.
And yes, I have a post-processing step that filters fields and
generates a summary record for each process instance. My previous
draft tried to do that summary inside the git.exe process and it was
suggested that we move that out, so this version emits the raw data
as it occurs and I get the summary after the fact. This has turned
out nicely, even if the Trace2 stream is a little noisy.
There are some fields that I'd like to omit from my JSON stream that
I'm not using in my summary, such as the filename and line number.
These got carried along since the PERF view needed them. I think they
make sense in the PERF view, but not so much in the EVENT view.
I'm filtering them out in my post-processing, but I think we could
just omit them.
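
A post-processing step of the kind described above might look roughly
like the following Python sketch. It is not the script used here: it
assumes the EVENT view is one JSON object per line, and the field names
("sid", "event", "file", "line") and the shape of the summary are
assumptions chosen for illustration.

```python
import json
from collections import defaultdict

# Assumed field names; the real EVENT view may name these differently.
DROPPED_FIELDS = ("file", "line")
PROCESS_KEY = "sid"


def strip_fields(event, dropped=DROPPED_FIELDS):
    """Return a copy of one event without the fields we do not summarize."""
    return {k: v for k, v in event.items() if k not in dropped}


def summarize(lines):
    """Group events by process id and count event types per process."""
    summary = defaultdict(lambda: defaultdict(int))
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the stream
        event = strip_fields(json.loads(raw))
        proc = event.get(PROCESS_KEY, "unknown")
        summary[proc][event.get("event", "unknown")] += 1
    return {proc: dict(counts) for proc, counts in summary.items()}


# Made-up sample events, only to show the call.
sample = [
    '{"event": "start", "sid": "p1", "file": "git.c", "line": 10}',
    '{"event": "region_enter", "sid": "p1", "file": "wt-status.c", "line": 20}',
    '{"event": "start", "sid": "p2", "file": "git.c", "line": 10}',
]
print(summarize(sample))
```

Keeping the summary outside the git process means the emitting side
only has to write raw events, and the choice of which fields to drop
can change without rebuilding anything.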
We don't find the JSON easy to parse and would prefer a binary format.
I'm going to have to push back a little on this one. JSON is easy to
process in Perl, C#, various databases, etc. Processing a non-text
format in bash is just asking for pain and suffering.
Can you elaborate on the problems you're having with JSON?
When you say "binary" what kind of binary do you mean? Is this BSON?
Or are you suggesting protocol buffers? If the latter, is there a C
binding for that? (Every example I've seen talks about C++.)
In my gvfs-trace2-v4 branch, I've refactored the code and now have
a vtable-like mechanism that allows multiple Trace2 "targets" to be
defined. See trace2/tr2_tgt_perf.c vs trace2/tr2_tgt_events.c. The
former generates the GIT_TR2_PERF view and the latter generates the
JSON event view.
You could add a self-contained target vtable that generates a binary
view if you wanted. (Just let it key off of a different GIT_TR2_
environment variable.)
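
To make the idea of multiple targets concrete, here is a loose Python
analogy. It is not the C code in trace2/tr2_tgt_perf.c or
trace2/tr2_tgt_events.c. Each target pairs an environment variable with
a function that renders an event, and the dispatcher calls every target
whose variable is set. GIT_TR2_PERF comes from the text above;
GIT_TR2_EVENT is an assumed name, GIT_TR2_BINARY is hypothetical, and
all function names and output layouts are made up for illustration.

```python
import json
import struct
from typing import Callable, Dict, List, NamedTuple


class Target(NamedTuple):
    """One output target: the variable that enables it and its renderer."""
    env_var: str
    render: Callable[[Dict], bytes]


def render_perf(event: Dict) -> bytes:
    # Human-readable columns (made-up layout, not the real PERF view).
    return "{:<12} | {}\n".format(event["event"], event.get("msg", "")).encode()


def render_event(event: Dict) -> bytes:
    # One JSON object per line.
    return (json.dumps(event, sort_keys=True) + "\n").encode()


def render_binary(event: Dict) -> bytes:
    # Hypothetical binary view: 4-byte big-endian length, then a JSON payload.
    payload = json.dumps(event, sort_keys=True).encode()
    return struct.pack(">I", len(payload)) + payload


TARGETS: List[Target] = [
    Target("GIT_TR2_PERF", render_perf),
    Target("GIT_TR2_EVENT", render_event),    # assumed variable name
    Target("GIT_TR2_BINARY", render_binary),  # hypothetical new target
]


def dispatch(event: Dict, environ: Dict[str, str]) -> Dict[str, bytes]:
    """Render the event once for each target whose variable is set."""
    return {t.env_var: t.render(event) for t in TARGETS if environ.get(t.env_var)}


out = dispatch({"event": "start", "msg": "git status"},
               {"GIT_TR2_BINARY": "/tmp/trace.bin"})
print(sorted(out))
```

The point of the table is that adding a new view touches only one
entry and one renderer; the code that produces events does not need to
know which views are enabled.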
When I apply the patches, Git complains about whitespace problems
(trailing whitespace, etc).
Aside from that kind of easily correctable issue (trailing
whitespace), I'd be in favor of taking these patches pretty much as-is
and making improvements in tree. Any objections to that, or do you
have other thoughts on where this should go?
If that sounds reasonable to you, I can send a clean version of these
based against current "master". If I understand correctly, then
https://github.com/jeffhostetler/git branch gvfs-trace2-v4
contains some improvements, so as a next step I'd try to extract those
as incremental patches on top. What do you think?
Thanks,
Jonathan
The gvfs-trace2-v4 version has lots of improvements over the version
I last posted on the mailing list. We should go with it.
I'm not surprised that there are merge conflicts, since mine is based
upon the recent GVFS release and has some gvfs-specific commits in it.
Let me rebase that branch onto upstream/master, clean up the
mess, and send out another patch set.
Hopefully, I can get that out tomorrow.
Jeff