Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 09, 2020 at 10:17:40PM -0400, Taylor Blau wrote:
> 
> Hi both,
> 
> On Thu, Jul 09, 2020 at 10:00:23PM -0400, Derrick Stolee wrote:
> > On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> > > Before now, the progress API is used by conditionally calling
> > > start_progress() or a similar call, and then unconditionally calling
> > > display_progress() and stop_progress(), both of which are tolerant of
> > > NULL or uninitialized inputs. However, in
> > > 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> > > throughput), the progress library learned to log traces during expensive
> > > operations. In cases where progress should not be displayed to the user
> > > - such as when Git is called by a script - no traces will be logged,
> > > because the progress object is never created.
> 
> This is such a fantastic idea. Just the other day, I was thinking of
> getting your (for clarification, Emily, since I'm responding to
> Stolee's mail) progress-emits-trace2-events work hooked into GitHub's
> trace2 pipeline.
> 
> There were two unfortunate things that prevented this from working:
> 
>   1. GitHub filters which trace2 categories are of interest to us (these
>      interesting ones get logged, and the uninteresting ones get
>      discarded) using an environment variable of comma-separated
>      categories. Since all of the trace2 metrics generated by the
>      progress API don't have categories, taking in one interesting
>      metric meant taking them all in, which is a non-starter for us.
> 
>   2. On top of that, we don't even _generate_ these progress events most
>      of the time, since we're often running without a tty, and so we
>      never end up hitting those 'if (progress) start_progress()'
>      conditionals in the first place.
> 
> If we had something like this, it would reduce the problem to only (1),
> which would make a lot of my headaches go away. (It would also give me a
> good excuse to convert many of our custom trace2 regions into patches on
> the list, and get rid of a non-trivial amount of code that generates
> merge conflicts often).
> 
> > > Instead, to allow us to collect traces from scripted Git commands, teach
> > > a progress->verbose flag, which is specified via a new argument to
> > > start_progress() and friends. display_progress() also learns to filter
> > > for that flag. With these changes, start_progress() can be called
> > > unconditionally but with a conditional as an argument to determine
> > > whether to report progress to the user.
> > >
> > > Since this changes the API, also modify callers of start_progress() and
> > > friends to drop their conditional and pass a new argument in instead.
> 
> I don't think that this is why I was CC'd, but could you perhaps talk a
> little bit about why this is all in the same patch? I don't think this
> change needs to be broken out by the area affected per-se, but the
> current form is a little unruly to review all at once.

I agree that it's a lot, but I didn't think very hard about how to split
it up for review. The problem is that changing the signature of course
breaks the build, and I didn't want a commit which wouldn't build. I
could split it into one commit per start_progress() variation, I
suppose, but is it really easier to review? With only one or two
exceptions these changes are all pretty rote.

 - Emily

> 
> > This is a worthwhile change. Thanks! I was hoping that we would
> > get some of these regions for free, which extends what we can get
> > out of trace2 events.
> >
> > CC'ing Taylor because he had some thoughts on adding a possible
> > trace2 category to make it easier to reason about the regions,
> > when appropriate. Not sure if he's ready to apply that change
> > on top of this series.
> 
> This is what I was talking about above. It would be nice if we could
> somehow teach the 'start_*_progress()' API about a trace2 category.

Yeah, we have been thinking about that too - right now I think we use
the title, which is pretty ugly for metrics as it's intended for human
eyes. One suggestion I heard was to teach an enum to the progress struct
for the region name (or category); but to get it into the region_enter
message we'd have to take it during start_progress(), as you say.

> 
> Unfortunately, this would need to be a new parameter, since we need to
> know the category when we enter the region. So, the API changes might be
> far-reaching. It would be nice if there was a way to limit the blast
> radius (i.e., 'start_progress_trace2(..., category)'), but I haven't
> thought deeply about it.

Yeah, this seems viable in the scenario you describe - those who aren't
using the *_trace2() constructors aren't any worse off than before.

> 
> I don't want to delay this patch series with that. I'd be happy to build
> it in myself on top after this graduates.
> 
> > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > > index f520111eda..f64cad8390 100644
> > > --- a/builtin/rev-list.c
> > > +++ b/builtin/rev-list.c
> > > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> > >  	if (bisect_list)
> > >  		revs.limited = 1;
> > >
> > > -	if (show_progress)
> > > -		progress = start_delayed_progress(show_progress, 0);
> > > +	/*
> > > +	 * When progress is not printed to the user, we still want to be able to
> > > +	 * classify the progress during tracing. So, use a placeholder name.
> > > +	 */
> > > +	progress = start_delayed_progress(
> > > +			show_progress ? show_progress : _("Quiet rev-list operation"),
> > > +			0, show_progress != NULL)
> >
> > This is so strange, how we let the command-lines specify a progress
> > indicator. I guess it is necessary when we use rev-list as a
> > subcommand instead of in-process. One such case is check_connected()
> > in connected.c.
> >
> > It's stranger still that "show_progress" is actually a string here,
> > as opposed to being an int in most other places.
> >
> > Your transformation is correct, here, though. Thanks for calling it
> > out in the commit message.
> >
> > >
> > >  	if (use_bitmap_index) {
> > >  		if (!try_bitmap_count(&revs, &filter_options))
> > > diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> > > index dd4a75e030..719d446916 100644
> > > --- a/builtin/unpack-objects.c
> > > +++ b/builtin/unpack-objects.c
> > > @@ -498,8 +498,7 @@ static void unpack_all(void)
> > >  			ntohl(hdr->hdr_version));
> > >  	use(sizeof(struct pack_header));
> > >
> > > -	if (!quiet)
> > > -		progress = start_progress(_("Unpacking objects"), nr_objects);
> > > +	progress = start_progress(_("Unpacking objects"), nr_objects, !quiet);
> > >  	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
> > >  	for (i = 0; i < nr_objects; i++) {
> > >  		unpack_one(i);
> > > diff --git a/commit-graph.c b/commit-graph.c
> > > index 328ab06fd4..b9a784fece 100644
> > > --- a/commit-graph.c
> > > +++ b/commit-graph.c
> > > @@ -1152,10 +1152,10 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
> > >  	struct progress *progress = NULL;
> > >  	int i = 0;
> > >
> > > -	if (ctx->report_progress)
> > > -		progress = start_delayed_progress(
> > > -			_("Writing changed paths Bloom filters index"),
> > > -			ctx->commits.nr);
> > > +	progress = start_delayed_progress(
> > > +		_("Writing changed paths Bloom filters index"),
> > > +		ctx->commits.nr,
> > > +		ctx->report_progress);
> >
> > There are a lot of blocks like this, where the progress string is long enough to
> > require the first param to be after the method name. Since we are changing the
> > API and every caller, would the resulting code be cleaner if the string value
> > was the last parameter? That would allow this code pattern in most cases:
> >
> > 	progress = start_delayed_progress(count, show_progress,
> > 					  _("My special string!"));
> >
> > Just a thought. Not super-important.
> >
> > The rest of the changes look to be correct.
> >
> > > diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> > > index 5d05cbe789..19b874f9cd 100644
> > > --- a/t/helper/test-progress.c
> > > +++ b/t/helper/test-progress.c
> > > @@ -23,16 +23,18 @@
> > >  int cmd__progress(int argc, const char **argv)
> > >  {
> > >  	int total = 0;
> > > +	int quiet = 0;
> > >  	const char *title;
> > >  	struct strbuf line = STRBUF_INIT;
> > >  	struct progress *progress;
> > >
> > >  	const char *usage[] = {
> > > -		"test-tool progress [--total=<n>] <progress-title>",
> > > +		"test-tool progress [--total=<n>] [--quiet] <progress-title>",
> > >  		NULL
> > >  	};
> > >  	struct option options[] = {
> > >  		OPT_INTEGER(0, "total", &total, "total number of items"),
> > > +		OPT_BOOL(0, "quiet", &quiet, "suppress stderr"),
> > >  		OPT_END(),
> > >  	};
> > >
> > > @@ -42,7 +44,7 @@ int cmd__progress(int argc, const char **argv)
> > >  	title = argv[0];
> > >
> > >  	progress_testing = 1;
> > > -	progress = start_progress(title, total);
> > > +	progress = start_progress(title, total, !quiet);
> > >  	while (strbuf_getline(&line, stdin) != EOF) {
> > >  		char *end;
> > >
> > > diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> > > index 1ed1df351c..9d6e6274ad 100755
> > > --- a/t/t0500-progress-display.sh
> > > +++ b/t/t0500-progress-display.sh
> > > @@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
> > >  	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> > >  '
> > >
> > > +test_expect_success 'progress generates traces even quietly' '
> > > +	cat >in <<-\EOF &&
> > > +	throughput 102400 1000
> > > +	update
> > > +	progress 10
> > > +	throughput 204800 2000
> > > +	update
> > > +	progress 20
> > > +	throughput 307200 3000
> > > +	update
> > > +	progress 30
> > > +	throughput 409600 4000
> > > +	update
> > > +	progress 40
> > > +	EOF
> > > +
> > > +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> > > +		--quiet "Working hard" <in 2>stderr &&
> > > +
> > > +	# t0212/parse_events.perl intentionally omits regions and data.
> > > +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> > > +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> > > +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> > > +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> > > +'
> >
> > Thanks for adding a test, including the resulting trace events!
> >
> > The patch of Taylor's that I mentioned earlier changes the "category"
> > to something a bit more specific than "progress", when appropriate.
> >
> > > +
> > > +
> > >  test_done
> >
> > nit: extra empty line
> >
> > Thanks!
> > -Stolee
> 
> Thanks,
> Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux