On Mon, Jul 13, 2020 at 05:15:04PM -0700, Emily Shaffer wrote: > > On Fri, Jul 10, 2020 at 03:40:25PM -0700, Junio C Hamano wrote: > > > diff --git a/builtin/blame.c b/builtin/blame.c > > > index 94ef57c1cc..1d72b27fda 100644 > > > --- a/builtin/blame.c > > > +++ b/builtin/blame.c > > > @@ -1129,8 +1129,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix) > > > > > > sb.found_guilty_entry = &found_guilty_entry; > > > sb.found_guilty_entry_data = π > > > - if (show_progress) > > > - pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines); > > > + pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines, > > > + show_progress); > > > > > > assign_blame(&sb, opt); > > > > But there are others that look a bit problematic. In this example > > taken from fsck, we open all the pack index, only because it is > > needed to show the progress, and the existing conditionals are ways > > to avoid spending unneeded cycles. > > That info is still used, when performing trace log during > stop_progress(): > > if (p_progress && *p_progress) { > trace2_data_intmax("progress", the_repository, "total_objects", > (*p_progress)->total); > > if ((*p_progress)->throughput) > trace2_data_intmax("progress", the_repository, > "total_bytes", > (*p_progress)->throughput->curr_total); > > trace2_region_leave("progress", (*p_progress)->title, the_repository); > } > > Because the progress struct is no longer NULL, we can write down the > number of objects calculated to the trace2 log file. So the expense of > calculating it is not wasted, at least in scenarios where someone cares > about the output of the trace log. > > But yes, lots of users don't care about trace log - they are using their > home computer and don't care about metrics like how many objects were > transferred in every single push. Now it becomes a little strange - is > it better to instead wrap each 'struct progress' constructor in "if > (should_i_log || should_i_trace)"? Have we got a simple way to look up > 'should_i_trace'? > > Some later comments have the same point - "why am I calculating the > totals if the user won't see and doesn't care?" - so I won't respond to > each other one; my reply is the same. > > > > > > @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) > > > uint32_t total = 0, count = 0; > > > struct progress *progress = NULL; > > > > > > - if (show_progress) { > > > - for (p = get_all_packs(the_repository); p; > > > - p = p->next) { > > > - if (open_pack_index(p)) > > > - continue; > > > - total += p->num_objects; > > > - } > > > - > > > - progress = start_progress(_("Checking objects"), total); > > > + for (p = get_all_packs(the_repository); p; > > > + p = p->next) { > > > + if (open_pack_index(p)) > > > + continue; > > > + total += p->num_objects; > > > } > > > + > > > + progress = start_progress(_("Checking objects"), total, > > > + show_progress); > > > for (p = get_all_packs(the_repository); p; > > > p = p->next) { > > > /* verify gives error messages itself */ > > > > Likewise, we do not even have to be scanning the index entries > > upfront if we are not showing the progress report (and more > > importantly, the user likely has chosen the speed/cycles over eye > > candy progress meter) while checking out paths from the index. > > > > > diff --git a/unpack-trees.c b/unpack-trees.c > > > index 4be5fc3075..fa84244466 100644 > > > --- a/unpack-trees.c > > > +++ b/unpack-trees.c > > > @@ -338,16 +338,14 @@ static struct progress *get_progress(struct unpack_trees_options *o, > > > { > > > unsigned cnt = 0, total = 0; > > > > > > - if (!o->update || !o->verbose_update) > > > - return NULL; > > > - > > > for (; cnt < index->cache_nr; cnt++) { > > > const struct cache_entry *ce = index->cache[cnt]; > > > if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE)) > > > total++; > > > } > > > > > > - return start_delayed_progress(_("Updating files"), total); > > > + return start_delayed_progress(_("Updating files"), total, > > > + (o->update && o->verbose_update)); > > > } > > > > > > static void setup_collided_checkout_detection(struct checkout *state, > > > > > > And there are cases where the "is progress enabled?" parameter (by > > the way, I am not sure 'verbose' is quite the right name for the > > I struggled some with this; I also considered using 'quiet' but it > required inverting the logic everywhere when all callers already were > asking "should I print?" instead of "should I not print?". Maybe "print" > is better? > > > parameter and the field name, as that word implies there can be > > multiple verbosity levels---what I see you've done is that you added > > an "is this enabled?" flag) is not a simple variable, like these. > > Some are simple enough, but some are not. Some of the worst appear > > in preload-index.c and read-cache.c---they may become easier to read > > if we introduced a variable to be passed as the "enable?" parameter > > and compute it before makng a call, and that might be enough, though. > > Sure, makes sense. > > > > diff --git a/read-cache.c b/read-cache.c > > > index aa427c5c17..2ddc422dbd 100644 > > > --- a/read-cache.c > > > +++ b/read-cache.c > > > @@ -1532,9 +1532,9 @@ int refresh_index(struct index_state *istate, unsigned int flags, > > > const char *unmerged_fmt; > > > struct progress *progress = NULL; > > > > > > - if (flags & REFRESH_PROGRESS && isatty(2)) > > > - progress = start_delayed_progress(_("Refresh index"), > > > - istate->cache_nr); > > > + progress = start_delayed_progress(_("Refresh index"), > > > + istate->cache_nr, > > > + (flags & REFRESH_PROGRESS && isatty(2))); > > > > > > trace_performance_enter(); > > > modified_fmt = in_porcelain ? "M\t%s\n" : "%s: needs update\n"; > > > > > > And here is the only one that may benefit from this change, which > > cares that the codepath prepared in rev-list, primarily to support > > the progress meter, is exercised because those pieces of code > > happens to be used by the tracing output as well. Perhaps we > > shouldn't have tied the "when showing progress, we can also trace" > > earlier and we didn't have to touch this many codepath only to spend > > cycles when progress output is not desired? > > I guess I don't see how this is the only caller that benefits. The > tracing output should appear in all the places where I touched the code; > that was the point of the earlier change. Since the object total is provided in > start_progress() and friends, and the runtime is logged by > region_enter() in start_progress() and region_leave() in > stop_progress(), then every caller is now unconditionally leaving traces > to give some impression of how much work and how long it took. So I am > confused - can you explain how the other callers don't benefit? > > > > > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c > > > index f520111eda..f64cad8390 100644 > > > --- a/builtin/rev-list.c > > > +++ b/builtin/rev-list.c > > > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) > > > if (bisect_list) > > > revs.limited = 1; > > > > > > - if (show_progress) > > > - progress = start_delayed_progress(show_progress, 0); > > > + /* > > > + * When progress is not printed to the user, we still want to be able to > > > + * classify the progress during tracing. So, use a placeholder name. > > > + */ > > > + progress = start_delayed_progress( > > > + show_progress ? show_progress : _("Quiet rev-list operation"), > > > + 0, show_progress != NULL); > > > > I've cut out the hunks that exhibits similar looking patterns as the > > ones I covered in the above discussion. You are right to say that > > these are simple and similar changes all over the place. I am not > > yet convinced that the simplicity supports that this change is the > > right thing to do, though. The only saving grace might be that > > there aren't that many hunks that explicitly do unnecessary things > > like the ones in unpack-trees.c and builtin/fsck.c even when we are > > not showing progress. > > > > But the other codepaths may be doing conditional computation not > > based on "if (show_progress)" but on "if (progress)", in which case > > Hm, good point. For what it's worth, I performed the changes manually: > > git grep "start_progress()" > for each caller, open in vim > /progress > delete or move if necessary > (repeat until all matches for /progress look right) > > and I did not see cases where 'if (progress)' was being used. That is > hard to express in a diff; I can try to audit it better. (This process > is how I also noticed the things to change in 2/2 in this series, even > though they are unrelated to start_progress()). > > > with this patch, we may be paying a lot more overhead even if we > > know progress meter won't be shown and the worse part from > > reviewability point of view is that this patch does not explicitly > > do anything to make it happen because start_delayed_progress() now > > unconditionally give a non-NULL progress structure to enable them. > > > > So I dunno. It certainly is not for the upcoming release, but we do > > want to make sure that trace requested even when the command is quiet > > gets collected in the longer term (i.e. during the next cycle). > > Ok. It sounds like you are saying you want the spirit of the change, > even if not in this exact form - that is good to know. A month later, are we still interested in this change? I'd like to pursue it - we are trying to use this progress tracing internally to generate some metrics and not having this change (or some version of it) means those metrics aren't accurate. The main concern I saw here was "we are doing a lot of work that isn't used if the user doesn't want to log traces" - should I approach a reroll of this topic by trying to be smarter about whether to set 'quiet' or 'print' or 'verbose' or whatever it is renamed to, based on whether there is a trace destination? Then for systems which are logging traces the extra work is worth it, but for everyone else it can function as before. I don't love it from a design perspective - it feels a little like progress module is looking a little too closely at trace module internals - but from a performance perspective I see the appeal. On the other hand, progress operations are showing progress because they are slow - is a little more work on something that's already human-noticeable slow such a big deal? - Emily