RE: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code

"Randall S. Becker" <rsbecker@xxxxxxxxxxxxx> · Wed, 23 Jun 2021 16:25:52 -0400

On June 23, 2021 4:02 PM, Ævar Arnfjörð Bjarmason wrote:
>On Wed, Jun 23 2021, Randall S. Becker wrote:
>> On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>>>
>>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>>
>>>>> > Splitting off from:
>>>>> >
>>>>> >
>>>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206
>>>>> > Z-
>>>>> > avarab@xxxxxxxxx/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>>>> >
>>>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>>>> >> off-by-one errors by adding an assertion to display_progress()
>>>>> >> that requires the first update to have the value 0, and in
>>>>> >> stop_progress() one that requires the previous
>>>>> >> display_progress() call to have a value equal to the total
>>>>> >> number of work items.  Not sure it'd be worth the hassle..
>>>>> >
>>>>> > I fixed and reported a number of bogus progress lines in the
>>>>> > past, the last one during v2.31.0-rc phase, so I've looked into
>>>>> > whether progress counters could be automatically validated in our
>>>>> > tests, and came up with these patches a few months ago.  It
>>>>> > turned out that progress counters can be checked easily and
>>>>> > transparently in case of progress lines that are shown in the
>>>>> > tests, i.e. that are shown even when stderr is not a terminal or
>>>>> > are forced with '--progress'.  (In other cases it's still fairly
>>>>> > easy but not quite transparent, as I think we need changes to the
>>>>> > progress API; more on that later in a separate
>>>>> > series.)
>>>>>
>>>>> I've also been working on some progress.[ch] patches that are
>>>>> mostly finished, and I'm some 20 patches in at the moment. I wasn't
>>>>> sure about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>>>> series, hence this message.
>>>>>
>>>>> Much of what you're doing here becomes easier after that series, e.g.
>>>>> your global process struct in 2/7 is something I ended up
>>>>> implementing as part of a general feature to allow progress to be
>>>>> driven by either display_progress() *or* the signal handler itself.
>>>>
>>>> It's difficult to know who should rebase onto who without seeing one
>>>> half of the patches.
>>>
>>>I was sort of hoping he'd take me word for it, but here it is. Don't
>>>say I didn't warn you :)
>>>
>>>> I couldn't find a link to them anywhere (even if they are only
>>>> available in your fork in a pre-polished state) despite looking, but
>>>> my apologies if they are available and I'm just missing them.
>>>
>>>FWIW it's avar-szeder/progress-bar-assertions in
>>>https://github.com/avar/git.git, that repo contains various functioning and not-so- functioning code.
>>>
>>>https://github.com/avar/git/tree/meta/ is my version of the crappy
>>>scripts we probably all have some version of for building my own git, things that are uncommented in series.conf is what I build my own
>git from.
>>>
>>>> In general, I think that these patches are clear and are helpful in
>>>> pinning down issues with the progress API (which I have made a
>>>> hadnful of times in the past), so I would be happy to see them picked up.
>>>
>>>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>>>
>>>The 01/25 is something I submitted already as
>>>https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-ava
>>>rab@xxxxxxxxx;
>>>hoping to get this in incrementally.
>>>
>>>The 12/25 is my own version of that "global progress struct, 11/25 is
>>>the first of many bugs SZEDER missed in his :)
>>>
>>>18/25 is the first step of the UI I was going for, the signal handler
>>>can now drive the progress bar, so e.g. during "git gc" we show (at least for me, on git.git), a "stalled" message just before we start the
>actual count of "Enumerating Objects".
>>>
>>>After that was in I was planning on adding config-driven support to
>>>show a "spinner" when we stalled in that way, config-driven because
>>>you could just scrape e.g.
>>>https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>>>into your own config. See
>>>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>>>
>>>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable
>>>labeling as "PATCH", I think they work, but no BUG() assertions yet. I left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier
>works set things up to do any BUG() we trust by default.
>>>
>>>22/25 is what I think we should do instead of SZEDER's 6/7
>>>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.
>>>com) I don't think this "our total doesn't match at the end" is
>>>something we should always BUG() on, for reasons explained there.
>>>
>>>I am sympathetic to doing it by default though, hence the
>>>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>>>
>>>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>>>BUG(...) assertions.
>>>
>>>His series passes the test suite, but actually severely break things
>>>things. It'll make e.g. "git commit-graph write" BUG(...) out. The reason the tests don't catch it is because we have a blind spot in the
>tests.
>>>
>>>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>>>(better ways to do this, especially in parallel, most welcome):
>>>
>>>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break;
>>> fi; done
>>>
>>>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>>>
>>>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>>>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>>>
>>>I created a shitty-and-mostly-broken throwaway change to
>>>search-replace all the guards of "start_progress(...)" to run
>>>unconditionally, and convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still
>needs to be fixed (and also some unrelated segfaults, I gave up on it soon after).
>>>
>>>Even if we fix that I wouldn't trust it, because a lot of the progress
>>>bars we have depend on the size and shape of the data we're
>>>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>flag we convert one caller at a time to.
>>>
>>>For some it's really clear that we could assert it, for others such as
>>>the commit-graph it's much more subtle, we're in some callback after setting a "total", that callback does a "break", "continue" etc. in
>various places, all depending on repository data.
>>>
>>>It's not easy to reason about that and be certain that we can hold to
>>>the estimate. If we get it wrong someone's repo in the wild won't fully GC because of the overly eager BUG().
>>>
>>>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>>>
>>>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>>>
>>>E.g. we often have enough data at the start of "Enumerating Objects"
>>>to give a good-enough target value, that it's 5-10% off isn't really
>>>the point, but that the user looking at it sees something better than
>>>a dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if
>we're not 100% sure, it's not good UX.
>>>
>>>So trying to get the current exact count/exact percentage right seems
>>>like a distraction to me in the longer term. If anything we should just be rounding those numbers, showing fuzzy ETAs instead of
>percentages if we can etc.
>>>
>>>SZEDER Gábor (4):
>>>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>>>    line
>>>  entry: show finer-grained counter in "Filtering content" progress
>>>line
>>>  progress: assert last update in stop_progress()
>>>  progress: assert counting upwards in display()
>>>
>>>Ævar Arnfjörð Bjarmason (21):
>>>  progress.c tests: fix breakage with COLUMNS != 80
>>>  progress.c tests: make start/stop verbs on stdin
>>>  progress.c tests: test some invalid usage
>>>  progress.c tests: add a "signal" verb
>>>  progress.c: move signal handler functions lower
>>>  progress.c: call progress_interval() from
>>>progress_test_force_update()
>>>  progress.c: stop eagerly fflush(stderr) when not a terminal
>>>  progress.c: add temporary variable from progress struct
>>>  midx perf: add a perf test for multi-pack-index
>>>  progress.c: remove the "sparse" mode nano-optimization
>>>  pack-bitmap-write.c: add a missing stop_progress()
>>>  progress.c: add & assert a "global_progress" variable
>>>  progress.[ch]: move the "struct progress" to the header
>>>  progress.[ch]: move test-only code away from "extern" variables
>>>  progress.c: pass "is done?" (again) to display()
>>>  progress.[ch]: convert "title" to "struct strbuf"
>>>  progress.c: refactor display() for less confusion, and fix bug
>>>  progress.c: emit progress on first signal, show "stalled"
>>>  midx: don't provide a total for QSORT() progress
>>>  progress.c: add a stop_progress_early() function
>>>  entry: deal with unexpected "Filtering content" total
>>>
>>> cache.h                          |   1 -
>>> commit-graph.c                   |   2 +-
>>> csum-file.h                      |   2 -
>>> entry.c                          |  12 +-
>>> midx.c                           |  25 +-
>>> pack-bitmap-write.c              |   1 +
>>> pack.h                           |   1 -
>>> parallel-checkout.h              |   1 -
>>> progress.c                       | 391 ++++++++++++++++++-------------
>>> progress.h                       |  50 +++-
>>> reachable.h                      |   1 -
>>> t/helper/test-progress.c         |  54 +++--
>>> t/perf/p5319-multi-pack-index.sh |  21 ++
>>> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
>>> 14 files changed, 537 insertions(+), 272 deletions(-)  create mode
>>> 100755 t/perf/p5319-multi-pack-index.sh
>>
>> Is there provision for disabling progress on a per-command basis? My
>> use case is specifically in a CI/CD script, being able to suppress
>> progress handling. The current Jenkins plugin does not appear to have
>> provision for hooking into a mechanism, which makes things get a bit
>> wonky when a job runs with a pseudo-tty (as provided by Jenkins
>> through SSH/RMI).
>> -Randall
>
>There isn't, some commands support --no-progress, but it's hit and miss.
>
>You can then set the undocumented GIT_PROGRESS_DELAY=99999999 (or some really big number) to suppress more of them.
>
>We could just add it as a top-level "git --no-progress" option I suppose...
>
>Probably better would be to detect such not-a-terminals somehow, I think at some point our own gc.log was a victim of this.

I think a global not-a-terminal would be best here. It does not make a lot of sense to dump progress on a device that does not handle Control-M. I think I recall someone recently saying that we should be detecting this.