On Thu, Oct 07 2021, Emily Shaffer wrote: > On Tue, Sep 21, 2021 at 01:09:29AM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> The progress.c code makes a hard assumption that only one progress bar >> be active at a time (see [1] for a bug where this wasn't the >> case). Add a BUG() that'll trigger if we ever regress on that promise >> and have two progress bars active at the same time. >> >> There was an alternative test-only approach to doing the same >> thing[2], but by doing this outside of a GIT_TEST_* mode we'll know >> we've put a hard stop to this particular API misuse. >> >> It will also establish scaffolding to address current fundamental >> limitations in the progress output: The current output must be >> "driven" by calls to the likes of display_progress(). Once we have a >> global current progress object we'll be able to update that object via >> SIGALRM. See [3] for early code to do that. >> >> It's conceivable that this change will hit the BUG() condition in some >> scenario that we don't currently have tests for, this would be very >> bad. If that happened we'd die just because we couldn't emit some >> pretty output. >> >> See [4] for a discussion of why our test coverage is lacking; our >> progress display is hidden behind isatty(2) checks in many cases, so >> the test suite doesn't cover it unless individual tests are run in >> "--verbose" mode, we might also have multi-threaded use of the API, so >> two progress bars stopping and starting would only be visible due to a >> race condition. >> >> Despite that, I think that this change won't introduce such >> regressions, because: >> >> 1. I've read all the code using the progress API (and have modified a >> large part of it in some WIP code I have). Almost all of it is really >> simple, the parts that aren't[5] are complex in the display_progress() part, >> not in starting or stopping the progress bar. >> >> 2. The entire test suite passes when instrumented with an ad-hoc >> Linux-specific mode (it uses gettid()) to die if progress bars are >> ever started or stopped on anything but the main thread[6]. >> >> Extending that to die if display_progress() is called in a thread >> reveals that we have exactly two users of the progress bar under >> threaded conditions, "git index-pack" and "git pack-objects". Both >> uses are straightforward, and they don't start/stop the progress >> bar when threads are active. >> >> 3. I've likewise done an ad-hoc test to force progress bars to be >> displayed with: >> >> perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)') > > I think your ad-hoc test might be a little more compelling if it was > easier to understand, which is to say, maybe if your Perl oneliner was > on more than one line, or had comments, or was in a different language. > Although you explain it right after, we kind of have to take your word > for it. I'll see if I can use sed or something, which is easy enough in this case. I just write Perl out of habit for this sort of thing (e.g. balanced braces & Perl-regexes make it much nicer). >> >> I.e. to replace all checks (not just for progress) of checking >> whether STDERR is connected to a TTY, and then monkeypatching >> is_foreground_fd() in progress.c to always "return 1". Running the >> tests with those applied, interactively and under -V reveals via: >> >> $ grep -e set_progress_signal -e clear_progress_signal test-results/*out >> >> That nothing our tests cover hits the BUG conditions added here, >> except the expected "BUG: start two concurrent progress bars" test >> being added here. >> >> That isn't entirely true since we won't be getting 100% coverage >> due to cascading failures from tests that expected no progress >> output on stderr. To make sure I covered 100% I also tried making >> the display() function in progress.c a NOOP on top of that (it's >> the calls to start_progress_delay() and stop_progress()) that >> matter. >> >> That doesn't hit the BUG() either. Some tests fail in that mode >> due to a combination of the overzealous isatty(2) munging noted >> above, and the tests that are testing that the progress output >> itself is present (but for testing I'd made display() a NOOP). >> >> Between those three points I think it's safe to go ahead with this >> change. > > One worry I had was that we might be painting ourselves into a corner > here if we did want to support the ability to do multiple progress bars > simultaneously (for example if we want to pull from multiple CDNs at the > same time when we're using promisor packfiles, and we expect those packs > to be large enough that we'd need to show a progress bar for each one). > However, I think the pattern - hang onto a pointer to the progress > objects, and complain if we get a signal and there are any still valid - > still holds well enough, so I'm ok with this change. Yes, and that's a thing I'd really like the progress code to be able to do too, and I've got some follow-up patches, but (somewhat paradoxically) in order to display multiple progress bars you need to first have a step like this to ensure that there is only ever one progress bar. The user only has one terminal, so we'll need to serialize our N progress bars to one "emitter", we'll need to teach the progress accounting to either have N parallel progress lines, or to simply make N number of "slave" "struct progress *" hang off it. I'm leaning towards the latter. > There are a couple patches in the middle which I didn't reply to, but I > did read them, and they were so tiny and mechanical that I did not have > useful comments to add. > > Thanks, it's nice to see progress here (ha ha ha). :) > Preferably with the BUG() message nit below, > Reviewed-by: Emily Shaffer <emilyshaffer@xxxxxxxxxx> > >> >> 1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09) >> 2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@xxxxxxxxx >> 3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@xxxxxxxxx/ >> 4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@xxxxxxxxx/ >> 5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into >> next, 2021-09-10) >> 6. https://lore.kernel.org/git/877dffg37n.fsf@xxxxxxxxxxxxxxxxxxx/ >> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> >> --- >> progress.c | 17 +++++++++++++---- >> t/t0500-progress-display.sh | 11 +++++++++++ >> 2 files changed, 24 insertions(+), 4 deletions(-) >> >> diff --git a/progress.c b/progress.c >> index 1ab7d19deb8..14a023f4b43 100644 >> --- a/progress.c >> +++ b/progress.c >> @@ -46,6 +46,7 @@ struct progress { >> }; >> >> static volatile sig_atomic_t progress_update; >> +static struct progress *global_progress; >> >> /* >> * These are only intended for testing the progress output, i.e. exclusively >> @@ -221,11 +222,15 @@ void progress_test_force_update(void) >> progress_interval(SIGALRM); >> } >> >> -static void set_progress_signal(void) >> +static void set_progress_signal(struct progress *progress) >> { >> struct sigaction sa; >> struct itimerval v; >> >> + if (global_progress) >> + BUG("should have no global_progress in set_progress_signal()"); >> + global_progress = progress; > > Can we make the BUG() message a little clearer? Even in the context of > the code, it's not clear that what this BUG() really means is "hey, you > forgot to call stop_progress on something" or "hey, you can't have two > progress bars at the same time". Even if you were to change the name of > 'global_progress' to 'existing_progress_bar' or something, I think that > would make the message more understandable. Willdo, thanks. >> + >> if (progress_testing) >> return; >> >> @@ -243,10 +248,14 @@ static void set_progress_signal(void) >> setitimer(ITIMER_REAL, &v, NULL); >> } >> >> -static void clear_progress_signal(void) >> +static void clear_progress_signal(struct progress *progress) >> { >> struct itimerval v = {{0,},}; >> >> + if (!global_progress) >> + BUG("should have a global_progress in clear_progress_signal()"); >> + global_progress = NULL; >> + >> if (progress_testing) >> return; >> >> @@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total, >> strbuf_init(&progress->counters_sb, 0); >> progress->title_len = utf8_strwidth(title); >> progress->split = 0; >> - set_progress_signal(); >> + set_progress_signal(progress); >> trace2_region_enter("progress", title, the_repository); >> return progress; >> } >> @@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg) >> display(progress, progress->last_value, buf); >> free(buf); >> } >> - clear_progress_signal(); >> + clear_progress_signal(progress); >> strbuf_release(&progress->counters_sb); >> if (progress->throughput) >> strbuf_release(&progress->throughput->display); >> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh >> index ffa819ca1db..124d33c96b3 100755 >> --- a/t/t0500-progress-display.sh >> +++ b/t/t0500-progress-display.sh >> @@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' ' >> test_cmp expect out >> ' >> >> +test_expect_success 'BUG: start two concurrent progress bars' ' >> + cat >in <<-\EOF && >> + start 0 one >> + start 0 two >> + EOF >> + >> + test_must_fail test-tool progress \ >> + <in 2>stderr && >> + grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr >> +' >> + >> test_expect_success 'progress generates traces' ' >> cat >in <<-\EOF && >> start 40 >> -- >> 2.33.0.1098.gf02a64c1a2d >>