On Wed, May 25 2022, Johannes Schindelin wrote:

> Hi Ævar,
>
> as promised in the Git IRC Standup [*1*], a review.
>
> On Wed, 18 May 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> Ævar Arnfjörð Bjarmason (8):
>>   run-command tests: change if/if/... to if/else if/else
>>   run-command API: use "opts" struct for run_processes_parallel{,_tr2}()
>>   run-command tests: test stdout of run_command_parallel()
>>   run-command.c: add an initializer for "struct parallel_processes"
>>   run-command: add an "ungroup" option to run_process_parallel()
>>   hook tests: fix redirection logic error in 96e7225b310
>>   hook API: don't redundantly re-set "no_stdin" and "stdout_to_stderr"
>>   hook API: fix v2.36.0 regression: hooks should be connected to a TTY
>
> I started reviewing the patches individually, but have some
> higher-level concerns that put my per-patch review on hold.
>
> Keeping in mind that the intention is to fix a regression that was
> introduced by way of refactoring (most of our recent regressions seem
> to share that trait [*2*]), I strongly advise against another round of
> refactoring [*3*], especially against tying it to fix a regression.
>
> In this instance, it would be very easy to fix the bug without any
> refactoring. In a nutshell, the manifestation of the bug amplifies
> this part of the commit message of 96e7225b310 (hook: add 'run'
> subcommand, 2021-12-22):
>
>     Some of the implementation here, such as a function being named
>     run_hooks_opt() when it's tasked with running one hook, to using
>     the run_processes_parallel_tr2() API to run with jobs=1 is
>     somewhere between a bit odd and and an overkill for the current
>     features of this "hook run" command and the hook.[ch] API.
>
> It is this switch to `run_processes_parallel()` that is the root cause
> of the regression.

Yes, or more generally to the new hook API which makes use of it.

> The current iteration of the patch series does not fix that.

Because the plan is still to continue in this direction and go for
Emily's config-based hooks, which will run in parallel.

And fixing it that way would at this point be a larger functional
change, because we'd be running with more code we haven't tested
before, i.e. hook.[ch] on some new backend. So just passing down the
appropriate flags to have run-command.[ch] do the right thing for us
seemed to be the least bad option.

> In the commit message from which I quoted, the plan is laid out to
> eventually run more than one hook. If that is still the plan, we will
> be presented with the unfortunate choice to either never running them
> in parallel, or alternatively reintroducing the regression where the
> hooks run detached from stdin/stdout/stderr.

No, because you can have N processes all connected to a terminal with
"ungroup"; what you can't do is guarantee that they won't interleave.

But as discussed in some previous threads that would be OK, since it
would come as an opt-in to parallel hook execution. I.e. you could pick
one of:

 1. Current behavior
 2. Our parallel hook execution (whatever "ungroup" etc. settings that
    entails)
 3. Your own parallel hook execution

It only matters that we don't regress in #1; for #2 we could have
different behavior and just document the caveats as such.

IOW it's OK if you run parallel hooks and we decide that they won't be
connected to a terminal, because that's a new feature we don't have
yet, one you'd need to opt into.
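To make the "ungroup" part of that concrete, this is roughly what the
hook code ends up asking run-command.[ch] to do. It's a paraphrase, not
a copy of the patches; the struct and field names below are
illustrative and may not match the series exactly, only the hook.c
callbacks are the existing ones:

    /*
     * Sketch only: with "ungroup" the children inherit the parent's
     * stdin/stdout/stderr (so a TTY stays a TTY), instead of having
     * their output buffered and emitted as one block per process.
     * The trade-off is that nothing stops the output of concurrent
     * children from interleaving.
     */
    struct run_process_parallel_opts opts = {
        .jobs = 1,              /* today: one hook at a time */
        .ungroup = 1,           /* stay connected to the terminal */
        .get_next_task = pick_next_hook,
        .start_failure = notify_start_failure,
        .task_finished = notify_hook_finished,
        .data = &cb_data,
    };

    run_processes_parallel(&opts);

The point being that a future "jobs > 1" mode only changes how much
interleaving you're willing to accept, not whether the processes can be
connected to a terminal at all.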
> It is pretty clear that there is no actual choice, and the hooks will
> never be able to run in parallel. Therefore, the fix should move
> `run_hooks_opt()` away from calling `run_processes_parallel()`.

They will run in parallel, see above.

> In any case, regression fixes should not be mixed with refactorings
> unless the latter make the former easier, which is not the case here.

I noted upthread/side-thread (in any case, in discussions around this)
that I wished I'd come up with something smaller, but couldn't. If you
want to try your hand at that I'd love to see it.

But basically migrating the hook API to a new "backend" would also be a
large change, as would making the bare minimum change in
run-command.[ch]. But hey, I might be wrong. So if you think it's
obvious that this could be much smaller I'd love to see patches for
it...

> Footnote *1*:
> https://colabti.org/irclogger/irclogger_log/git-devel?date=2022-05-23#l44
>
> Footnote *2*: I say "seem" because it would take a proper retro to
> analyze what was the reason for the uptick in regressions, and even
> more importantly to analyze what we can learn from the experience.

Yes, that might be interesting. I'll only note that I think you're
focusing on the wrong thing here with "refactorings".

If you look at the history of this hooks API topic it started early on
with a version where the config-based hooks + parallelism (currently
not here yet) were combined with making the existing hook users use the
new API (partially here now).

Now, I suggested that be split up so that we'd first re-implement all
existing hooks on the new API, and *then* perform any feature changes.
Except of course by doing so that alters the nature of those changes in
your definition, I assume, i.e. it goes from a feature series to
"refactorings".

Whereas I think the important thing to optimize for is to make smaller
incremental changes. Here we had a bug, and it's relatively easy to fix
it; it would be much harder if we had a bigger delta in v2.36 with not
only this bug, but some other regressions as well.

Which isn't hypothetical BTW: until 3-4 months ago nobody had seen that
the config-based hooks topic we had kicking around had a severe
performance regression. I found it, and Emily & I have been kicking
around a fix for it (mostly off-list). If we'd shipped that we'd have a
more broken release, but we also wouldn't have "refactorings", i.e. the
run_parallel API would actually be used, but we'd have this breakage
plus some others.

Anyway, I think there's lots of things we could probably do better in
delivering more reliable software. I'm just pointing out that picking
one part of a larger progression from A..B and calling it bad because
it refactored something is a categorical mistake: re-doing those steps
so that none of them involves a refactoring would result in larger
change deltas.

> Footnote *3*: The refactoring, as Junio suspected, might very well go
> a bit over board. Even if a new variation of the
> `run_processes_parallel()` function that takes a struct should be
> necessary, it would be easy -- and desirable -- to keep the current
> function signatures unchanged and simply turn them into shims that
> then call the new variant. That would make the refactoring much easier
> to review, and in turn it would make it less likely to introduce
> another regression.

Sure, we could instead add a third variant to it in addition to the two
on "master", instead of unifying them as is done here.
But per the v1 feedback the consensus seemed to be that this was a good
direction, and to the extent that there were objections it was that I
should add the rest of the arguments to the "opts" struct.

But again, I'm fully open to that. I tried it and didn't think the end
result was any simpler to review, but perhaps you'd like to try...
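Just so we're talking about the same thing: the shim approach you're
describing would look something like the below. An untested sketch:
"run_processes_parallel_opts()" and the struct/field names are made up
here for illustration, only the two existing signatures are the ones on
"master":

    int run_processes_parallel(int n, get_next_task_fn get_next_task,
                               start_failure_fn start_failure,
                               task_finished_fn task_finished, void *pp_cb)
    {
        /* Shim: forward to the (hypothetical) opts-based variant */
        struct run_process_parallel_opts opts = {
            .jobs = n,
            .get_next_task = get_next_task,
            .start_failure = start_failure,
            .task_finished = task_finished,
            .data = pp_cb,
        };

        return run_processes_parallel_opts(&opts);
    }

    int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
                                   start_failure_fn start_failure,
                                   task_finished_fn task_finished,
                                   void *pp_cb, const char *tr2_category,
                                   const char *tr2_label)
    {
        /* Same, but also carrying the trace2 category/label */
        struct run_process_parallel_opts opts = {
            .jobs = n,
            .get_next_task = get_next_task,
            .start_failure = start_failure,
            .task_finished = task_finished,
            .data = pp_cb,
            .tr2_category = tr2_category,
            .tr2_label = tr2_label,
        };

        return run_processes_parallel_opts(&opts);
    }

That would keep the existing call sites untouched, at the cost of
carrying three entry points instead of unifying them as the series as
posted does.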