Hi Ævar Thank you for condensing the patch series into something more palatable for reviewers. The general consensus from the review club yesterday (we looked at v2) was that it was difficult to follow what patches were relevant to your original intention and what patches were auxiliary QOL changes. Also having too many intermediary patches where you add variables/functions that were later deleted made it hard to visualize which parts of the patch would end up making it into the final state. I appreciate the "show your work" approach you take, but I think that approach is better suited for more difficult patch series where there are significant/difficult to understand functionality changes. In this case, the end state of a refactor is clear and there are no functionality changes so I believe a more condensed series would be easier to review. That being said, I don't believe there is a need to spend more time trying to condense this series so future reviewers have an easier time -- the end result and intentions are enough. Reviewed-by: Calvin Wan <calvinwan@xxxxxxxxxx> On Wed, Oct 12, 2022 at 2:03 PM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > > For a general overview see the v2 CL: > https://lore.kernel.org/git/cover-v2-00.22-00000000000-20221012T084850Z-avarab@xxxxxxxxx/ > > Changes since v2: > > * Ejected various not-to-the-point of converting to the "opts struct" > per feedback about attempting to make this leaner. > * I kept the size_t change & the online_cpus() fallback, and updated > a commit message for the latter. For "int" v.s. "size_t" once we're > not handling "-1" to mean "use the default" it makes sense to be > unsigned, and if we're re-doing that at this point we'd need > rewrites for "!processes" assumptions. > * Squashed the multi-step introduction of the "opts" struct, per > Phillip's suggestion. > * Fixed a segfault in the v2's 22/22 (this 15/15). > * Got rid of superfluous unsigned conversions of code related to the > "env" member. > > Ævar Arnfjörð Bjarmason (15): > run-command test helper: use "else if" pattern > run-command API: have "run_processes_parallel{,_tr2}()" return void > run-command tests: use "return", not "exit" > run-command API: make "n" parameter a "size_t" > run-command API: don't fall back on online_cpus() > run-command.c: use designated init for pp_init(), add "const" > run-command API: have run_process_parallel() take an "opts" struct > run-command API: move *_tr2() users to "run_processes_parallel()" > run-command.c: make "struct parallel_processes" const if possible > run-command.c: don't copy *_fn to "struct parallel_processes" > run-command.c: don't copy "ungroup" to "struct parallel_processes" > run-command.c: don't copy "data" to "struct parallel_processes" > run-command.c: use "opts->processes", not "pp->max_processes" > run-command.c: pass "opts" further down, and use "opts->processes" > run-command.c: remove "max_processes", add "const" to signal() handler > > builtin/fetch.c | 25 ++-- > builtin/submodule--helper.c | 16 ++- > hook.c | 23 ++-- > run-command.c | 236 ++++++++++++++++-------------------- > run-command.h | 71 ++++++++--- > submodule-config.c | 2 + > submodule.c | 18 ++- > t/helper/test-run-command.c | 77 +++++++----- > t/t5526-fetch-submodules.sh | 8 +- > 9 files changed, 268 insertions(+), 208 deletions(-) > > Range-diff against v2: > 1: bc51dfcb1be < -: ----------- hook tests: fix redirection logic error in 96e7225b310 > 2: 3027f5587a7 < -: ----------- submodule tests: reset "trace.out" between "grep" invocations > 3: c4923358bbd < -: ----------- run-command tests: test stdout of run_command_parallel() > 4: 26e28086252 = 1: d3a2489d9b2 run-command test helper: use "else if" pattern > 5: 5e09dc68fd9 = 2: a2e4fd652c1 run-command API: have "run_processes_parallel{,_tr2}()" return void > 6: e4e91dbbf9e = 3: 4a19de65783 run-command tests: use "return", not "exit" > 7: b90961ae76d < -: ----------- run-command.c: remove dead assignment in while-loop > 8: 279b0430c5d < -: ----------- run-command.c: use C99 "for (TYPE VAR = ..." syntax where useful > 9: a900711270c ! 4: 58018a79b2f run-command API: make "n" parameter a "size_t" > @@ Commit message > > make run-command.o DEVOPTS=extra-all CFLAGS=-Wno-unused-parameter > > - Only has one (and new) -Wsigned-compare warning, about a comparison of > - "i" to online_cpus(), a subsequent commit will adjust & deal with > - online_cpus() and that warning. > + Only has one (and new) -Wsigned-compare warning relevant to a > + comparison about our "n" or "{nr,max}_processes": About using our > + "n" (size_t) in the same expression as online_cpus() (int). A > + subsequent commit will adjust & deal with online_cpus() and that > + warning. > > The only users of the "n" parameter are: > > @@ run-command.c: static void pp_cleanup(struct parallel_processes *pp) > > for (i = 0; i < pp->max_processes; i++) > if (pp->children[i].state == GIT_CP_FREE) > -@@ run-command.c: static int pp_start_one(struct parallel_processes *pp) > - > - static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout) > - { > -- int i; > -- > - while (poll(pp->pfd, pp->max_processes, output_timeout) < 0) { > - if (errno == EINTR) > - continue; > @@ run-command.c: static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout) > } > > 10: eb9d672b0d8 ! 5: e230701dff6 run-command API: don't fall back on online_cpus() > @@ Commit message > child processor, 2015-12-15). > > Most of our code in-tree that scales up to "online_cpus()" by default > - calls that function by itself. By having these callers of the > - "run_processes_parallel()" API do the same we can in subsequent > - commits pass all arguments down as a "const struct". > + calls that function by itself. Keeping this default behavior just for > + the sake of two callers means that we'd need to maintain this one spot > + where we're second-guessing the config passed down into pp_init(). > > The preceding commit has an overview of the API callers that passed > "jobs = 0". There were only two of them (actually three, but they > @@ submodule-config.c: int parse_submodule_fetchjobs(const char *var, const char *v > > ## t/t5526-fetch-submodules.sh ## > @@ t/t5526-fetch-submodules.sh: test_expect_success 'fetching submodules respects parallel settings' ' > + GIT_TRACE=$(pwd)/trace.out git fetch && > + grep "8 tasks" trace.out && > GIT_TRACE=$(pwd)/trace.out git fetch --jobs 9 && > - grep "9 tasks" trace.out && > - >trace.out && > +- grep "9 tasks" trace.out > ++ grep "9 tasks" trace.out && > ++ >trace.out && > + > + GIT_TRACE=$(pwd)/trace.out git -c submodule.fetchJobs=0 fetch && > + grep "preparing to run up to [0-9]* tasks" trace.out && > 11: aedda10d8e1 ! 6: df2ca5dd097 run-command.c: use designated init for pp_init(), add "const" > @@ run-command.c: void run_processes_parallel(size_t n, > - ungroup); > + pp_init(&pp, get_next_task, start_failure, task_finished); > while (1) { > - for (int i = 0; > + for (i = 0; > i < spawn_cap && !pp.shutdown && > 12: fde2af11579 < -: ----------- run-command API: add nascent "struct run_process_parallel_opts" > 13: 01e894bed90 < -: ----------- run-command API: make run_process_parallel{,_tr2}() thin wrappers > 14: 41c2886b44b ! 7: eaed3d8838d run-command API: have run_process_parallel() take an "opts" struct > @@ Metadata > ## Commit message ## > run-command API: have run_process_parallel() take an "opts" struct > > - Have the "run_process_parallel()" function take an "opts" struct, > - which allows us to eliminate it as a wrapper for > - "run_processes_parallel_1()". > + As noted in fd3aaf53f71 (run-command: add an "ungroup" option to > + run_process_parallel(), 2022-06-07) which added the "ungroup" passing > + it to "run_process_parallel()" via the global > + "run_processes_parallel_ungroup" variable was a compromise to get the > + smallest possible regression fix for "maint" at the time. > + > + This follow-up to that is a start at passing that parameter and others > + via a new "struct run_process_parallel_opts", as the earlier > + version[1] of what became fd3aaf53f71 did. > + > + Since we need to change all of the occurrences of "n" to > + "opt->SOMETHING" let's take the opportunity and rename the terse "n" > + to "processes". We could also have picked "max_processes", "jobs", > + "threads" etc., but as the API is named "run_processes_parallel()" > + let's go with "processes". > > Since the new "run_processes_parallel()" function is able to take an > optional "tr2_category" and "tr2_label" via the struct we can at this > point migrate all of the users of "run_processes_parallel_tr2()" over > to it. > > - But let's not migrate all the API users, only the two users that > + But let's not migrate all the API users yet, only the two users that > passed the "ungroup" parameter via the > - "run_processes_parallel_ungroup" global, allowing us to delete that > - global in favor of passing "ungroup" via the "opts" struct. As noted > - in fd3aaf53f71 (run-command: add an "ungroup" option to > - run_process_parallel(), 2022-06-07) which added > - "run_processes_parallel_ungroup" passing this as a global was a hack > - to produce a small regression fix for "maint". > + "run_processes_parallel_ungroup" global > + > + 1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@xxxxxxxxx/ > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > > @@ run-command.c: enum child_state { > struct parallel_processes { > void *const data; > > +@@ run-command.c: static void handle_children_on_signal(int signo) > + } > + > + static void pp_init(struct parallel_processes *pp, > +- get_next_task_fn get_next_task, > +- start_failure_fn start_failure, > +- task_finished_fn task_finished) > ++ const struct run_process_parallel_opts *opts) > + { > +- const size_t n = pp->max_processes; > ++ const size_t n = opts->processes; > ++ get_next_task_fn get_next_task = opts->get_next_task; > ++ start_failure_fn start_failure = opts->start_failure; > ++ task_finished_fn task_finished = opts->task_finished; > + > + if (!n) > + BUG("you must provide a non-zero number of processes!"); > @@ run-command.c: static int pp_collect_finished(struct parallel_processes *pp) > return result; > } > > --static void run_processes_parallel_1(const struct run_process_parallel_opts *opts, void *pp_cb) > +-void run_processes_parallel(size_t n, > +- get_next_task_fn get_next_task, > +- start_failure_fn start_failure, > +- task_finished_fn task_finished, > +- void *pp_cb) > +void run_processes_parallel(const struct run_process_parallel_opts *opts) > { > - int code; > + int i, code; > int output_timeout = 100; > int spawn_cap = 4; > +- int ungroup = run_processes_parallel_ungroup; > struct parallel_processes pp = { > - .max_processes = opts->processes, > +- .max_processes = n, > - .data = pp_cb, > ++ .max_processes = opts->processes, > + .data = opts->data, > .buffered_output = STRBUF_INIT, > -- .ungroup = run_processes_parallel_ungroup, > +- .ungroup = ungroup, > + .ungroup = opts->ungroup, > }; > - /* options */ > - const char *tr2_category = opts->tr2_category; > - const char *tr2_label = opts->tr2_label; > - const int do_trace2 = tr2_category && tr2_label; > ++ /* options */ > ++ const char *tr2_category = opts->tr2_category; > ++ const char *tr2_label = opts->tr2_label; > ++ const int do_trace2 = tr2_category && tr2_label; > > - /* unset for the next API user */ > - run_processes_parallel_ungroup = 0; > -- > - if (do_trace2) > - trace2_region_enter_printf(tr2_category, tr2_label, NULL, > - "max:%d", opts->processes); > -@@ run-command.c: static void run_processes_parallel_1(const struct run_process_parallel_opts *opt > - trace2_region_leave(tr2_category, tr2_label, NULL); > ++ if (do_trace2) > ++ trace2_region_enter_printf(tr2_category, tr2_label, NULL, > ++ "max:%d", opts->processes); > + > +- pp_init(&pp, get_next_task, start_failure, task_finished); > ++ pp_init(&pp, opts); > + while (1) { > + for (i = 0; > + i < spawn_cap && !pp.shutdown && > +@@ run-command.c: void run_processes_parallel(size_t n, > + } > + if (!pp.nr_processes) > + break; > +- if (ungroup) { > ++ if (opts->ungroup) { > + for (size_t i = 0; i < pp.max_processes; i++) > + pp.children[i].state = GIT_CP_WAIT_CLEANUP; > + } else { > +@@ run-command.c: void run_processes_parallel(size_t n, > + } > + > + pp_cleanup(&pp); > ++ > ++ if (do_trace2) > ++ trace2_region_leave(tr2_category, tr2_label, NULL); > } > > --void run_processes_parallel(size_t processes, > -- get_next_task_fn get_next_task, > -- start_failure_fn start_failure, > -- task_finished_fn task_finished, > -- void *pp_cb) > --{ > -- const struct run_process_parallel_opts opts = { > -- .processes = processes, > -- .ungroup = run_processes_parallel_ungroup, > -- > -- .get_next_task = get_next_task, > -- .start_failure = start_failure, > -- .task_finished = task_finished, > -- }; > -- > -- run_processes_parallel_1(&opts, pp_cb); > --} > -- > - void run_processes_parallel_tr2(size_t processes, get_next_task_fn get_next_task, > +-void run_processes_parallel_tr2(size_t n, get_next_task_fn get_next_task, > ++void run_processes_parallel_tr2(size_t processes, get_next_task_fn get_next_task, > start_failure_fn start_failure, > task_finished_fn task_finished, void *pp_cb, > -@@ run-command.c: void run_processes_parallel_tr2(size_t processes, get_next_task_fn get_next_task > - const struct run_process_parallel_opts opts = { > - .tr2_category = tr2_category, > - .tr2_label = tr2_label, > -- > - .processes = processes, > -- .ungroup = run_processes_parallel_ungroup, > + const char *tr2_category, const char *tr2_label) > + { > +- trace2_region_enter_printf(tr2_category, tr2_label, NULL, "max:%d", n); > ++ const struct run_process_parallel_opts opts = { > ++ .tr2_category = tr2_category, > ++ .tr2_label = tr2_label, > ++ .processes = processes, > > - .get_next_task = get_next_task, > - .start_failure = start_failure, > - .task_finished = task_finished, > - }; > +- run_processes_parallel(n, get_next_task, start_failure, > +- task_finished, pp_cb); > ++ .get_next_task = get_next_task, > ++ .start_failure = start_failure, > ++ .task_finished = task_finished, > ++ }; > > -- run_processes_parallel_1(&opts, pp_cb); > +- trace2_region_leave(tr2_category, tr2_label, NULL); > + run_processes_parallel(&opts); > } > > int run_auto_maintenance(int quiet) > > ## run-command.h ## > -@@ run-command.h: struct run_process_parallel_opts > - * NULL to omit any special handling. > - */ > - task_finished_fn task_finished; > +@@ run-command.h: typedef int (*task_finished_fn)(int result, > + void *pp_task_cb); > + > + /** > +- * Runs up to n processes at the same time. Whenever a process can be > +- * started, the callback get_next_task_fn is called to obtain the data > ++ * Option used by run_processes_parallel(), { 0 }-initialized means no > ++ * options. > ++ */ > ++struct run_process_parallel_opts > ++{ > ++ /** > ++ * tr2_category & tr2_label: sets the trace2 category and label for > ++ * logging. These must either be unset, or both of them must be set. > ++ */ > ++ const char *tr2_category; > ++ const char *tr2_label; > ++ > ++ /** > ++ * processes: see 'processes' in run_processes_parallel() below. > ++ */ > ++ size_t processes; > ++ > ++ /** > ++ * ungroup: see 'ungroup' in run_processes_parallel() below. > ++ */ > ++ unsigned int ungroup:1; > ++ > ++ /** > ++ * get_next_task: See get_next_task_fn() above. This must be > ++ * specified. > ++ */ > ++ get_next_task_fn get_next_task; > ++ > ++ /** > ++ * start_failure: See start_failure_fn() above. This can be > ++ * NULL to omit any special handling. > ++ */ > ++ start_failure_fn start_failure; > ++ > ++ /** > ++ * task_finished: See task_finished_fn() above. This can be > ++ * NULL to omit any special handling. > ++ */ > ++ task_finished_fn task_finished; > + > + /** > + * data: user data, will be passed as "pp_cb" to the callback > + * parameters. > + */ > + void *data; > - }; > - > - /** > ++}; > ++ > ++/** > + * Options are passed via the "struct run_process_parallel_opts" above. > + * > - * Runs N 'processes' at the same time. Whenever a process can be > -- * started, the callback get_next_task_fn is called to obtain the data > ++ * Runs N 'processes' at the same time. Whenever a process can be > + * started, the callback opts.get_next_task is called to obtain the data > * required to start another child process. > * > * The children started via this function run in parallel. Their output > -@@ run-command.h: struct run_process_parallel_opts > + * (both stdout and stderr) is routed to stderr in a manner that output > + * from different tasks does not interleave (but see "ungroup" below). > + * > +- * start_failure_fn and task_finished_fn can be NULL to omit any > +- * special handling. > +- * > + * If the "ungroup" option isn't specified, the API will set the > + * "stdout_to_stderr" parameter in "struct child_process" and provide > + * the callbacks with a "struct strbuf *out" parameter to write output > +@@ run-command.h: typedef int (*task_finished_fn)(int result, > * NULL "struct strbuf *out" parameter, and are responsible for > * emitting their own output, including dealing with any race > * conditions due to writing in parallel to stdout and stderr. > @@ run-command.h: struct run_process_parallel_opts > - * API reads that setting. > */ > -extern int run_processes_parallel_ungroup; > --void run_processes_parallel(size_t processes, > +-void run_processes_parallel(size_t n, > - get_next_task_fn, > - start_failure_fn, > - task_finished_fn, > - void *pp_cb); > +-void run_processes_parallel_tr2(size_t n, get_next_task_fn, start_failure_fn, > +- task_finished_fn, void *pp_cb, > +void run_processes_parallel(const struct run_process_parallel_opts *opts); > - void run_processes_parallel_tr2(size_t processes, get_next_task_fn, > - start_failure_fn, task_finished_fn, void *pp_cb, > ++void run_processes_parallel_tr2(size_t processes, get_next_task_fn, > ++ start_failure_fn, task_finished_fn, void *pp_cb, > const char *tr2_category, const char *tr2_label); > + > + /** > > ## t/helper/test-run-command.c ## > @@ t/helper/test-run-command.c: static const char * const testsuite_usage[] = { > 15: 391d1d99d91 = 8: c19c60b2e95 run-command API: move *_tr2() users to "run_processes_parallel()" > 16: acac50cc1a5 = 9: 99a4f4f6b9c run-command.c: make "struct parallel_processes" const if possible > 17: fdd64236985 = 10: bf67e24bcc5 run-command.c: don't copy *_fn to "struct parallel_processes" > 18: 17f34d81ecd = 11: 06de42adc2e run-command.c: don't copy "ungroup" to "struct parallel_processes" > 19: 9cbee2dfe76 = 12: 3081dfc49d1 run-command.c: don't copy "data" to "struct parallel_processes" > 20: 2dabed9e155 ! 13: b7c10f6a23f run-command.c: use "opts->processes", not "pp->max_processes" > @@ run-command.c: static int pp_collect_finished(struct parallel_processes *pp, > > @@ run-command.c: void run_processes_parallel(const struct run_process_parallel_opts *opts) > while (1) { > - for (int i = 0; > + for (i = 0; > i < spawn_cap && !pp.shutdown && > - pp.nr_processes < pp.max_processes; > + pp.nr_processes < opts->processes; > 21: c1a286a8ebb ! 14: 4856d6a4674 run-command.c: pass "opts" further down, and use "opts->processes" > @@ run-command.c: static int pp_start_one(struct parallel_processes *pp, > + const struct run_process_parallel_opts *opts, > + int output_timeout) > { > -- while (poll(pp->pfd, pp->max_processes, output_timeout) < 0) { > -+ while (poll(pp->pfd, opts->processes, output_timeout) < 0) { > + int i; > + > +- while ((i = poll(pp->pfd, pp->max_processes, output_timeout)) < 0) { > ++ while ((i = poll(pp->pfd, opts->processes, output_timeout) < 0)) { > if (errno == EINTR) > continue; > - pp_cleanup(pp); > 22: 541f41566e7 ! 15: 39a20be0cbb run-command.c: remove "pp->max_processes", add "const" to signal() handler > @@ Metadata > Author: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > > ## Commit message ## > - run-command.c: remove "pp->max_processes", add "const" to signal() handler > + run-command.c: remove "max_processes", add "const" to signal() handler > > As with the *_fn members removed in a preceding commit, let's not copy > the "processes" member of the "struct run_process_parallel_opts" over > @@ run-command.c: static void pp_init(struct parallel_processes *pp, > - pp_for_signal = pp; > + pp_sig->pp = pp; > + pp_sig->opts = opts; > ++ pp_for_signal = pp_sig; > sigchain_push_common(handle_children_on_signal); > } > > @@ run-command.c: void run_processes_parallel(const struct run_process_parallel_opts *opts) > - int code; > + int i, code; > int output_timeout = 100; > int spawn_cap = 4; > + struct parallel_processes_for_signal pp_sig; > @@ run-command.c: void run_processes_parallel(const struct run_process_parallel_opt > - pp_init(&pp, opts); > + pp_init(&pp, opts, &pp_sig); > while (1) { > - for (int i = 0; > + for (i = 0; > i < spawn_cap && !pp.shutdown && > @@ run-command.c: void run_processes_parallel(const struct run_process_parallel_opts *opts) > continue; > -- > 2.38.0.971.ge79ff6d20e7 >