I haven't verified if the code in this version is correct or not, as I found it a bit difficult to follow through the churn. After reading this series again, I've established a better mental model of the code, and I think there are some renames and documentation changes we can make to make this clearer. Unfortunately, I think the biggest clarification would be _yet_ another refactor, and I'm not sure if we actually want to bear so much churn. I might do this refactor locally to see if it really is _much_ cleaner or not. If anyone has thoughts on the refactor, do chime in. Calvin Wan <calvinwan@xxxxxxxxxx> writes: > diff --git a/diff-lib.c b/diff-lib.c > index 744ae98a69..7fe6ced950 100644 > --- a/diff-lib.c > +++ b/diff-lib.c > @@ -65,26 +66,41 @@ static int check_removed(const struct index_state *istate, const struct cache_en > * Return 1 when changes are detected, 0 otherwise. If the DIRTY_SUBMODULES > * option is set, the caller does not only want to know if a submodule is > * modified at all but wants to know all the conditions that are met (new > - * commits, untracked content and/or modified content). > + * commits, untracked content and/or modified content). If > + * defer_submodule_status bit is set, dirty_submodule will be left to the > + * caller to set. defer_submodule_status can also be set to 0 in this > + * function if there is no need to check if the submodule is modified. > */ > static int match_stat_with_submodule(struct diff_options *diffopt, > const struct cache_entry *ce, > struct stat *st, unsigned ce_option, > - unsigned *dirty_submodule) > + unsigned *dirty_submodule, int *defer_submodule_status, > + unsigned *ignore_untracked) > { > int changed = ie_match_stat(diffopt->repo->index, ce, st, ce_option); > + int defer = 0; > + > if (S_ISGITLINK(ce->ce_mode)) { > struct diff_flags orig_flags = diffopt->flags; > if (!diffopt->flags.override_submodule_config) > set_diffopt_flags_from_submodule_config(diffopt, ce->name); > - if (diffopt->flags.ignore_submodules) > + if (diffopt->flags.ignore_submodules) { > changed = 0; > - else if (!diffopt->flags.ignore_dirty_submodules && > - (!changed || diffopt->flags.dirty_submodules)) > - *dirty_submodule = is_submodule_modified(ce->name, > - diffopt->flags.ignore_untracked_in_submodules); > + } else if (!diffopt->flags.ignore_dirty_submodules && > + (!changed || diffopt->flags.dirty_submodules)) { > + if (defer_submodule_status && *defer_submodule_status) { > + defer = 1; > + *ignore_untracked = diffopt->flags.ignore_untracked_in_submodules; > + } else { > + *dirty_submodule = is_submodule_modified(ce->name, > + diffopt->flags.ignore_untracked_in_submodules); > + } > + } > diffopt->flags = orig_flags; > } > + > + if (defer_submodule_status) > + *defer_submodule_status = defer; The crux of this patch is that we are replacing some serial operation with a parallel operation. The replacement happens here, where we are replacing is_submodule_modified() by 'deferring' it. So to verify if the parallel implementation is correct, we should compare the "setup" and "finish" steps in is_submodule_modified() and get_submodules_status(). Eyeballing it, it looks correct, especially because we made sure to refactor out the shared logic in previous patches. To reflect this, I think it would be clearer to rename get_submodules_status() to something similar (e.g. are_submodules_modified_parallel()), with an explicit comment saying that it is meant to be a parallel implementation of is_submodule_modified(). Except, I told a little white lie in the previous paragraph, because get_submodules_status() isn't _just_ a parallel implementation of is_submodule_modified()... > @@ -268,13 +286,52 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > } > > changed = match_stat_with_submodule(&revs->diffopt, ce, &st, > - ce_option, &dirty_submodule); > + ce_option, NULL, > + &defer_submodule_status, > + &ignore_untracked); > newmode = ce_mode_from_stat(ce, st.st_mode); > + if (defer_submodule_status) { > + struct submodule_status_util tmp = { > + .changed = changed, > + .dirty_submodule = 0, > + .ignore_untracked = ignore_untracked, > + .newmode = newmode, > + .ce = ce, > + .path = ce->name, > + }; > + struct string_list_item *item; > + > + item = string_list_append(&submodules, ce->name); > + item->util = xmalloc(sizeof(tmp)); > + memcpy(item->util, &tmp, sizeof(tmp)); > + continue; > + } because get_submodules_status() doesn't just contain the results of the parallel processes, it is _also_ shuttling "changed" and "ignore_untracked" from match_stat_with_submodule(), as well as .newmode, .ce and .path from run_diff_files() (basically everything except .dirty_submodule)... > } > > - record_file_diff(&revs->diffopt, newmode, dirty_submodule, > - changed, istate, ce); > + if (!defer_submodule_status) > + record_file_diff(&revs->diffopt, newmode, 0, > + changed,istate, ce); > + } > + if (submodules.nr) { > + unsigned long parallel_jobs; > + struct string_list_item *item; > + > + if (git_config_get_ulong("submodule.diffjobs", ¶llel_jobs)) > + parallel_jobs = 1; > + else if (!parallel_jobs) > + parallel_jobs = online_cpus(); > + > + if (get_submodules_status(&submodules, parallel_jobs)) > + die(_("submodule status failed")); > + for_each_string_list_item(item, &submodules) { > + struct submodule_status_util *util = item->util; > + > + record_file_diff(&revs->diffopt, util->newmode, > + util->dirty_submodule, util->changed, > + istate, util->ce); > + } so that we can pass all of this back into record_file_diff(). The only member that is changed by the parallel process is .dirty_submodule, which is exactly what we would expect from a parallel version of is_submodule_modified(). If we don't want to do a bigger refactor, I think we should also add comments to members of "struct submodule_status_util" to document where they come from and what they are used for. The rest of the comments are refactor-related. It would be good if we could avoid mixing unrelated information sources in "struct submodule_status_util", since a) this makes it very tightly coupled to run_diff_files() and b) it causes us to repeat ourselves in the same function (.changed = changed, record_file_diff()). The only reason why the code looks this way right now is that match_stat_with_submodule() sets defer_submodule_status based on whether or not we should ignore the submodule, and this eventually tells get_submodule_status() what submodules it needs to care about. But, deciding whether to spawn a subprocess for which submodule is exactly what the .get_next_task member is for. > diff --git a/submodule.c b/submodule.c > index 426074cebb..6f6e150a3f 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -1981,6 +1994,121 @@ unsigned is_submodule_modified(const char *path, int ignore_untracked) > return dirty_submodule; > } > > +static struct status_task * > +get_status_task_from_index(struct submodule_parallel_status *sps, > + struct strbuf *err) > +{ > + for (; sps->index_count < sps->submodule_names->nr; sps->index_count++) { > + struct submodule_status_util *util = sps->submodule_names->items[sps->index_count].util; > + struct status_task *task; > + > + if (!verify_submodule_git_directory(util->path)) > + continue; So right here, we could use the "check if this submodule should be ignored" logic form match_stat_with_submodule() to decide whether or not to spawn the subprocess. IOW, I am advocating for get_submodules_status() to be a parallel version of match_stat_with_submodule() (not a parallel version of is_submodule_modified() that shuttles extra information). Another sign that this refactor is a good idea is that it lets us simplify _existing_ submodule logic in run_diff_files(). Prior to this patch, we have: unsigned dirty_submodule = 0; ... changed = match_stat_with_submodule(&revs->diffopt, ce, &st, ce_option, NULL, &defer_submodule_status, &ignore_untracked); // If submodule was deferred, shuttle a bunch of information // If not, call record_file_diff() but the body of match_stat_with_submodule() is just ie_match_stat() + some additional submodule logic. Post refactor, this would look something like: struct string_list submodules; ... // For any submodule, just append it to a list and let the // parallel thing take care of it. if (S_ISGITLINK(ce->ce_mode) { // Probably pass .newmode and .ce to the util too... string_list_append(submodules, ce->name); } else { changed = ie_match_stat(foo, bar, baz); record_file_diff(); } ... if (submodules.nr) { parallel_match_stat_with_submodule_wip_name(&submodules); for_each_string_list_item(item, &submodules) { record_file_diff(&item); } } Which I think is easier to follow, since we won't need defer_submodule_status any more, and we don't shuttle information from match_stat_with_submodule(). Though I'm a bit unhappy that it's still pretty coupled to run_diff_files() (it still has to shuttle .newmode, .ce). Also, I don't think this refactor lets us avoid the refactors we did in the previous patches. > + > + task = xmalloc(sizeof(*task)); > + task->path = util->path; > + task->ignore_untracked = util->ignore_untracked; > + strbuf_init(&task->out, 0); > + sps->index_count++; > + return task; > + } > + return NULL; > +} > + > +static int get_next_submodule_status(struct child_process *cp, > + struct strbuf *err, void *data, > + void **task_cb) > +{ > + struct submodule_parallel_status *sps = data; > + struct status_task *task = get_status_task_from_index(sps, err); As an aside, I think we can inline get_status_task_from_index(). I suspect this pattern was copied from get_next_submodule(), which gets fetch tasks from two different places (hence _from_index and _from_changed), but here I don't think we will ever get status tasks from more than one place.