"D. Ben Knoble" <ben.knoble@xxxxxxxxx> writes: > ### What I am curious about > > From the traces (attached), it appears that git-status suffers from a lack of > (possibly embarrassing) parallelism: I would expect each submodule to be > independently check-able, ... > ... > What can we do to fix this? Is there a reason for this (really terribly slow) > serial execution? Is this something developers haven't bothered to optimize > ("unexpected use case")? If so, I would like to discuss taking a crack at it, > because I do have at least one repository with this many submodules, and I > care about its performance. Nice to hear from somebody who cares about improving submodule support. I offhand do not think of a reason why we inherently have to process them serially. But the way "git status" code is structured, it probably takes a bit of preparatory refactoring. If I recall correctly, it walks each path in the index in the superproject and notes how the file in the working tree is different from that of the index and the HEAD, under the assumption that inspection of each path is relatively cheap and at the same cost. You'd first need to restructure that part so that inspecting groups of index entries can be sharded to separate subprocesses while the parent process waits, and have them report to the parent process, and let the parent process continue with the aggregated result, or something like that. Thanks.