[If this has already gone through multiple times, I apologize for the repetition; I have had a hard time getting GMail to send this. Past versions had attachments, which I believe contributed to the failures. This one has none, but links to all the content.]

Hello all,

I have a concern about the performance of git-status with many (~38) submodules.

As part of a (large-scale) system dynamics class, I was tasked with identifying a performance problem, tracing it using KUTrace(2)[3], and subsequently investigating it. I ended up with some unique observations about git-status and submodules[2]. The interactive HTML traces are available on Google Drive[4][5].

I won't recreate all the details here, but I would encourage you to play with the traces, or at least go through the slides[1].

### The short version

Git status is slow(3).

### Baseline

- time git-status, with many submodules, and --ignore-submodules=none
  0.497s
- time git-status in non-submodule heavy repos
  0.014s

### What I consider a temporary fix

- time git-status, with many submodules, and --ignore-submodules=all
  0.026s

### What I would like to see

I would like to improve the git-status performance with this many submodules, so that I can remove diff.ignoreSubmodules=all from my config (the submodule status is useful information, and the setting affects many commands). I would be willing to work on a discussed and designed fix.

### What I am curious about

From the traces (linked below), it appears that git-status suffers from a lack of (possibly embarrassing) parallelism: I would expect each submodule to be independently checkable, but the process section of the trace shows them executing serially (for reasons unknown to me). The apparent need to fork/exec many processes in this way also appears to be a source of latency, along with the very large number of filesystem-related syscalls (if my understanding is correct).

What can we do to fix this? Is there a reason for this (really terribly slow) serial execution?
Is this something developers haven't bothered to optimize ("unexpected use case")? If so, I would like to discuss taking a crack at it, because I do have at least one repository with this many submodules, and I care about its performance.

---

Notes

1) All timings were taken with the https://github.com/benknoble/Dotfiles repo from around commit da194a8f4104a9fc74e8895ebc8512434f07d393
2) KUTrace is a set of kernel patches and userspace programs that provide low-overhead tracing, as well as tools for post-processing those traces
3) Timings taken on my machine (2012 MacBook Pro; can provide more details if requested)

---

Links

[1]: https://docs.google.com/presentation/d/1z-6ffE9KY-Jswl2BiWzYV2DG6fOutgWSi_aZ5uql__s/edit?usp=sharing
[2]: https://benknoble.github.io/blog/2019/11/07/git-stat/
[3]: https://github.com/dicksites/KUtrace
[4]: https://drive.google.com/file/d/1JyYO420yWp7XvNJJ8HLOPU0o6mesSKZf/view?usp=sharing
[5]: https://drive.google.com/file/d/1BqqxH0PRCYz_vvYkBBFpbL5dkFTLPyuK/view?usp=sharing
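
---

To make the parallelism question concrete, here is a rough Python sketch of checking each submodule independently and concurrently. It is only an illustration of the idea, not how git-status is implemented and not a proposed patch. The function names (`parse_submodule_paths`, `list_submodules`, `submodule_is_dirty`, `check_all`, `report`) and the worker count are hypothetical, and running `git status --porcelain` inside each submodule only approximates the per-submodule check: it does not, for example, notice a submodule whose checked-out commit differs from the one recorded in the superproject.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def parse_submodule_paths(config_output):
    """Extract paths from `git config --get-regexp` output.

    Each line is expected to look like: submodule.<name>.path <path>
    """
    paths = []
    for line in config_output.splitlines():
        key, _, value = line.partition(" ")
        if key.startswith("submodule.") and key.endswith(".path") and value:
            paths.append(value)
    return paths


def list_submodules(repo):
    """Return the submodule paths recorded in <repo>/.gitmodules."""
    result = subprocess.run(
        ["git", "-C", repo, "config", "--file", ".gitmodules",
         "--get-regexp", r"^submodule\..*\.path$"],
        capture_output=True, text=True, check=False,
    )
    return parse_submodule_paths(result.stdout)


def submodule_is_dirty(repo, path):
    """Approximate check: non-empty porcelain output means the submodule is dirty."""
    result = subprocess.run(
        ["git", "-C", f"{repo}/{path}", "status", "--porcelain"],
        capture_output=True, text=True, check=False,
    )
    return bool(result.stdout.strip())


def check_all(repo, paths, check=submodule_is_dirty, workers=8):
    """Check every submodule concurrently; returns {path: is_dirty}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads are enough here: each worker just waits on a child process.
        results = list(pool.map(lambda p: check(repo, p), paths))
    return dict(zip(paths, results))


def report(repo="."):
    """Time the concurrent check and print the dirty submodules."""
    paths = list_submodules(repo)
    start = time.perf_counter()
    status = check_all(repo, paths)
    elapsed = time.perf_counter() - start
    print(f"{len(paths)} submodules checked in {elapsed:.3f}s")
    for path, dirty in status.items():
        if dirty:
            print(f"  modified: {path}")
```

Calling `report(".")` from the top of a superproject prints the elapsed time, which can be compared against a plain `time git status`. Note that this still forks one process per submodule, so it addresses only the serial execution, not the fork/exec or syscall overhead.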