On Fri, Jun 10 2022, Glen Choo via GitGitGadget wrote:

> As a follow up to ar/submodule-update [1] and its successors
> gc/submodule-update-part* [2] [3], this series converts the last remaining
> piece of "git submodule update" into C, namely, the option parsing in
> git-submodule.sh.

Aside at the end at [2].

> As a result, git-submodule.sh::cmd_update() is now an (almost) one-liner:
>
> cmd_update() { git ${wt_prefix:+-C "$wt_prefix"} submodule--helper update
> ${wt_prefix:+--prefix "$wt_prefix"}
> "$@" }
>
> and best of all, "git submodule update" now shows a usage string for its own
> subcommand instead of a giant usage string for all of "git submodule" :)
>
> Given how many options "git submodule update" accepts, this series takes a
> gradual approach:
>
> 1. Create a variable opts, which holds the literal options we want to pass
>    to "git submodule--helper update". Then, for each option...
> 2. If "git submodule--helper update" already understands the string option,
>    append it to opts and remove any special handling (1-3/8).
> 3. Otherwise, if the option makes sense, teach "git submodule--helper
>    update" to understand the option. Goto 2. (4-5/8).
> 4. Otherwise, if the option makes no sense, drop it (6/8).
> 5. When we've processed all options, delete all the option parsing code
>    (7/8) and clean up (8/8).

That's quite the timing coincidence. I hacked this up yesterday,
thinking that the submodule topic had been too quiet for a while, and
wondering how hard it was to convert the rest of git-submodule.sh.

It's more than 2x the length of yours, but gets to the point where we
can "git rm git-submodule.sh".

Some brief comparison/commentary:

> Glen Choo (8):
>   submodule update: remove intermediate parsing
>   submodule update: pass options containing "[no-]"
>   submodule update: pass options with stuck forms

Yeah, this is the alternate approach I considered and ended up
discarding. I.e.
to make forward progress with migrating things away from the cmd_*()
functions you either have to prepare things in advance and then pull
the rug out from under them in one go, or, as you're doing here, teach
them about the options they're not-really-parsing anymore but must
still know about, because they're in a loop that ends with a "if
unknown option, usage".

>   submodule update: pass --require-init and --init

Almost the same as my 12/20.

>   submodule--helper update: use one param per type

Same as my 13/20, but I ended up doing it in a more narrow/smaller
way. I tried your way and ran into some bug, then figured I'd do it
more narrowly instead of debugging it.

>   submodule update: remove -v, pass --quiet

Hrm, so we don't need it at all then. Well, that's a bit simpler than my
1[45]/20 and 17/20 :)

So yeah, definitely RFC-quality, but I ran into that one test that used
-v, and then saw the missing docs etc. But no cheating, so I've left it
in :)

I do wonder if we should leave it in anyway: we never documented -v, but
we *did* understand it, and if you look at:

    git log -p -Gsay -- git-submodule.sh

We used to have a lot more code impacted by it, but looking at this
again now it would only have mattered for users of command-lines like:

    git submodule --quiet update -v [...]

I.e. where we already set the flag to the non-default quiet, and then
used -v to flip it. I think at this point I've talked myself into "let's
just remove it", but maybe...

>   submodule update: stop parsing options in .sh

Same effect as my 16/20, but it's the last one I converted, the
cmd_update() case being the trickiest.

>   submodule update: remove never-used expansion

Same as my 02/20, but as seen there I think you missed several "prefix"
non-uses.
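To make the trade-off I described at the top concrete, here's a minimal
sketch of the "teach the loop about pass-through options" shape. It is
not code from either series: the function name collect_update_opts and
the exact set of options listed are made up for illustration, and it
prints the helper command instead of running it. It only handles
valueless flags; anything taking an argument would need proper quoting.

```shell
#!/bin/sh
# Hypothetical sketch, not code from either series. The wrapper no longer
# acts on these options itself, but it still has to enumerate them, because
# the catch-all arm below treats anything unlisted as a usage error.

collect_update_opts () {
	opts=
	while test $# -ne 0
	do
		case "$1" in
		-q|--quiet|--init|--remote|-N|--no-fetch|-f|--force)
			# Assumed to be understood by the helper: forward verbatim.
			opts="$opts $1"
			;;
		--)
			shift
			break
			;;
		-*)
			# The arm that forces the wrapper to know every option.
			echo "usage: unknown option '$1'" >&2
			return 1
			;;
		*)
			# First non-option argument: the rest are paths.
			break
			;;
		esac
		shift
	done
	# Print rather than run, so the sketch has no side effects.
	echo "git submodule--helper update${opts:+$opts}${1:+ -- $*}"
}
```

The point being that every option the helper grows has to be added to
that first "case" arm too, until the whole loop can be deleted in one
go.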
Brief commentary on my patches, details in commit messages:

Ævar Arnfjörð Bjarmason (20):
  git-submodule.sh: remove unused sanitize_submodule_env()
  git-submodule.sh: remove unused $prefix variable
  git-submodule.sh: remove unused --super-prefix logic

I removed a bit more dead code here than yours.

  git-submodule.sh: normalize parsing of "--branch"
  git-submodule.sh: normalize parsing of --cached

This & various other prep commits (hereafter "easy prep") make
subsequent one-time conversions of whole cmd_*() easier.

  submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
  git-submodule.sh: create a "case" dispatch statement

easy prep

  submodule--helper: pretend to be "git submodule" in "-h" output

easy prep & bug fix for existing (on master) output bugs.

  git-submodule.sh: dispatch "sync" to helper
  git-submodule.sh: dispatch directly to helper
  git-submodule.sh: dispatch "foreach" to helper

These are easy conversions as the options 1=1 map after the above prep.

  submodule--helper: have --require-init imply --init
  submodule--helper: understand --checkout, --merge and --rebase synonyms
  git-submodule doc: document the -v" option to "update"
  submodule--helper: understand -v option for "update"

not-so-easy prep for "cmd_update()"

  git-submodule.sh: dispatch "update" to helper

Full cmd_update() migration in one go.

  git-submodule.sh: use "$quiet", not "$GIT_QUIET"

"easy prep", but this one is less overall churn if done at the end, but
as noted above could/should maybe be dropped entirely.

  git-submodule.sh: simplify parsing loop

Not really needed, but I wanted to get the code as close to minimal for
the next step, to eyeball the resulting sh v.s. C version.

  submodule: make it a built-in, remove git-submodule.sh

We now have a builtin/submodule.c *and* the current
builtin/submodule--helper.c, and we even dispatch to "git
submodule--helper" via run_command()!
The idea is to be as close as possible to a bug-for-bug implementation
of the shellscript, so that reviewers can be confident in tracing what
commands we invoked before/after: we're invoking the same "git
submodule--helper" commands.

Of course we eventually want to get to some full union of
builtin/submodule{,--helper}.c, but that can wait.

  submodule: add a subprocess-less submodule.useBuiltin setting

Wait, a useBuiltin setting to switch between two built-ins? Yeah, maybe
it makes little sense, but here we get rid of the run_command()
overhead, and could generally use the built-in to experiment with deeper
integration between the two.

The "real" built-in is around 2x faster than the run_command() version,
and more than 6x as fast as the shellscript version on basic overhead
(the run_command() version is itself roughly 3x as fast as the
shellscript), to the extent that anyone cares about "git submodule"
overhead. See [1] at the end for a benchmark.

That last change adds a CI target for
GIT_TEST_SUBMODULE_USE_BUILTIN=true, full CI run here:
https://github.com/avar/git/actions/runs/2472131257

 Documentation/config/submodule.txt |   4 +
 Documentation/git-submodule.txt    |   8 +-
 Makefile                           |   2 +-
 builtin.h                          |   1 +
 builtin/submodule--helper.c        | 118 +++---
 builtin/submodule.c                | 169 ++++++++
 ci/run-build-and-tests.sh          |   1 +
 git-sh-setup.sh                    |   7 -
 git-submodule.sh                   | 637 -----------------------------
 git.c                              |   1 +
 submodule.c                        |   2 +-
 t/README                           |   4 +
 12 files changed, 255 insertions(+), 699 deletions(-)
 create mode 100644 builtin/submodule.c
 delete mode 100755 git-submodule.sh

1.
    GIT_TEST_SUBMODULE_USE_BUILTIN=true git hyperfine -L rev origin/master,HEAD~0 -L v false,true -s 'make CFLAGS=-O3' 'GIT_TEST_SUBMODULE_USE_BUILTIN={v} ./git --exec-path=$PWD submodule status' -r 20
    Benchmark 1: GIT_TEST_SUBMODULE_USE_BUILTIN=false ./git --exec-path=$PWD submodule status' in 'origin/master
      Time (mean ± σ):      40.9 ms ±   0.3 ms    [User: 33.3 ms, System: 9.7 ms]
      Range (min … max):    40.2 ms …  41.5 ms    20 runs

    Benchmark 2: GIT_TEST_SUBMODULE_USE_BUILTIN=false ./git --exec-path=$PWD submodule status' in 'HEAD~0
      Time (mean ± σ):      12.4 ms ±   0.1 ms    [User: 9.9 ms, System: 2.5 ms]
      Range (min … max):    12.2 ms …  12.7 ms    20 runs

    Benchmark 3: GIT_TEST_SUBMODULE_USE_BUILTIN=true ./git --exec-path=$PWD submodule status' in 'origin/master
      Time (mean ± σ):      40.9 ms ±   0.5 ms    [User: 35.6 ms, System: 7.2 ms]
      Range (min … max):    40.1 ms …  41.8 ms    20 runs

    Benchmark 4: GIT_TEST_SUBMODULE_USE_BUILTIN=true ./git --exec-path=$PWD submodule status' in 'HEAD~0
      Time (mean ± σ):       6.4 ms ±   0.1 ms    [User: 3.9 ms, System: 2.5 ms]
      Range (min … max):     6.3 ms …   6.6 ms    20 runs

    Summary
      'GIT_TEST_SUBMODULE_USE_BUILTIN=true ./git --exec-path=$PWD submodule status' in 'HEAD~0' ran
        1.94 ± 0.03 times faster than 'GIT_TEST_SUBMODULE_USE_BUILTIN=false ./git --exec-path=$PWD submodule status' in 'HEAD~0'
        6.40 ± 0.11 times faster than 'GIT_TEST_SUBMODULE_USE_BUILTIN=true ./git --exec-path=$PWD submodule status' in 'origin/master'
        6.40 ± 0.10 times faster than 'GIT_TEST_SUBMODULE_USE_BUILTIN=false ./git --exec-path=$PWD submodule status' in 'origin/master'

2. Aside: I don't think these ever made it on-list but Atharva's version
   of what we're trying to do here is at:
   https://github.com/tfidfwastaken/git/tree/submodule-make-builtin-2

   I'd looked those over at some distant point in the past, and skimmed
   them again yesterday, but thought they were too much all-at-once to
   be confident in testing it myself, hence coming up with this
   alternate & smaller approach.

-- 
2.36.1.1178.gb5b1747c546