Hi Tom, On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote: > From: Tom Clarkson <tom@xxxxxxxxxxxxxx> > > Include recursion depth in debug logs so we can see when the recursion is > getting out of hand. > > Making the cache handle null mappings correctly and adding older commits > to the cache allows the recursive algorithm to terminate at any point on > mainline rather than needing to reach either the add point or the initial > commit. Makes sense. > diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh > index 9867718503..160bad95c1 100755 > --- a/contrib/subtree/git-subtree.sh > +++ b/contrib/subtree/git-subtree.sh > @@ -244,7 +244,7 @@ check_parents () { > do > if ! test -r "$cachedir/notree/$miss" > then > - debug " incorrect order: $miss" > + debug " unprocessed parent commit: $miss ($indent)" Without any context, it is hard to understand what the `$indent` variable is supposed to mean, so it is unclear why we need to print it here. I _guess_ it is the degree removed from the first-parent lineage? In any case, it does not hurt here, so I trust that it is good to include it in the debug output. > process_split_commit "$miss" "" "$indent" > fi > done > @@ -392,6 +392,24 @@ find_existing_splits () { > done > } > > +find_mainline_ref () { > + debug "Looking for first split..." > + dir="$1" > + revs="$2" The `git-subtree` script seems to rely on the `local` construct, using it in plenty of other circumstances. How about using it here, too? > + > + git log --reverse --grep="^git-subtree-dir: $dir/*\$" \ > + --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs | Since all you are interested in is the `git-subtree-mainline:` trailer, wouldn't a format like `%(trailers:key=git-subtree-mainline)` instead of `START %H%n%s%n%n%b%nEND%n`? See https://git-scm.com/docs/git-log#Documentation/git-log.txt-emtrailersoptionsem for more information about pretty formats. BTW I am super unfamiliar with `git subtree`'s inner workings, and therefore it would help me incredibly if the commit message talked a bit about the commit message layout (with a particular eye on `git-subtree-dir` and `git-subtree-mainline` which I guess are trailers added by `git subtree`?)... > + while read a b junk > + do > + case "$a" in > + git-subtree-mainline:) > + echo "$b" > + return > + ;; > + esac > + done > +} > + > copy_commit () { > # We're going to set some environment vars here, so > # do it in a subshell to get rid of them safely later > @@ -646,9 +664,9 @@ process_split_commit () { > > progress "$revcount/$revmax ($createcount) [$extracount]" > > - debug "Processing commit: $rev" > + debug "Processing commit: $rev ($indent)" > exists=$(cache_get "$rev") > - if test -n "$exists" > + if test -z "$(cache_miss "$rev")" > then > debug " prior: $exists" I do not see the `exists` variable being used other than for the debug statement. Maybe better something like this? debug " prior found for $rev" > return > @@ -773,6 +791,17 @@ cmd_split () { > > unrevs="$(find_existing_splits "$dir" "$revs")" > > + mainline="$(find_mainline_ref "$dir" "$revs")" > + if test -n "$mainline" > + then > + debug "Mainline $mainline predates subtree add" > + git rev-list --topo-order --skip=1 $mainline | > + while read rev > + do > + cache_set "$rev" "" Ah, so they are not really "null mappings", but mapped to an empty string. Makes sense. Maybe adjust the commit message? > + done || exit $? > + fi > + > # We can't restrict rev-list to only $dir here, because some of our > # parents have the $dir contents the root, and those won't match. > # (and rev-list --follow doesn't seem to solve this) > -- > gitgitgadget > >