Bug description: Unless you use --ignore-joins, "git subtree split"'s optimization to avoid re-scanning all of history can trim too much. Any new merged branches that have parents before the previous "split" will not be re-attached properly in the split-off subtree. In the extreme case (if all the changes occurred on such branches), I've seen the old subtree head might not be an ancestor of the new head at all. This "fix" is only for illustration. It probably should not be included as-is; it probably scales worse than using --ignore-joins for anything much beyond trivial cases. I'm not sure it is possible to speed it up much (while remaining correct) without switching to a better language than shell script. I'm not sure how to explain the constraint clearly (or even precisely define the "minimal" constraint). One attempt: Exclude from "unrevs" any rejoin commit X for which ANY new commit (not just revs itself) has some other common ancestor (merge base) besides X. ("New commit" is one that is not an ancestor of some previous rejoin commit.) It would probably be fairly straightforward to adapt graph traversal algorithms to do this efficiently, but as near as I can tell, none of the existing options in git-rev-list or git-merge-base really does anything useful for optimizing this in shell script... The simplest traversal technique to fix this might be if there was a way to make history traversal only stop at SPECIFIC unrevs, instead of any ancestor of any unrevs. Other workarounds besides this patch: * Use --ignore-joins and wait for the really slow processing... * Plan ahead and delay: After using git subtree split --rejoin, leave the rejoin merge commit off on a side branch until such time that all other pre-split branches have been merged. Then "merge the merge" at that later time. Only do rejoins fairly rarely, based on the schedule of merging other branches. * Ignore the problem: Allow the occasional falsely-disconnected branch roots to be generated by split. Of course, this means --ignore-joins would no longer generate a consistent subtree history, it is hard to examine what actually changed in the false roots, etc. * Perhaps manually use temporary grafts or similar to hide rejoins that would throw it off when doing later splits. [Intentionally not signed off: Poor performance.] --- Testing: Below I include a script I've been using to help with mostly-manual testing. I regularly modify it based on what I'm testing at the moment. It basically creates and manipulates a subtree out of git's own "contrib" directory. It may be convenient to use this on a copy of git's repository instead of the copy in which you are messing with git-subtree (avoiding changing code out from under the script). It may also be convenient to symlink git-subtree to the top-level directory, rather than copy it, so you don't need to keep re-copying it. ------CUT------ #!/bin/sh die() { echo "$@" 1>&2 exit 1 } GIT=../git-sandbox/bin-wrappers/git #GIT=git DBG= #DBG=-d EDIT= #EDIT=--edit git branch -D br-contrib git checkout 73bbc0796b4ce65bfb1a12b47a0edc27845ecf50 || die "checkout" "$GIT" subtree split -P contrib \ --branch br-contrib --squash --rejoin $EDIT || die "subtree 1" git tag -f step1 git merge -m 'MY MERGE1' 68a65f5fe54c2b21bfe16ef3a0b48956ecf5658a || die "merge 1" "$GIT" subtree split -P contrib \ --branch br-contrib --squash --rejoin || die "subtree 2" git tag -f step2 git merge -m 'MY MERGE2' 15f7221686eac053902b906c278680b485c865ce || \ die "merge 2" "$GIT" subtree split -P contrib \ --branch br-contrib --squash --rejoin $EDIT || die "subtree 3" ##### if [ -n "$EDIT" ]; then exit 0 fi git tag -f step3 git merge -m 'MY MERGE3' 26145c9c73ed51bbd8261949d31899d6507519d5 || \ die "merge 3" "$GIT" subtree split -P contrib \ $DBG --branch br-contrib --squash --rejoin || die "subtree 4" git tag -f step4 git merge -m 'MY MERGE4' 583736c0bcf09adaa5621b142d8e43c22354041b || \ die "merge 4" "$GIT" subtree split -P contrib \ --branch br-contrib --squash --rejoin || die "subtree 5" git tag -f step5 ------CUT------ contrib/subtree/git-subtree.sh | 61 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 55 insertions(+), 6 deletions(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index ac82b4d..6ff4362 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -36,6 +36,8 @@ PATH=$PATH:$(git --exec-path) require_work_tree +nl=' +' quiet= branch= debug= @@ -224,6 +226,13 @@ try_remove_previous() fi } +try_remove() +{ + if rev_exists "$1"; then + echo "$1" + fi +} + find_latest_squash() { debug "Looking for latest squash ($dir)..." @@ -293,8 +302,8 @@ find_existing_splits() debug " Prior: $main -> $sub" cache_set $main $sub cache_set $sub $sub - try_remove_previous "$main" - try_remove_previous "$sub" + try_remove "$main" + try_remove "$sub" fi main= sub= @@ -303,6 +312,40 @@ find_existing_splits() done } +filter_out_needed_existing() +{ + revs="$1" + existing="$2" + base= + git rev-list --merges --parents $revs --not $existing | \ + sed -e 's/^[^ ]* //' -e "y/ /\\$nl/" | sort | uniq | \ + { + debug "filter_out_needed_existing" + debug " revs: $revs" + debug " existing: $existing" + while read needed; do + debug "loop: $needed" + nextexisting= + for exist in $existing ; do + tmp="$(git merge-base $exist $needed)" + if [ x"$tmp" = x"$exist" -o -z "$tmp" ]; then + nextexisting="$nextexisting $exist" + fi + done + existing=$nextexisting + done + echo "$existing" + } +} + +existing_to_unrevs() +{ + existing="$1" + for exist in $existing ; do + try_remove_previous $exist + done +} + copy_commit() { # We're going to set some environment vars here, so @@ -599,10 +642,16 @@ cmd_split() done fi - if [ -n "$ignore_joins" ]; then - unrevs= - else - unrevs="$(find_existing_splits "$dir" "$revs")" + unrevs= + if [ -z "$ignore_joins" ]; then + existing="$(find_existing_splits "$dir" "$revs")" + if [ -n "$existing" ]; then + debug "existing: $existing" + existing="$(filter_out_needed_existing "$revs" "$existing")" + unrevs="$(existing_to_unrevs "$existing")" + debug "base: $base" + debug "unrevs: $unrevs" + fi fi # We can't restrict rev-list to only $dir here, because some of our -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html