Re: Regression in git-subtree.sh, introduced in 2.20.1, after 315a84f9aa0e2e629b0680068646b0032518ebed

"Strain, Roger L." <roger.strain@xxxxxxxx> · Mon, 9 Dec 2019 16:19:09 +0000




So it's been quite a while since I made this specific change, but I'll
attach the relevant portion of the diff below. I may be completely
misremembering portions, and apologize in advance. This was based on an
earlier version of the script, and I can see some other changes have
been made since I forked, but perhaps this will still explain what I
tried to do to work around our problem.

Within process_split_commit, there's logic that tries to distinguish
between commits which are mainline and commits which are subtree.
There's even a comment in the relevant section asking "Is there no
better way? Does it matter?" Well, the answer was yes, it mattered,
because we were picking up mainline commits that were there before the
initial add of a subtree, and those were getting sucked in as if they
were subtree commits, and then all the remaining hashes were off.

What this change was meant to do was to check for the existence of a
single, known file. We keep a file called "subtrees.csv" in the root of
our mainline repo, and it defines the various subtrees that comprise
the mainline. Therefore, if that file exists, I can say with certainty
that it is a mainline commit. So when that dodgy check comes up, it
checks for the file first, then falls back to the old behavior.
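
For anyone who wants to try the idea outside of git-subtree.sh first, the
marker-file check can be sketched roughly as follows. The function names
first_blob_id and is_mainline_commit are hypothetical and do not exist in
git-subtree.sh; only git ls-tree is a real command, and the marker
filename is whatever your repo uses.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of git-subtree.sh.

# Read `git ls-tree` output on stdin and print the object ID of the
# first entry whose type is "blob". Prints nothing if there is none
# (for example, when the path is a submodule or does not exist).
first_blob_id () {
	while read -r mode type id name
	do
		if test "$type" = "blob"
		then
			echo "$id"
			break
		fi
	done
}

# Succeed if commit $1 contains a regular file at path $2.
# Assumption: the marker file only ever exists in mainline commits.
is_mainline_commit () {
	test -n "$(git ls-tree "$1" -- "$2" | first_blob_id)"
}
```

Something like `is_mainline_commit HEAD subtrees.csv && echo mainline`
would then report whether a given commit carries the marker file.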

Partial diff follows, feel free to try it out if it sounds like a
similar problem that you're facing. Change the specific filename for
your needs, obviously.

To be clear, this is NOT something I'm submitting for inclusion in the
general release; it's very repo-specific, and I just hope it might help
a fellow soul.

@@ -506,6 +499,20 @@ subtree_for_commit () {
        done
 }
 
+subtree_for_csv () {
+       commit="$1"
+       dir="$2"
+       git ls-tree "$commit" -- "$dir" |
+       while read mode type tree name
+       do
+               assert test "$name" = "$dir"
+               assert test "$type" = "blob" -o "$type" = "commit"
+               test "$type" = "commit" && continue  # ignore submodules
+               echo $tree
+               break
+       done
+}
+
 tree_changed () {
        tree=$1
        shift
@@ -667,9 +674,17 @@ process_split_commit () {
        if test -z "$tree"
        then
                set_notree "$rev"
-               if test -n "$newparents"
+               subtreescsv=$(subtree_for_csv "$rev" "subtrees.csv")
+               debug "${indentprefix}  subtrees.csv tree is: $subtreescsv"
+
+               # ugly.  is there no better way to tell if this is a subtree
+               # vs. a mainline commit?  Does it matter?
+               if test -z "$subtreescsv"
                then
-                       cache_set "$rev" "$rev"
+                       if test -n "$newparents"
+                       then
+                               cache_set "$rev" "$rev"
+                       fi
                fi
                return
        fi


-- 
Roger Strain

-----Original Message-----
From: Ed Maste <emaste@xxxxxxxxxxx>
To: "Strain, Roger L." <roger.strain@xxxxxxxx>
Cc: git@xxxxxxxxxxxxxxx <git@xxxxxxxxxxxxxxx>, marc@xxxxxxx <marc@xxxxxxx>
Subject: Re: Regression in git-subtree.sh, introduced in 2.20.1, after 315a84f9aa0e2e629b0680068646b0032518ebed
Date: Mon, 09 Dec 2019 06:45:57 -0500

On Mon, 9 Dec 2019 at 09:29, Strain, Roger L. <roger.strain@xxxxxxxx>
wrote:

> I've had to further
> customize the script for our internal use, and those changes aren't
> something that would be useful for the public at large.

Would you describe the sort of problem you have to work around with
custom changes?

I'm starting on a path of trying to fix git-subtree for failures[1]
encountered in a prototype conversion of the FreeBSD repository from
svn to git. The misbehaviour I encounter occurs when split encounters
a commit for which the path being split is empty in 'git ls-tree', and
the commit is actually not a subtree commit. I'm currently
experimenting with hacks to skip specific hashes during the initial
subtree split. On reading your mail I realize I could address my issue
by testing for the existence of a specific file though, which makes me
wonder if the issue you have is similar.
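
The "skip specific hashes" hack mentioned above could look roughly like
the sketch below. This is only an illustration under assumptions: the
SKIP_REVS variable and the is_skipped_rev function are hypothetical names
and are not taken from git-subtree.sh or from any actual patch.

```shell
#!/bin/sh
# Hypothetical: SKIP_REVS holds a whitespace-separated list of full
# commit hashes that the split should not treat as subtree commits.
SKIP_REVS=""

# Succeed if commit hash $1 appears in SKIP_REVS.
is_skipped_rev () {
	for skip in $SKIP_REVS
	do
		test "$skip" = "$1" && return 0
	done
	return 1
}
```

A caller would test `is_skipped_rev "$rev"` before deciding how to handle
a commit whose split path is empty in 'git ls-tree'. A file-existence
test avoids having to maintain such a list by hand.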

[1] https://lore.kernel.org/git/CAPyFy2AsmaxU-BDf_teZJE5hiaVpTSZc8fftnuXPb_4-j7j5Fw@xxxxxxxxxxxxxx/