Re: git-subtree split misbehaviour with a commit having empty ls-tree for the specified subdir

> On 23 Nov 2019, at 3:55 am, Ed Maste <emaste@xxxxxxxxxxx> wrote:
> 
> I encountered an issue while trying to use git subtree with the
> FreeBSD svn->git mirror: I found that when "git subtree split"
> encounters a commit with an empty "git ls-tree" for the subdirectory
> being split, it ends up recording the original parent as the new
> parent in the split history that's being created. This then leads to
> unrelated history appearing in the split subtree.
> 
> Below is a shell script that demonstrates the issue - this is not the
> precise case that I encountered in the FreeBSD repo, but the behaviour
> is identical (and it doesn't take nearly 10 minutes to run). Running
> the script and then "git log" of the commit printed by the final (git
> subtree) command includes the unrelated history in dir2/.
> 
> It looks like this comes from the cache_set "$rev" "$rev" in
> process_split_commit() added in 39f5fff0d53. This is under the
> suspicious-looking "ugly. is there no better way to tell if this is a
> subtree vs. a mainline commit? Does it matter" comment. However, I
> don't yet understand enough of git-subtree's operation to propose a
> fix.
> 
> --repro.sh--
> #!/bin/sh
> 
> rm -rf subrepo-issue
> mkdir -p subrepo-issue
> cd subrepo-issue
> 
> git init .
> mkdir -p dir1 dir2
> touch dir1/file1 dir2/file2
> git add dir1 dir2
> git commit -m 'initial commit'
> echo 'file2' > dir2/file2
> git commit -m 'file2 modified' dir2/file2
> git rm dir1/file1
> git commit -m 'remove file1'
> mkdir -p dir1
> touch dir1/file1
> git add dir1
> git commit -m 'restore file1'
> echo 'file1' > dir1/file1
> git commit -m 'file1 modified' dir1/file1
> git subtree split --prefix=dir1/
> 


The algorithm I am looking at to replace the file-based mainline detection is:

 - If the subtree root is unknown (as on the initial split), everything is mainline.

 - If the subtree root is reachable and the mainline root is not, it’s a subtree commit.

 - Otherwise, treat it as mainline. This will also pick up commits from other subtrees, but they hopefully won’t contain the subtree folder. I don’t think there is an unambiguous way to distinguish a subtree merge from a regular merge, since the message produced is pretty generic. It may be possible to check reachability of all known subtrees, but that adds a fair bit of complexity.
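
As a rough sketch of the three rules above (not the actual git-subtree code): the function name classify_commit and its arguments are hypothetical, and I am assuming "reachable" means the root is an ancestor of the commit being classified, which "git merge-base --is-ancestor" can test. How the two roots get discovered is left out.

```shell
#!/bin/sh

# classify_commit REV SUBTREE_ROOT MAINLINE_ROOT
# Hypothetical helper: prints "subtree" or "mainline" for REV.
# SUBTREE_ROOT may be empty, meaning no subtree root is known yet.
# The body runs in a subshell so the variables stay local.
classify_commit () (
	rev=$1
	subtree_root=$2
	mainline_root=$3

	if [ -z "$subtree_root" ]; then
		# Rule 1: no known subtree root (initial split), so mainline.
		echo mainline
	elif git merge-base --is-ancestor "$subtree_root" "$rev" &&
	     ! git merge-base --is-ancestor "$mainline_root" "$rev"; then
		# Rule 2: subtree root reachable, mainline root not reachable.
		echo subtree
	else
		# Rule 3: everything else, including merges that reach both roots.
		echo mainline
	fi
)
```

Called as classify_commit "$rev" "$subtree_root" "$mainline_root" from inside the repository. A commit from some other, unrelated subtree reaches neither root and so falls through to the last branch, which is the "also pick up commits from other subtrees" case.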

That leaves us with the question of how to record the empty mainline commits. The most correct result for your repro is probably four commits (add, delete everything, restore, modify), but I can see that breaking down in a scenario where deleting a subtree is more like unlinking a library than editing that library to do nothing.

Is it sufficiently correct for your scenario to treat ‘restore file1’ as the initial subtree commit?




