Re: [PATCH] A new merge strategy 'subtree'.

Junio C Hamano <junkio@xxxxxxx> wrote:
> The detection of corresponding subtree is done by comparing the
> pathnames and types in the toplevel of the tree.
> 
> Heuristics galore!  That's the git way ;-).

I have some concerns about the match-tree heuristic you are using here.

For example, it is very common for Java projects to have the same
tree "shape".  Just look at egit/jgit for an example; its three
top-level directories are:

	org.spearce.egit.core/
		META-INF/
		build.properties
		plugin.xml
		src/

	org.spearce.egit.ui/
		META-INF/
		build.properties
		plugin.xml
		src/

	org.spearce.jgit/
		META-INF/
		src/

If I were to treat the first two as subprojects, this new subtree
merge strategy might fail here, as it could easily match the
wrong directory.
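
To make the worry concrete, here is a toy model in Python.  It is not
the match-tree code from the patch; the scoring rule (count top-level
entries with the same name and type) and the `incoming` tree are my
own assumptions, chosen only to show how two identically shaped
directories end up tied.

```python
# Toy model only: NOT the heuristic from the patch.  A candidate
# directory is scored by how many top-level (name, type) entries it
# shares with the incoming subproject tree.

def shape_score(candidate, incoming):
    """Count top-level entries with the same name and type in both trees."""
    return len(candidate & incoming)

egit_core = {("META-INF", "tree"), ("build.properties", "blob"),
             ("plugin.xml", "blob"), ("src", "tree")}
egit_ui = set(egit_core)              # same shape as egit.core
jgit = {("META-INF", "tree"), ("src", "tree")}

# Hypothetical: the top level of an incoming egit.ui subproject commit.
incoming = set(egit_ui)

candidates = {
    "org.spearce.egit.core": egit_core,
    "org.spearce.egit.ui": egit_ui,
    "org.spearce.jgit": jgit,
}
scores = {name: shape_score(entries, incoming)
          for name, entries in candidates.items()}
best = max(scores.values())
tied = sorted(name for name, score in scores.items() if score == best)

print(scores)
print(tied)   # more than one name here means the shape alone cannot decide
```

Under this toy rule both egit directories get the top score, so
nothing but luck picks the right one.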


What about a different approach?

In a merge of commit#1 (parent project) and commit#2 (subproject)...

We have the set of merge bases readily available.  We just have
to find out in each merge base where the files went from commit#2,
then modify commit#2 to conform to that same shape.

Really that isn't too different from a rename detection.  In other
words do something like the following:

  a) Scan the parents of the merge base B for a commit that is
  in commit#2's ancestry but not in commit#1's ancestry, except via
  the merge commit B.  Such a parent must be from the project that
  commit#2 is also from.  For the sake of explanation, let's call
  this parent B^2.

  b) Perform a partial rename-diff between B^2 and B.  The magic
  here is we need to discard any path in B that also appears in
  B^1 and B^2, and that has the same SHA-1 as in B^1, before we do
  the rename-diff.

  c) Find the most common prefix within the renamed files.

  d) Fit commit#2 to use that prefix, and merge.
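
Step (b) can be sketched in Python with trees modelled as plain
dicts of path to object id.  This is only a sketch under my own
assumptions: the function names, the toy trees and the made-up ids
are hypothetical, and the rename pairing only handles exact (R100)
matches, unlike the real diffcore rename detection.

```python
def discard_parent_paths(merge, parent1, parent2):
    """Step (b): drop every path of `merge` that exists in both parents
    and carries the same object id as in parent1."""
    return {
        path: oid
        for path, oid in merge.items()
        if not (path in parent1 and path in parent2 and parent1[path] == oid)
    }

def exact_renames(old, new):
    """Pair paths that vanished from `old` with identical content that
    appeared under a new path in `new` (exact matches only)."""
    appeared = {}
    for path, oid in new.items():
        if path not in old:
            appeared.setdefault(oid, []).append(path)
    pairs = []
    for path, oid in sorted(old.items()):
        if path in new:
            continue  # path still exists: a modification, not a rename
        for target in sorted(appeared.get(oid, [])):
            pairs.append((path, target))
    return pairs

# Toy trees with made-up object ids (hypothetical data).
b1 = {"Makefile": "p-make", "README": "p-readme"}      # B^1, parent project
b2 = {"Makefile": "s-make", "tool.sh": "s-tool"}       # B^2, subproject
b = {"Makefile": "p-make", "README": "p-readme",       # B, the merge
     "sub/Makefile": "s-make", "sub/tool.sh": "s-tool"}

stock = exact_renames(b2, b)
partial = exact_renames(b2, discard_parent_paths(b, b1, b2))

print(stock)     # Makefile is hidden behind the parent's copy
print(partial)   # Makefile now shows up as moved into sub/
```

The point of the filter is only that paths the parent project also
owns stop masking the subproject's files as rename sources.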


Here's a real example.  In 67c75759 you merged git-gui.git.
67c75759^1 is from git.git, 67c75759^2 is from git-gui.git.

The stock rename-diff:

  $ git diff-tree --abbrev -r -M --diff-filter=MRD 67c75759^2 67c75759
  :100644 100644 c714d38... d99372a... M  .gitignore
  :100755 100755 8fac8cb... 7a10b60... M  GIT-VERSION-GEN
  :100644 100644 fd82d9d... 5d31e6d... M  Makefile
  :100644 100644 b95a137... b95a137... R100       TODO    git-gui/TODO
  :100755 100755 f5010dd... f5010dd... R100       git-gui.sh      git-gui/git-gui.sh

The problem here is that both ^1 and ^2 define the first three paths,
so we think we modified them in the merge rather than moved them.
But these three files match ^1, as we did not do an evil merge here.
That's why they are showing as modified in this diff.

Now take 67c75759 and whack those three files (step b above), and rediff:

  $ C=$(git ls-tree 67c75759 | sed '
          /       .gitignore$/d
          /       GIT-VERSION-GEN$/d
          /       Makefile$/d' | git mktree)
  $ git diff-tree --abbrev -r -M --diff-filter=MRD 67c75759^2 $C
  :100644 100644 c714d38... c714d38... R100       .gitignore      git-gui/.gitignore
  :100755 100755 8fac8cb... 8fac8cb... R100       GIT-VERSION-GEN git-gui/GIT-VERSION-GEN
  :100644 100644 fd82d9d... fd82d9d... R100       Makefile        git-gui/Makefile
  :100644 100644 b95a137... b95a137... R100       TODO    git-gui/TODO
  :100755 100755 f5010dd... f5010dd... R100       git-gui.sh      git-gui/git-gui.sh

Wow, look at that, everything starts with 'git-gui/'!  ;-)

Then we just need to pick the most popular common prefix of all
renamed paths and fit commit#2 to conform to that structure.
Finally we can run the merge through.
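
A rough Python sketch of the voting and fitting, fed with the rename
pairs from the diff-tree output above.  The function names and the
small `commit2` tree are hypothetical, and treating "most popular"
as a simple majority vote is my assumption; this is not code from
merge-recursive.

```python
from collections import Counter

def added_prefix(old, new):
    """Return the directory prefix that turns `old` into `new`, else None."""
    if new.endswith("/" + old):
        return new[: len(new) - len(old)]
    return None

def most_popular_prefix(renames):
    """Step (c): each rename votes for the prefix it gained."""
    votes = Counter()
    for old, new in renames:
        prefix = added_prefix(old, new)
        if prefix is not None:
            votes[prefix] += 1
    return votes.most_common(1)[0][0] if votes else None

def fit_tree(tree, prefix):
    """Step (d): move every path of the subproject tree under `prefix`."""
    return {prefix + path: oid for path, oid in tree.items()}

# Rename pairs as reported by the filtered diff-tree run.
renames = [
    (".gitignore", "git-gui/.gitignore"),
    ("GIT-VERSION-GEN", "git-gui/GIT-VERSION-GEN"),
    ("Makefile", "git-gui/Makefile"),
    ("TODO", "git-gui/TODO"),
    ("git-gui.sh", "git-gui/git-gui.sh"),
]
prefix = most_popular_prefix(renames)

# Hypothetical commit#2 tree (only two of its paths, for brevity).
commit2 = {"TODO": "b95a137", "git-gui.sh": "f5010dd"}
fitted = fit_tree(commit2, prefix)

print(prefix)
print(sorted(fitted))
```

A rename that does not simply gain a leading directory casts no vote,
so an unrelated rename inside the subproject cannot skew the result.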

The (now functional) pretend object stuff can be useful here,
such as to make $C above so we can pass it off to diffcore.


I think popping off the 'git-gui/' prefix would be the same deal,
only we'd be looking at the old names to determine the prefix to pop,
rather than the new names.
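
The reverse direction, sketched the same way in Python.  Again the
names are hypothetical, and dropping every path outside the prefix
is my assumption about what "popping" should do with the rest of
the parent project's tree.

```python
from collections import Counter

def popped_prefix(old, new):
    """Return the directory prefix that `old` lost to become `new`, else None."""
    if old.endswith("/" + new):
        return old[: len(old) - len(new)]
    return None

def prefix_to_pop(renames):
    """Vote on the prefix using the old names of the rename pairs."""
    votes = Counter()
    for old, new in renames:
        prefix = popped_prefix(old, new)
        if prefix is not None:
            votes[prefix] += 1
    return votes.most_common(1)[0][0] if votes else None

def strip_prefix(tree, prefix):
    """Keep only paths under `prefix` and remove it from their names."""
    return {path[len(prefix):]: oid
            for path, oid in tree.items()
            if path.startswith(prefix)}

# The same rename pairs as before, with old and new names swapped.
reverse_renames = [
    ("git-gui/TODO", "TODO"),
    ("git-gui/git-gui.sh", "git-gui.sh"),
]
pop = prefix_to_pop(reverse_renames)

# Hypothetical slice of the parent project's tree.
parent_tree = {"Makefile": "5d31e6d",
               "git-gui/TODO": "b95a137",
               "git-gui/git-gui.sh": "f5010dd"}
popped = strip_prefix(parent_tree, pop)

print(pop)
print(sorted(popped))
```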

We already do rename detection in merge-recursive.  Slapping an extra
rename pass in front of things when it is invoked as merge-subtree
can't hurt performance that much.

Thoughts?

-- 
Shawn.