Re: [PATCH 2/3] Introduce rename factorization in diffcore.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 06, 2008 at 05:10:09PM -0800, Junio C Hamano wrote:
> Yann Dirson <ydirson@xxxxxxxxxx> writes:
> 
> > Rename factorization tries to group together files moving from and to
> > identical directories - the most common case being directory renames.
> > We do that by first identifying groups of bulk-moved files, and then
> > hiding those of the individual renames which carry no other
> > information (further name change, or content changes).
> > This feature is activated by the new --factorize-renames diffcore
> > flag.
> 
> I have a mixed feeling about this one, primarily because I cannot
> visualize how a useful output should look like.  Unless you rename one
> directory to another without any content changes, you would have to say
> "this directory changed to that, and among the paths underneath them, this
> file have this content change in addition".

Right.  This is something I have already thought about, without taking
the time to finalize.  So let's have a try.

The direct transposition of the raw output of a dir rename with a
change would yield something like:

|diff --git a/ppc/ b/foo
|similarity index 100%
|rename from ppc/
|rename to foo/
|diff --git a/ppc/sha1.h b/foo/sha1.h
|--- ppc/sha1.h
|+++ foo/sha1.h
|...

This would have the usefulness I'm looking for, in that the content
modifications would not be "hidden" among a whole lot af individual
file rename hunks.

But if we want to make this more generally useful (eg. allow other
tools to apply such patches), we need to be careful.  A typical patch
file can usually be seen as a sequence of changes which can be applied
independently (at least in order).  It is obvious that this is not
true for the above output.  So we could also consider making this
patch output sequential with something like:

|diff --git a/ppc/ b/foo
|similarity index 100%
|rename from ppc/
|rename to foo/
|diff --git a/foo/sha1.h b/foo/sha1.h
|--- foo/sha1.h
|+++ foo/sha1.h
|...

However, such output would possibly be confusing, eg. when looking at
it from gitk and looking for commits which modify ppc/sha1.h.

A solution to this could be to add an annotation such as:

|diff --git a/ppc/ b/foo
|similarity index 100%
|rename from ppc/
|rename to foo/
|diff --git a/foo/sha1.h b/foo/sha1.h
|--- foo/sha1.h (previously ppc/sha1.h)
|+++ foo/sha1.h
|...

That would make the "git diff" format diverge a bit more from the
standard, but since a full-fledged git patch already cannot be used by
standard patch tools, it should not be a big issue - we should just be
careful about choosing a suitable format if we go that way.

But as we already change the patch format, we could also simply say
that we don't care about the patch being "litterally splittable", add
a "this is a non-splittable whole" note at the beginning of the patch
output, and go with the litteral solution shown as first example.

How do you feel wrt this ?


> A related feature that would benefit from something like your change
> without any downside/complication of output format issues is to boost
> rename similarity score of a path when its neighbouring paths are moved to
> the same location.

Yes, that could be triggered by distinct switches.

Doing this would be related to detecting "directory splits" (ie. when
some files are really not moved, or moved into several dirs instead of
being all moved to the same target dir).  We can start by detecting
dir-splits, then shift the scoring threshold and do another pass on
split dirs.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux