Re: Rename handling

Jakub Narebski <jnareb@xxxxxxxxx> · Thu, 22 Mar 2007 03:01:39 +0100

Martin Langhoff wrote:
> On 3/22/07, Steven Grimm <koreth@xxxxxxxxxxxxx> wrote:
>>
>> Say you're tracking a directory full of video files. Even a slight tweak
>> to one of them (to put a logo in the corner, say, while moving it into
>> an "accessible by the public" directory) will result in a file that has
>> no content in common at all if you look at it as purely a stream of
> 
> In that case, tracking the rename is not useful at all from the POV of
> your SCM. The reason the SCM needs to understand content-movement (of
> which renames are a special type) is to help you as much as possible
> at merge time.
> 
> So - git as an SCM focusses on tracking your content, and helping you
> merge. It does _that_ probably better than any other SCM. So git
> internal data structures care strictly about the stuff that is needed
> for git's operation as an SCM.
> 
> And in the context of helping you merge, explicit rename tracking is a
> red herring. This point is arguable - Linus said earlier "you can do
> better by tracking content and ignoring explicit renames" and we are
> now getting there in terms of having code that does better.

An additional issue to consider with respect to rename support for
merges is that git uses a 3-way merge, which takes into account _only_
the upstream commit (of the branch we want to merge into), the side
branch commit (of the branch we want to merge), and their common
ancestor[*1*] (merge base). What is important is that the intermediate
states, i.e. how we got to the current state, do not matter.
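
To make the point concrete, here is a rough sketch of a tree-level
3-way merge. It is not git's actual code; three_way_merge() and the
path -> content-id dictionaries are made up for illustration. Note
that only the three snapshots are inputs, so a rename on one side
shows up as nothing more than a delete plus an add:

```python
def three_way_merge(base, ours, theirs):
    """Merge two path -> content-id maps against their common ancestor.

    Hypothetical helper: returns (merged_tree, conflicting_paths).
    A missing path is represented by None.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (also covers "both deleted")
            result = o
        elif o == b:      # only their side changed this path
            result = t
        elif t == b:      # only our side changed this path
            result = o
        else:             # both sides changed it, in different ways
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


base = {"a": "1", "b": "2"}
ours = {"a": "1", "b": "2x"}       # we modified b
theirs = {"a": "1", "c": "2"}      # they renamed b to c, unmodified
merged, conflicts = three_way_merge(base, ours, theirs)
```

Without any rename detection, "b" ends up as a modify/delete conflict
and "c" is simply added; nothing in the three inputs says that "c"
used to be "b".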

Well, one could argue that if we record explicit (user-provided)
information about renames, for example in the proposed 'note' field of
a commit object or in some other helper structure (we cannot store it
in a blob or tree), we could gather the recorded explicit renames
while finding the common ancestor...

Although I think it would be better and easier to just provide a
rerere2 cache for git-rerere to record corrections to rename detection,
and to use it in subsequent merges (this was proposed, but IIRC not
implemented)...

> Of course in your case the fact that there was a rename is important
> -- for users. This kind of information is not metadata for the SCM but
> for users. So that goes into the commit message, which is freeform. So
> - working with your scenario, if this happens often, I would suggest
> having a pre-commit hook that prepares a nice commit text message
> listing likely renames if they can be sussed out automatically.
> 
> Or having a custom git-mv that collects mv operations and then your
> pre-commit-hook preps your commit message with that manifest of moved
> files.
> 
> Does it make sense? It is data-for-the-user, so it goes in the commit
> msg. If it's data-for-the-SCM machinery, then it goes into the
> tracking data git handles internally.
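
A minimal sketch of the "manifest of moved files" part, in Python. It
assumes the hook feeds it the text printed by something like
`git diff --cached -M --name-status`, where detected renames appear as
tab-separated "R<score>", old path, new path. The function name
rename_manifest() and the output wording are made up; how the text
gets into the commit message is left to the hook:

```python
def rename_manifest(name_status):
    """Build commit-message lines from name-status output.

    Hypothetical helper: keeps only rename rows (status starting
    with "R") and ignores everything else.
    """
    lines = []
    for row in name_status.splitlines():
        fields = row.split("\t")
        if len(fields) != 3 or not fields[0].startswith("R"):
            continue  # not a rename row
        status, old, new = fields
        score = status[1:]
        # The similarity score is optional in this sketch.
        sim = f" ({int(score)}% similar)" if score.isdigit() else ""
        lines.append(f"renamed: {old} -> {new}{sim}")
    return lines


sample = "R100\ta.avi\tpublic/a.avi\nM\tREADME\nR087\tb.c\tc.c\n"
manifest = rename_manifest(sample)
```

For the video example from the start of the thread, detection would
likely find nothing, so such a manifest would have to come from the
custom git-mv instead.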

Still, it would be nice to have a --follow=<file> option for the
git-log family, in addition to path limiting. And this could make use
of explicitly recorded renames (much more easily than merge can).
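
Roughly what I have in mind, as a sketch only: walk the history from
newest to oldest and switch to the old name whenever a commit records
a rename to the path being followed. The follow() function and the
(commit id, changed paths, renames) tuples are hypothetical, and the
sketch assumes a linear history; merges would need more care:

```python
def follow(history, path):
    """Follow `path` backwards through recorded renames.

    history: newest-first list of (commit_id, changed_paths, renames),
    where renames maps old path -> new path for that commit.
    Returns (commit_id, path_at_that_commit) for each touching commit.
    """
    hits = []
    for commit_id, changed, renames in history:
        if path in changed:
            hits.append((commit_id, path))
        # If this commit renamed something to `path`, keep following
        # the file under its old name in older commits.
        for old, new in renames.items():
            if new == path:
                path = old
                break
    return hits


history = [
    ("c3", {"new.txt"}, {}),
    ("c2", {"new.txt"}, {"old.txt": "new.txt"}),
    ("c1", {"old.txt"}, {}),
]
log = follow(history, "new.txt")
```

With plain path limiting on "new.txt" the walk would stop being useful
at c2; with the recorded rename it continues into c1 under the old
name.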

References
==========
[*1*] Well, it can be a bit more complicated if there is more than one
common ancestor; git then uses the recursive merge strategy.

-- 
Jakub Narebski
Poland