Re: git-mv redux: there must be something else going on

In article <ron1-34F9C6.12273203022010@xxxxxxxxxxxxxx>,
 Ron Garret <ron1@xxxxxxxxxxx> wrote:

> In article <alpine.LFD.2.00.1002031436490.1681@xxxxxxxxxxx>,
>  Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> 
> > On Wed, 3 Feb 2010, Ron Garret wrote:
> > 
> > > So... how *does* git decide when two blobs are different blobs and when 
> > > they are the same blob with mods?  I asked this question before and was 
> > > pointed to the diffcore docs, but that didn't really clear things up.  
> > > That just describes all the different ways git can do diffs, not the 
> > > actual heuristics that git uses to track content.
> > 
> > Yes, those same heuristics are used to make the decision.
> > 
> > |The second transformation in the chain is diffcore-break, and is
> > |controlled by the -B option to the 'git diff-*' commands.  
> > |This is used to detect a filepair that represents "complete rewrite" 
> > |and break such filepair into two filepairs that represent delete and
> > |create.
> > |[...]
> > 
> > |This transformation is used to detect renames and copies, and is
> > |controlled by the -M option (to detect renames) and the -C option
> > |(to detect copies as well) to the 'git diff-*' commands.  
> > |[...]
> > 
> > Note that you may use the -B, -C, -M and --find-copies-harder arguments 
> > with log as well as diff commands even if there is no actual diff 
> > output.  So the explanation is really in that document, even though simple 
> > rename detection involves only a fraction of what is said there.
> > 
> > And Git can detect copied files too.
> > 
> > Those semantics are not stored in the repository so they can be improved 
> > or even changed after the fact.
> 
> OK, on closer reading I see that the information is there, but it's well 
> hidden :-)  (For example, the -M option takes an optional numerical 
> argument so you can tweak how much similarity is needed to be considered 
> a move.  But the docs for git log don't mention this.  It's buried deep 
> in the git diffcore docs.  But yes, it's there.)
> 
> So I think I'm beginning to understand how this works, but that leads me 
> to another question: it seems to me that there are potential screw cases 
> for this purely content-based system of tracking files.  For example, 
> suppose I have a directory full of sample config files, all of which are 
> similar to each other.  Will that cause diffcore to get confused?
> 
> Feel free to treat that as a rhetorical question because obviously I can 
> (and probably should) get the answer by trying it.

Actually, I think the answer is in Avery's post in another branch of 
this thread.
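
For anyone who would rather just try it, here is a minimal sketch of the 
experiment described above.  The file names (a.conf, b.conf, c.conf, 
x.conf, y.conf), their contents and the throwaway identity are all made 
up for illustration.  It only covers the easy case, where the similar 
files are renamed without being edited; renames combined with edits are 
scored by similarity and may well pair up differently.

```shell
#!/bin/sh
# Sketch: rename two near-identical files in one commit and see how
# rename detection pairs them up. All names here are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"   # throwaway identity
git config user.name "Demo"

# Three sample config files that differ only in their last line.
for name in a b c; do
    {
        i=1
        while [ "$i" -le 40 ]; do
            echo "option$i = default"
            i=$((i + 1))
        done
        echo "name = $name"
    } > "$name.conf"
done
git add .
git commit -q -m "add sample configs"

# Rename two of the look-alike files, unmodified, in a single commit.
git mv a.conf x.conf
git mv b.conf y.conf
git commit -q -m "rename two configs"

# -M turns on rename detection; nothing about the rename was stored
# in the commit itself, it is worked out when the diff is computed.
result=$(git diff -M --name-status HEAD~1 HEAD)
echo "$result"

# The optional number after -M is the similarity threshold.
strict=$(git diff -M90% --name-status HEAD~1 HEAD)
echo "$strict"
```

Since both files are renamed without changes, each one should be 
reported as an exact rename (status R100) paired with its own new name, 
even though a.conf and b.conf are nearly identical to each other.  The 
same -M option can be given to git log.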

rg
