Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle

Shawn Pearce <spearce@xxxxxxxxxxx> · Fri, 20 Oct 2006 16:53:05 -0400

Linus Torvalds <torvalds@xxxxxxxx> wrote:
> On Fri, 20 Oct 2006, Shawn Pearce wrote:
> > 
> > I renamed hundreds of small files in one shot and also did a few
> > hundered adds and deletes of other small XML files.  Git generated
> > a lot of those unrelated adds/deletes as rename/modifies, as their
> > content was very similiar.  Some people involved in the project
> > freaked as the files actually had nothing in common with one
> > another... except for a lot of XML elements (as they shared the
> > same DTD).
> 
> Heh. We can probably tweak the heuristics (one of the _great_ things about 
> content detection is that you can fix it after the fact, unlike the 
> alternative).
> 
> That said, I've personally actually found the content-based similarity 
> analysis to often be quite informative, even when (and perhaps 
> _especially_ when) it ended up showing something that the actual author of 
> the thing didn't intend.
> 
> So yeah, I've seen a few strange cases myself, but they've actually been 
> interesting. Like seeing how much of a file was just a copyright license, 
> and then a file being considered a "copy" just because it didn't actually 
> introduce any real new code.

Aside from that one strange case I just mentioned I've always seen
the strategy to work very well.  Its never done something I didn't
expect and I've never seen copies or that I didn't expect to see,
knowing what the author of the change did.

So even though I had a little bit of trouble with that rename
situation above I'm _very_ happy with the way Git handles renames.

And the truth is that case above really was quite correct: XML is
very verbose.  When 70% of the file is just required XML to frame
the other 30% of the file's payload its not surprising that files
are considered to be similar when they only differ by a little bit
of payload.

-- 
Shawn.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html