On Fri, 20 Oct 2006, Shawn Pearce wrote:
>
> I renamed hundreds of small files in one shot and also did a few
> hundred adds and deletes of other small XML files. Git generated
> a lot of those unrelated adds/deletes as rename/modifies, as their
> content was very similar. Some people involved in the project
> freaked as the files actually had nothing in common with one
> another... except for a lot of XML elements (as they shared the
> same DTD).

Heh. We can probably tweak the heuristics (one of the _great_ things
about content detection is that you can fix it after the fact, unlike
the alternative).

That said, I've personally found the content-based similarity analysis
to often be quite informative, even when (and perhaps _especially_ when)
it ended up showing something that the actual author of the thing didn't
intend.

So yeah, I've seen a few strange cases myself, but they've actually been
interesting. Like seeing how much of a file was just a copyright license,
and then a file being considered a "copy" just because it didn't actually
introduce any real new code.

		Linus
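
To make the effect described above concrete, here is a minimal sketch of a content-based rename heuristic. It is not git's actual implementation: the function names `similarity` and `detect_renames`, the line-based scoring via Python's `difflib`, and the sample XML are all assumptions made for illustration. It only shows why two unrelated files that share a lot of boilerplate can score above a similarity threshold, and how raising the threshold changes the outcome.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 ratio of matching lines between two texts."""
    matcher = SequenceMatcher(None, a.splitlines(), b.splitlines(), autojunk=False)
    return matcher.ratio()


def detect_renames(deleted: dict, added: dict, threshold: float = 0.5) -> list:
    """Pair each deleted path with its best-scoring added path.

    A pair is reported only if its score is at or above `threshold`.
    Each added path is used at most once.
    """
    pairs = []
    unused = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in unused.items():
            score = similarity(old_text, new_text)
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            pairs.append((old_path, best_path, best_score))
            del unused[best_path]
    return pairs


# Hypothetical sample data: two unrelated files that share DTD boilerplate.
header = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE config SYSTEM "config.dtd">\n'
    "<config>\n"
    "  <meta>\n"
    "    <owner>team</owner>\n"
    "  </meta>\n"
)
footer = "</config>\n"
old_xml = header + '  <item name="alpha"/>\n' + footer
new_xml = header + '  <item name="beta"/>\n' + footer

loose = detect_renames({"a.xml": old_xml}, {"b.xml": new_xml})
strict = detect_renames({"a.xml": old_xml}, {"b.xml": new_xml}, threshold=0.9)
print(loose)   # the pair is reported as a rename at the 0.5 threshold
print(strict)  # no pair is reported at the 0.9 threshold
```

Seven of the eight lines in each sample file are shared, so the pair scores 0.875 and is treated as a rename at a 50% threshold but not at 90%. In git itself the comparable knob is the similarity threshold passed to `-M` (for example `git diff -M90%`), which can be tightened after the fact because nothing about renames is recorded in the history.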