On Thu, 21 Jun 2007, Jeff King wrote:
> On Thu, Jun 21, 2007 at 12:52:11PM +0100, Johannes Schindelin wrote:
>
> > When there are several candidates for a rename source, and one of them
> > has an identical basename to the rename target, take that one.
>
> That's a reasonable heuristic, but it unfortunately won't match simple
> things like:
>
>   i386_widget.c -> arch/i386/widget.c

We've also had things like

	arch/i386/kernel/pci-pc.c -> arch/i386/kernel/pci/common.c

so it's not always the ending of a file that is unchanged, but you still
often have some "similarity" of the name (ie the "pci" substring is still
common there).

So I agree that we can be even better about the heuristics. I don't know
how much it *matters* in practice.

I do agree with the people who argue that you simply shouldn't depend on
these kinds of things, and if you have identical files, and move them
around, you really are getting behaviour that doesn't matter. The files
are *identical* for christ sake! Following their history, it doesn't
matter *which* base you follow, since regardless, they've come to the same
point!

So in that sense, the current git behaviour is actually perfectly fine.

At the same time, I'll argue from a totally theoretical point that the
"filename" is obviously part of the data in the tree, and as such, a
similarity comparison that takes only the data into account is a bit
limited.

So while I don't think a user should really care, I also think that
keeping the filename as part of the similarity analysis is actually a
perfectly logical and valid thing to do within the git policy of "content
is king". The filename *is* part of the content, and it's doubly so when
you think about a rename or copy operation, where the whole point of the
exercise is as much about the filename as about the data inside the file.
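To make "filename as part of the similarity analysis" concrete, here is a
rough sketch, not git's actual code. It assumes each rename candidate
already has a content score in [0, 1], and uses the name only to break ties
between candidates whose content scores are equal. The function names and
the use of Python's difflib for string similarity are assumptions made for
illustration only.

```python
import difflib
import posixpath


def name_similarity(src, dst):
    """Score in [0, 1] for how alike two paths look (illustrative only)."""
    if posixpath.basename(src) == posixpath.basename(dst):
        # Identical basename: the heuristic proposed at the top of the thread.
        return 1.0
    # Otherwise fall back to generic string similarity, so that shared
    # substrings such as "pci" or "widget" still count for something.
    return difflib.SequenceMatcher(None, src, dst).ratio()


def pick_rename_source(candidates, dst):
    """Pick a rename source for dst.

    candidates is a list of (path, content_score) pairs. The content score
    always wins; name similarity only decides between equal content scores.
    """
    best = max(candidates, key=lambda c: (c[1], name_similarity(c[0], dst)))
    return best[0]


if __name__ == "__main__":
    # Two identical files (content score 1.0); the name decides.
    identical = [
        ("drivers/net/foo.c", 1.0),
        ("arch/i386/kernel/pci-pc.c", 1.0),
    ]
    print(pick_rename_source(identical, "arch/i386/kernel/pci/common.c"))
```

Because the name is only the second element of the sort key, it can never
override a better content match. That fits the point above: for identical
files it doesn't really matter which source you pick, the name just makes
the choice less arbitrary.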

		Linus