Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> writes:

> Another thing to keep in mind is that the collision could be because
> of case folding (or other such nonsense) on a directory in the path.
> I mean, if someone on Linux builds a commit containing:
>
>     a/b/c/D/e/foo.txt
>     a/b/c/d/e/foo.txt
>
> we'll get a similar collision as if one of them were spelled "FOO.txt".

I'd think the approach of teaching the checkout_entry() codepath to
notice that it needed to unlink an existing file in order to check
out the entry it wanted would cover this equally well.

> Also, do we need to worry about hard-links or symlinks here?

I do not think so.  You do not get a file with multiple hardlinks in
a "git clone" or "git checkout" result, and we do not check things
out beyond a symbolic link in the first place.

> If checkout populates symlinks, then you might have another collision
> opportunity.  For example:
>
>     a/b/c/D/e/foo.txt
>     a/link -> ./b/c/d
>     a/link/e/foo.txt

In other words, a tree with a/link (symlink) and a/link/<anything>,
which requires a/link to be a symlink and a directory at the same
time, cannot be created, so you won't get one with "git clone".

> Also, some platforms (like the Mac) allow directory hard-links.
> Granted, Git doesn't create hard-links during checkout, but the
> user might.

And we'd report "we are doing a fresh checkout immediately after a
clone and saw some file we haven't created, which may indicate a
case smashing filesystem glitch (or a competing third-party process
creating random files)", so noticing that would be a good thing, I
would think.

> I'm sure there are other edge cases here that make reporting
> difficult; these are just a few I thought of.  I guess what I'm
> trying to say is that as a first step just report that you found
> a collision -- without trying to identify the set existing objects
> that it collided with.

Yup, I think that is sensible.  If it can be done cheaply, i.e. on a
filesystem with trustable and cheap inum, then after noticing such a
collision we could go back and lstat() all paths in the index we
have checked out so far to see which ones are colliding.  That would
add a useful clue to the report, but noticing the collision in the
first place obviously has more value.
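
The "go back and lstat()" step could be sketched roughly as follows. This is an illustration in Python, not Git's C code: find_colliding_paths() and its parameters are hypothetical names, and the sketch assumes the filesystem reports trustworthy st_dev/st_ino values.

```python
import os


def find_colliding_paths(worktree, checked_out, new_path):
    """Return the already-checked-out paths that resolve to the same
    on-disk file (same device and inode) as new_path.

    Hypothetical helper for illustration only; it assumes that
    st_dev/st_ino can be trusted on this filesystem.
    """
    try:
        st = os.lstat(os.path.join(worktree, new_path))
    except (FileNotFoundError, NotADirectoryError):
        # Nothing exists at new_path, so there is nothing to collide with.
        return []

    target = (st.st_dev, st.st_ino)
    collisions = []
    for path in checked_out:
        try:
            other = os.lstat(os.path.join(worktree, path))
        except (FileNotFoundError, NotADirectoryError):
            # The earlier path has vanished; skip it.
            continue
        if (other.st_dev, other.st_ino) == target:
            collisions.append(path)
    return collisions
```

On a case-folding filesystem, lstat() on "a/b/c/D/e/foo.txt" and on "a/b/c/d/e/foo.txt" would be expected to report the same device and inode pair, so a comparison like this covers folding in a leading directory as well as in the final path component. The cost is one lstat() per path checked out so far, which is why it only makes sense after a collision has already been noticed.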