Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> writes:

> Another thing to keep in mind is that the collision could be because
> of case folding (or other such nonsense) on a directory in the path.
> I mean, if someone on Linux builds a commit containing:
>
>     a/b/c/D/e/foo.txt
>     a/b/c/d/e/foo.txt
>
> we'll get a similar collision as if one of them were spelled "FOO.txt".

I'd think the approach to teach checkout_entry() codepath to notice
it needed to unlink the existing file in order to check out the
entry it wanted to check out would cover this equally well.

> Also, do we need to worry about hard-links or symlinks here?

I do not think so.  You do not get a file with multiple hardlinks
in a "git clone" or "git checkout" result, and we do not check
things out beyond a symbolic link in the first place.

> If checkout populates symlinks, then you might have another collision
> opportunity.  For example:
>
>     a/b/c/D/e/foo.txt
>     a/link -> ./b/c/d
>     a/link/e/foo.txt

In other words, a tree with a/link (symlink) and a/link/<anything>
that requires a/link to be a symlink and a directory at the same
time cannot be created, so you won't get one with "git clone"

> Also, some platforms (like the Mac) allow directory hard-links.
> Granted, Git doesn't create hard-links during checkout, but the
> user might.

And we'd report "we are doing a fresh checkout immediately after a
clone and saw some file we haven't created, which may indicate a
case smashing filesystem glitch (or a competing third-party process
creating random files)", so noticing that would be a good thing, I
would think.

> I'm sure there are other edge cases here that make reporting
> difficult; these are just a few I thought of.  I guess what I'm
> trying to say is that as a first step just report that you found
> a collision -- without trying to identify the set existing objects
> that it collided with.

Yup, I think that is sensible.  If it can be done cheaply, i.e. on a
filesystem with trustable and cheap inum, after noticing such a
collision, go back and lstat() all paths in the index we have
checked out so far to see which ones are colliding, it adds useful
clue to the report, but noticing the collision in the first place
obviously has more value.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux