Re: [PATCH v2] Allow git mv FileA fILEa on case ignore file systems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Torsten BÃgershausen <tboegi@xxxxxx> writes:

> The typical use case is when a file "FileA" should be renamed into fILEa
> and we are on a case insenstive file system (system core.ignorecase = true).
> Source and destination are the same file, it can be accessed under both names.
> This makes git think that the destination file exists.
> Unless used with --forced, git will refuse the "git mv FileA fILEa".
> This change will allow "git mv FileA fILEa" under the following condition:
> On Linux/Unix/Mac OS X the move is allowed when the inode of the source and
> destination are equal (and they are on the same device).
> This allows renames of MÃRCHEN into MÃrchen on Mac OS X.
> (As a side effect, a file can be renamed to a name which is already
> hard-linked to the same inode).
> On Windows, the function win_is_same_file() from compat/win32/same-file.c
> is used.
> It calls GetFileInformationByHandle() to check if both files are
> "the same".

Yeek; is it just me or the above single block of text too dense to be
readable? Can you use paragraph breaks?

> The typical use case is when a file "FileA" should be renamed into fILEa
> and we are on a case insenstive file system (system core.ignorecase = true).

Huh? I don't think renaming "FileA" to "fILEa" is typical at all. It is
very rarely done.

> (As a side effect, a file can be renamed to a name which is already
> hard-linked to the same inode).

It is unclear "a side effect" means "an added bonus" or "a regression" in
this sentence. I think this is latter.

Allowing filesystem specific logic to detect that two different "names"
actually refer to a single "file" and help renaming succeed is a sane
approach, but I think this particular implementation is flawed.

The important thing to notice is that "names" and "file" above refer to
the entities from the end user's point of view. Two files hardlinked
together on a filesystem with sane pathname handling are never the same
"file". I would probably have called it equivalent_filenames() to stress
the fact that two _different_ names alias to the same file. is_same_file()
sounds more like you got two different filenames from the filesystem
(i.e. readdir() gave you both at the same time) and you are trying to see
if they are the same file, but that is not the case here.

I also find it a bad taste to make this feature depend on win32; doesn't a
Linux box mounting a vfat filesystem have the same issue that we should be
able to solve with the same code?  Can't we instead have a configuration
variable that tells git that the working tree is on a filesystem with
broken pathname semantics, and what kind of workaround is needed?  Isn't
core.ignorecase already that configuration variable for case insensitve
filesystems [*1*]?

I would imagine that the implementation of equivalent_filenames(a,b) may
be !strcmp(a,b) for a sane filesystem [*2*] and !strcasecmp(a,b) for a
case insensitive filesystem.  For a totally wacky filesystem, your
lstat(2) based one might end up to be the best we could do [*3*].

When two different _names_ "A" and "a" refer to a single file, the only
thing that should happen for "git mv A a" is for the cache entry for "A"
to be moved to cache entry for "a", and no rename("A", "a") should be run,
but I don't see any such change in the code. It may happen to work (or be
a no-op) on Windows, but because builtin/mv.c is supposed to be generic
(and that is the reason you introduced the is_same_file() abstraction in
the first place), I'd still see this as a breakage.


[Footnote]

*1* Off the top of my head, perhaps core.ignorecase may have to grow into
boolean plus extra to cover "this is not just case insensitive, but isn't
even case preserving" kind of broken filesystems like HFS+, but I didn't
think things through.

*2* Incidentally, wouldn't "!strcmp(a,b)" solution suddenly start allowing
"git mv Makefile Makefile" that we currently disallow? Is it a regression
(less safety against an unexpected input) or a good change?

*3* If we can find a solution that does not involve any calls to the
filesystem, it would be ideal, as we can reuse it later in codepaths where
neither file "a" or file "b" exists on the filesystem yet (think: "we are
about to create 'a' and 'b'---is that sensible, or will they overwrite
with each other on this filesystem?").
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]