Re: Git branch capitalisation bug?

On Thu, Apr 20, 2023 at 12:08 AM Ilya Kamenshchikov
<ikamenshchikov@xxxxxxxxx> wrote:
> I've recollected the history of how this issue occurred more and did
> few more tests. I'm now working with branch named "feature/git_repro":
> 1) The error first occurred when I worked with a colleague on a same
> branch, and he really used capital "Feature/branch".

The bug -- or "feature", depending on how you believe this all
*should* work -- is simple in concept and difficult in
reproduction because there are multiple moving parts.

Here's the concept:

 * In Git, branch names are always case sensitive.

 * But Git *sometimes* relies on the *OS* / file-system to
   implement this.

 * Some OSes / file-systems are case *in*sensitive.

When Git uses a case-INsensitive file-system to store a case-
sensitive branch name component, the OS / file-system loses the
case distinction. Exactly how that happens is up to the OS /
file-system, but we can see how common macOS and Windows systems
do it.

On these systems, when creating a file or directory, the *first*
creation attempt "wins". That is, if any command or process
goes to create a file or directory named "Feature", and no such
file or directory exists *now*, the file or directory is created
with precisely this casing. But if the file or directory
*already exists* in any casing (upper and/or lower), the system
uses the existing one: if "feature" exists, that's the name, or
if "featURE" exists, *that* is the name that is used.
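
Here's a toy model of that "first creation wins" behavior, as a
sketch only. The class name CaseInsensitiveDir is made up for
illustration, and str.casefold() merely stands in for whatever
folding rules a real file system applies:

```python
class CaseInsensitiveDir:
    """Toy model of a case-insensitive, case-preserving directory."""

    def __init__(self):
        # Maps the folded name to the spelling used at first creation.
        self._entries = {}

    def create(self, name):
        # setdefault keeps the existing spelling if an entry that
        # differs only in case is already present.
        return self._entries.setdefault(name.casefold(), name)

    def listing(self):
        return sorted(self._entries.values())


d = CaseInsensitiveDir()
first = d.create("Feature")   # nothing exists yet, so this casing wins
second = d.create("feature")  # matches the existing entry instead
print(first, second, d.listing())  # prints: Feature Feature ['Feature']
```

The second create() does not fail and does not make a new entry;
it quietly hands back the name that was already there.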

So, if and when Git stores a branch name or component as a file-
system file or directory name *and* the system itself imposes this
case-folding match-some-existing-name scheme, Git's case
distinction -- the fact that Git considers "feature" and "Feature"
entirely different names -- is lost. Git is sure these are
different and will stay different, but they aren't and don't.

When Git reads these names back later, it finds the system's
names, rather than the ones Git attempted to store. Git believes
the system's names, rather than its own.

Sometimes, however, Git stores branch names in memory or in
file data, where this kind of case-folding never occurs. During
such periods, feature/git_repro and Feature/git_repro remain
different, distinct branch names.
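
To make the contrast concrete, here is a rough simulation using the
two branch names from this thread. It is not how Git is actually
implemented: store_loose() is a hypothetical helper that models
one-directory-per-component storage on a case-folding file system,
and a plain Python set stands in for names kept in memory or in
file data:

```python
def store_loose(tree, refname):
    """Store refname one path component at a time in a toy
    case-insensitive tree; return the name the tree really holds."""
    stored = []
    node = tree
    for part in refname.split("/"):
        key = part.casefold()
        if key not in node:
            # First creation: this spelling is the one that sticks.
            node[key] = (part, {})
        actual, child = node[key]
        stored.append(actual)
        node = child
    return "/".join(stored)


loose = {}      # stands in for directories and files
packed = set()  # stands in for names held as data, compared exactly

for name in ["Feature/branch", "feature/git_repro"]:
    packed.add(name)
    print(name, "->", store_loose(loose, name))
# prints:
#   Feature/branch -> Feature/branch
#   feature/git_repro -> Feature/git_repro
```

The data-side set still holds "feature/git_repro" exactly as
written, while the directory side reads it back under the
colleague's "Feature" directory.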

To reproduce the problem, then, you must:

 1. mix a case-sensitive system (e.g., a typical Linux setup
    as found on GitHub) with a case-insensitive one (e.g., a
    typical Windows or macOS system);

 2. use the case-insensitive one yourself (on the case-
    sensitive system you will see branch names as they actually
    appear, since they are never converted by the OS / file-system);

 3. set up the problem; and

 4. make sure Git stores the branch names in directories and
    files, rather than in the .git/packed-refs file.
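
If you aren't sure which kind of system you are on for step 2, a
small probe can tell you. This is a sketch, and is_case_insensitive()
is a name I made up; it creates a scratch file and then looks for it
under a different casing. Pass the directory that holds your
repository, since the default temporary directory may live on a
different file system:

```python
import os
import tempfile


def is_case_insensitive(directory=None):
    """Return True if files created under `directory` can also be
    found under a spelling that differs only in case."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        probe = os.path.join(tmp, "CaseProbe")
        with open(probe, "w"):
            pass  # an empty file is enough for the probe
        # On a case-folding file system the lowercase spelling
        # resolves to the file we just created.
        return os.path.exists(os.path.join(tmp, "caseprobe"))


print(is_case_insensitive())
```

Git records the result of a similar check in the core.ignorecase
setting when a repository is created, so "git config core.ignorecase"
is another way to look.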

The cure for this would be for Git to stop using the file system's
names directly, the way it does now. There are some long-term
projects to make this happen, but little progress has been made
on them.

Until then, the way to avoid the problem is simple:

 A) insist that everyone use the same kind of OS, and/or
 B) be careful not to depend on case differences.

Method (A) tends to be impractical but method (B) is easy: just
make sure all users use all-lowercase all the time, at least for
branch names. It's not very nice, but it's practical.
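
For method (B), a quick check over a list of branch names can flag
trouble before it bites. The following is only a sketch:
case_collisions() is a hypothetical helper, and it assumes that
every leading part of a name ("feature", then "feature/git_repro")
may end up as a directory or file name:

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of branch-name prefixes that differ only in case."""
    groups = defaultdict(set)
    for name in names:
        parts = name.split("/")
        # Check every leading prefix, since each component may be
        # stored as its own directory.
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            groups[prefix.casefold()].add(prefix)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


branches = ["Feature/branch", "feature/git_repro", "main"]
print(case_collisions(branches))  # prints: [['Feature', 'feature']]
```

You could feed it the output of something like
"git for-each-ref --format='%(refname:short)' refs/heads", run on a
case-sensitive system where the names are shown as they really are.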

Chris



