On pathnames

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 24 Jan 2008 13:02:54 -0800

One of Linus's recent patch introduces an index hashtable so
that we can later hash "equivalent" names into the same bucket
to allow us non-byte-by-byte comparison.

Before going further, I needed to formalize what we are trying
to achieve.  I learned a few things from the long flamewar
thread, but it is very inefficient to go back to the thread to
pick only the useful pieces.  The whole flamewar simply did not
fit a small Panda brain.

That was the reason for this write-up.

Design constraints.  In the following, I'll use two names $A and
$B as an example.  They are a pair of names that are considered
equivalent in some contexts, such as:

        A=xt_connmark.c  B=xt_CONNMARK.c

 (1) Some filesystems prevent you from having these two
     (confusing) paths in a directory at the same time.  Some do
     not implement this confusion prevention, and allows both
     names to exist at the same time.

     Let's call the former "case insensitive", and the latter
     "case sensitive".

 (2) readdir(3) on some "case insensitive" filesystems returns
     $A, after a successful creat(2) of $B.  Others remember
     which one of the two "equivalent" names were used in
     creat(2).

     Let's call the former "case folding", and the latter "case
     preserving".

     We assume open(2) or lstat(2) of $A or $B will succeed
     after allowing creat(2) of $B if a case folding filesystem
     returns $A from readdir(3).

 (3) Among the "case folding" ones, some filesystems fold the
     pathname to a form that is less interoperable with other
     systems, and/or the form that is likely to be different
     from what the end-user usually enters.

     Such filesystems are "inconveniently case folding".

The last one is not quite apparent with the "xt_connmark.c"
example, but if you replace $A and $B in the above description
with:

        A=Ma"rchen       B=Märchen

it would hopefully become more clear.

For example, vfat is generally "case preserving".  In that long
flamewar thread, I think we learned that HFS+ is in general
"inconveniently case folding" with respect to Unicode, by always
folding to $A but the keyboard/IM input is more likely to come
as $B, which happens to be the more interoperable form with
other systems.

Issues with case insensitive filesystems
----------------------------------------

At the data structure level, a pathname to git is a sequence of
bytes terminated with NUL.  This will _not_ change.

By the way, at the data structure level, a tree entry in git can
represent a blob that is a symbolic link.  A tree entry in git
can also represent a blob that is a regular file, and in that
case, it can represent if it is executable or not.  These will
also not change.

Now, let's think about how we allow use of git on a filesystem
that is incapable of symbolic links, and/or a filesystem that
does not have trustable executable bit.

We do not say "Symlinks are evil and not supported everywhere,
so let's introduce a project configuration to disallow addition
of symlinks".  We do not say that to the executable bit, either.

Instead, we have fallback methods to allow manipulating symlinks
and executable bit on such a filesystem that is incapable of
handling them natively.

We should be able to do the same for this "case sensitivity"
issue.  A tree that has xt_connmark.c and xt_CONNMARK.c at the
same time cannot be checked out on a case insensitive filesystem.

The filesystem is simply incapable of it (please just calmly
rephrase it in your head as "does not allow such confusing
craziness" instead of starting another flamewar, if you feel the
expression "incapable of" insults your favorite filesystem).

That may mean the project should avoid such equivalent names in
its trees (and having a project wide configuration could be a
technical means to help enforcing that policy), but it does not
mean the core level of git should prevent them to be created on
such systems.  It just means that there should be a way, that
could (and sometimes has to) be different from the "natural"
way, to manipulate such tree entries even on a case insensitive
filesystem.

For example, if I find that RelNotes symlink incorrectly points
at Documentation/RelNotes-1.5.44.txt and want to fix it and push
it out immediately, but if I am on the road and the only
environment I can borrow is a git installation on a filesystem
that is symlink-challenged, I can still do the fix. On such a
filesystem, a symlink is checked out as a regular file but is
still marked as a symlink in the index.  The only thing I need
to do is to edit the file (making sure not to add an extra LF at
the end) and add it to the index.  That's certainly different
from the "natural" way to do that on a filesystem with symlinks,
which is "ln -fs Documentation/RelNotse-1.5.4.txt RelNotes", but
the point is that we make it possible.

The same thing should apply to two files that cannot be checked
out at the same time on case insensitive filesystems.  Perhaps
we could have something like:

	$ git show :xt_CONNMARK.c >xt_connmark-1.c
        $ edit xt_connmark-1.c
	$ git add --as xt_CONNMARK.c xt_connmark-1.c

Issues with case folding filesystems
------------------------------------

In addition to the above, case folding filesystems additionally
have an issue even when there is no "confusing" names in the
tree.  The project may want to have "Märchen" (but not
"Ma"rchen"), but a checkout (which is creat(2) of "Märchen" --
because that is the byte sequence recorded in tree objects and
the index) will result in "Ma"rchen" and no "Märchen" (hence
readdir(3) returns "Ma"rchen").

Linus's patch to use a hashtable that links "equivalent" names
together is a step in the right direction to address this.  The
tree (and the index) has name $B, we check out and the
filesystem folds it to $A.  When we get the name $A back from
the filesystem (via readdir(3)), we hash the name using a hash
function that would drop names $A and $B into the same bucket,
and compare that name $A with each hash entry using a comparison
that considers $A and $B are equivalent.  If we find one, then
we keep the name $B we have already.

If it is a new file, we won't find any name that is equivalent
to $A in the index, and we use the name $A obtained from
readdir(3).

BUT with a twist.

If the filesystem is known to be inconveniently case folding, we
are better off registering $B instead of $A (assuming we can
convert from $A to $B).

One bad issue during development is that we cannot sanely
emulate case folding behaviour on non case-folding filesystems
without wrapping open(2), lstat(2), and friends, because of the
assumption we made above in (2) where we defined the term "case
folding".  This means that the codepath to deal with case
folding filesystems inevitably are harder to debug.

Tasks
-----

 - Identify which case folding filesystems need to be supported,
   and make sure somebody understands its folding logic;

 - For each supported case folding logic, these are needed:

   - a hash function that throws "equivalent" names in the same
     bucket, to be used in Linus's patch;

   - a compare function to determine equivalent names;

   - a convert function that takes a possibly inconvenient form
     of equivalent name (i.e. $A above) as input and returns
     more convenient form (i.e. $B above)

 - Identify places that we use the names obtained from places
   other than the index and tree.  From these places, we would
   need to call the convert function to (de)mangle the name
   before they hit the index.

   Because we may be getting driven by something like:

	$ find | xargs git-foo

   handling readdir(3) we do ourselves any specially does not
   make much sense.  Any path from the user is suspect.

 - Identify places that we look for a name in the index, and
   perform equivalent comparison instead of memcmp(3) we
   traditionally did.  Linus's patch gives scaffolding for this.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html