Re: Re-casing directories on case-insensitive systems

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 11 Jan 2008 16:37:52 -0800

Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> writes:

> Could we just have a lookup table index extension for identifying the 
> duplicates (when checking is enabled using core configuration option #3324)? 
> That table would keep a mapping from a normalized form (maybe include 
> canonical encoding while we're at it) to the actual octet sequence(s) used.

I would agree that the index extension, if we ever are going to
do this, would be the right place to store this information, at
the single repository level.

However, this opens up a can of worms.  What should the canonical
key be?  If you want to protect yourself from a Unicode
normalizing filesystem, you would use one canonicalization,
while if you want to protect from a case losing filesystem you
would use another?  Or do we downcase and NFD normalize at the
same time and be done with it?
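
To make the choice concrete, here is a rough sketch of what such a
key function might look like.  Nothing like this exists in git; the
name canonical_key and its two flags are made up for illustration,
and Python's str.casefold() stands in for "downcase" (it is a
slightly more aggressive fold than a plain lowercase).

```python
import unicodedata


def canonical_key(name: str, fold_case: bool = True, normalize: bool = True) -> str:
    """Return a hypothetical canonical key for one path component.

    fold_case guards against case-losing filesystems (e.g. vfat);
    normalize guards against Unicode-normalizing filesystems.
    """
    key = name
    if normalize:
        # Decompose so precomposed and decomposed spellings compare equal.
        key = unicodedata.normalize("NFD", key)
    if fold_case:
        key = key.casefold()
        if normalize:
            # Case folding can produce text that is no longer in NFD,
            # so normalize once more after folding.
            key = unicodedata.normalize("NFD", key)
    return key
```

With both flags on, xt_mark.c and xt_MARK.c get the same key, and so
do the precomposed and decomposed spellings of an accented name.
With only one flag on, only the corresponding kind of duplicate is
caught, which is exactly the "which canonicalization?" question.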

And where should the configuration be stored?  If a project
wants to be interoperable across Linux and vfat, for example,
that canonicalization needs to be enabled in repositories of all
participants, be they on Linux or vfat, so that people on Linux
can be prevented from creating and registering two files
xt_mark.c and xt_MARK.c in the same directory, so that people
who extract the source on vfat won't have trouble.
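
The check itself could be sketched roughly as below: a table from
a case-folded key to the actual spellings seen, flagging any key
with more than one spelling.  This is only an illustration under
simple assumptions (plain lowercase as the key, "/" as the
separator); find_collisions is a made-up name, not a git function,
and the "dir/" paths are placeholders.

```python
from collections import defaultdict


def find_collisions(paths):
    """Map each lowercased path prefix to the distinct spellings in use.

    Every leading directory is checked as well as the full path, so
    "Foo/a" and "foo/b" are reported as a collision on "foo".
    """
    table = defaultdict(set)
    for path in paths:
        parts = path.split("/")
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            table[prefix.lower()].add(prefix)
    # Keep only keys that were spelled in more than one way.
    return {key: sorted(names) for key, names in table.items() if len(names) > 1}


collisions = find_collisions(["dir/xt_mark.c", "dir/xt_MARK.c", "Makefile"])
```

A repository on Linux with this check enabled would refuse to add
the second of the two files, even though its own filesystem has no
problem storing both.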

Which means the information needs to be in-tree.  But that
should not be in .gitattributes (which by definition is for
per-path things).