Re: non-ascii filenames issue

Jay Soffian <jaysoffian@xxxxxxxxx> writes:

> On Sun, Apr 5, 2009 at 6:51 AM, John Tapsell <johnflux@xxxxxxxxx> wrote:
>> Unfortunately not, because for some absolutely crazy reason
>
> Bzzt. http://article.gmane.org/gmane.comp.version-control.git/50830

I do not think the message gives enough information on the issue, as "a
pathname is a slash separated sequence of path components terminated with
a NUL, and a path component is an uninterpreted sequence of bytes
excluding NUL and slash" is simply a UNIX tradition the original git
design took as _given_, so the "some absolutely crazy reason" comment does
not even deserve refuting.
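
To make the tradition concrete, here is a minimal sketch in Python (not
anything git itself runs).  The helper name split_components is made up
for illustration, and it assumes a tree-relative path, so empty components
from doubled slashes are simply dropped.  The point is that nothing in it
ever decodes the bytes:

```python
def split_components(path: bytes) -> list[bytes]:
    """Split a raw, tree-relative pathname into components without decoding it.

    Hypothetical helper for illustration only.
    """
    if b"\0" in path:
        # NUL terminates a pathname, so it can never appear inside one.
        raise ValueError("NUL cannot appear inside a pathname")
    # Slash is the only other byte with meaning; everything else is opaque.
    return [component for component in path.split(b"/") if component]


# 0xE9 is Latin-1 for e-acute and is not valid UTF-8 here, yet under this
# model it is a perfectly good path component.
raw = b"docs/caf\xe9.txt"
components = split_components(raw)
```

Under this model two paths are the same if and only if their bytes are
the same, which is exactly why encoding never comes into the picture.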

There is _no_ reason, crazy or otherwise.  If you start from the "a
pathname is an uninterpreted sequence of bytes" tradition, that is a
design parameter, simply "how things are", and you do not argue with it.
And the message you quoted doesn't, either.

	Side note: I am not saying that we should not ever change that
	particular design parameter.  I am just explaining why 50830 is
	not a good counterargument to quote against the "some absolutely
	crazy reason" accusation.

> And, as always, patches welcomed.

Before patches, you need a sound design and justification.

At least you need to consider the following (the early ones are easier):

 - Do we unify them to some canonical encoding internally and do the
   matching in the canonical space?   What's the internal representation
   (presumably UTF-8)?

 - How should a user tell git the pathname conversion rules between the
   internal representation and the filesystem representation?  A config
   variable per repository?

 - How should this interact with patch+apply dataflow (including "rebase"
   without -i/-m)?  Should pathnames in diffs be in canonical form?

 - How should this interact with case-challenged and/or Unicode-corrupting
   filesystems such as NTFS and HFSplus whose creat(), readdir(), and
   stat() contradict each other?

 - What should happen when the pathname in the canonical representation
   recorded in the history cannot be externalized on a particular
   filesystem?  Does it degrade gracefully and offer some escape hatch,
   and if so how?
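
To illustrate the first and the last points, here is a rough sketch in
Python, not a proposal.  The function name canonicalize, the fs_encoding
parameter, and the choice of NFC as the normal form are all assumptions
made for the example; only "presumably UTF-8" comes from the list above.

```python
import unicodedata


def canonicalize(raw: bytes, fs_encoding: str = "utf-8") -> bytes:
    """Map a filesystem pathname to a hypothetical canonical form.

    Assumption for this sketch: canonical means NFC-normalized UTF-8.
    Raises UnicodeDecodeError when raw is not valid in fs_encoding.
    """
    text = raw.decode(fs_encoding)
    return unicodedata.normalize("NFC", text).encode("utf-8")


# The same visible name, precomposed (NFC) and decomposed (NFD).
nfc_name = "caf\u00e9.txt".encode("utf-8")
nfd_name = "cafe\u0301.txt".encode("utf-8")

differ_as_bytes = nfc_name != nfd_name
match_when_canonical = canonicalize(nfc_name) == canonicalize(nfd_name)

# A Latin-1 name only converts when the user states the filesystem encoding.
latin1_name = b"caf\xe9.txt"
latin1_matches = canonicalize(latin1_name, fs_encoding="latin-1") == nfc_name
```

Calling canonicalize(latin1_name) with the default encoding raises
UnicodeDecodeError.  That is the situation where some escape hatch would
be needed, and the sketch deliberately does not pick one.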