Re: [PATCH] [RFC] Design for pathname encoding gitattribute [RESEND]

Junio C Hamano wrote:
> To support the above scenarios, I think each instance of
> repository needs to be able to say "this path (specified with a
> matching pattern in the filename encoding) should be converted
> this way coming in, and that way going out."  UTF-8 only project
> would have NKC<->NKD on HFS+ partition, and nothing on
> everywhere else.

I think there is another reason to do this: simple sanity.  Two people
adding the same filename should not end up with different tree IDs just
because they, for whatever reason, entered different but equivalent
variants of the same Unicode NFC form.
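
To make that concrete, here is a rough Python sketch, not anything git
does today.  It assumes the "same filename" means two canonically
equivalent spellings (precomposed versus decomposed), and it uses a
plain SHA-1 over the UTF-8 bytes as a stand-in for how a name feeds into
a tree ID.  The helpers name_id() and normalize_name() are hypothetical.

```python
import hashlib
import unicodedata


def name_id(name: str) -> str:
    """Hash the UTF-8 bytes of a name (stand-in for a tree entry's contribution)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


def normalize_name(name: str) -> str:
    """Hypothetical per-project rule: store every pathname in NFC."""
    return unicodedata.normalize("NFC", name)


composed = "caf\u00e9"     # e-acute as a single code point
decomposed = "cafe\u0301"  # plain "e" followed by a combining acute accent

# Byte-wise the two spellings differ, so their hashes differ as well.
ids_differ_raw = name_id(composed) != name_id(decomposed)

# After normalizing to the project's chosen form, both give the same hash.
ids_match_normalized = (
    name_id(normalize_name(composed)) == name_id(normalize_name(decomposed))
)
```

Which form the project picks matters less than everyone picking the
same one before the name is hashed.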

But that rule of sanity breaks the sanity of plain C (byte-wise)
semantics, so it must be a per-project setting.  Not a necessity, but a
good feature, I think.  It can be enforced with external scripts/hooks,
of course.

What happens on the way in and out of the filesystem, I see that as a
side issue.  Once you define what the normalized form is for the
project, then the features should just fall into place without messy
heuristics.  There is also a correct behaviour when faced with
filesystems that have a different idea about who enforces encoding rules
- so long as you can detect what those ideas are :).  It also means that
users can choose to use the same local encoding as their locale, which
might interoperate better with other apps.

The readdir() (case|normalization) tolerance change is good in its own
right, but it's a slightly different scenario, and a question
independent of what the normalized form is.  Of course, on case-folding,
Unicode-normalizing filesystems you'd have to have a mixture of these
settings for sane operation.
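
As a rough illustration of what such tolerance could look like (a
Python sketch under my own assumptions, not the actual readdir() patch):
compare names through a key that folds case and canonical normalization,
and return whatever spelling the filesystem reported.  fold_key() and
find_entry() are hypothetical names.

```python
import unicodedata
from typing import Iterable, Optional


def fold_key(name: str) -> str:
    """Comparison key that ignores case and canonical-normalization differences."""
    decomposed = unicodedata.normalize("NFD", name)
    # Case folding can disturb normalization, so normalize again afterwards.
    return unicodedata.normalize("NFD", decomposed.casefold())


def find_entry(wanted: str, listing: Iterable[str]) -> Optional[str]:
    """Return the entry in a directory listing that matches `wanted`, if any."""
    key = fold_key(wanted)
    for entry in listing:
        if fold_key(entry) == key:
            return entry  # the spelling the filesystem actually reported
    return None
```

Note that this only decides whether two names refer to the same file; it
says nothing about which form gets recorded in the tree.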

On the chicken and egg thing, I guess .gitattributes is too late, you're
right - unless you say that at each directory level, the globbing is
always C.  But I haven't thought about that very hard.  I was just
re-using a mechanism that already exists rather than try to invent
something new.  I do agree with Dscho's point that mixing encodings in a
repository is not necessarily a use case worth catering for.

Sam.