Re: [PATCH] [RFC] Design for pathname encoding gitattribute [RESEND]

Mark Junker <mjscod@xxxxxx> writes:

> Just to sum up what you wrote and to be sure that I understand you
> correctly:
>
> Let's have two encodings:
> - Encoding for path names stored in the repository
> - Encoding for path names from/to file systems
>
> Do conversion only if they are different. Both encodings are configurable.

Not really.

 1. Encoding for the project does not have to be specified at
    all.  The project participants are expected to know about it
    out of band.

 2. Conversion for path names between filesystems and the
    project (i.e. "paths in tree objects") can be specified per
    repository (i.e. "a particular clone of the project").  We
    could even allow the conversion function to be different
    per-path-component, but I suspect that would be a far-future
    addition that nobody would use in practice.

 3. Suggest use of UTF-8-NFC as the project encoding as a BCP,
    but never enforce it.  It is the responsibility of the owner
    of the particular repository to make sure that the
    conversions used in a particular repository (again, "a
    particular clone of the project") produce the desired
    encoding in the tree objects.
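
To make points 2 and 3 a bit more concrete, here is a minimal
sketch of what one repository's conversion pair could look like,
assuming a filesystem that hands back decomposed (NFD) names and
a project that has agreed out of band on UTF-8-NFC.  The names
fs_to_tree and tree_to_fs are made up for illustration; nothing
like them exists in git.

```python
import unicodedata

# Hypothetical per-repository conversion pair (not a git API).
# Assumption: the filesystem returns NFD names and the project
# convention for paths in tree objects is NFC.

def fs_to_tree(name: str) -> str:
    """Convert a name read from the filesystem to its in-tree form."""
    return unicodedata.normalize("NFC", name)

def tree_to_fs(name: str) -> str:
    """Convert an in-tree name to the form written to the filesystem."""
    return unicodedata.normalize("NFD", name)

decomposed = "cafe\u0301"            # 'e' followed by a combining acute accent
composed = fs_to_tree(decomposed)    # single precomposed character U+00E9
```

Another clone of the same project on a different filesystem
would configure a different pair (or none at all) and still
produce the same paths in its tree objects.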

But please take these with a moderately large grain of salt, as
I was more or less handwaving and pretending to know what I was
talking about ;-).  I think this should work in theory, but at
the same time I suspect that there are many more places than just
readdir(3) that need to be wrapped if we take this approach, and
the intrusiveness factor might make this infeasible in practice.
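
As an illustration of what "wrapping readdir(3)" would mean, the
sketch below wraps a directory listing so that callers only ever
see converted names.  It uses Python's os.listdir as a stand-in
for readdir(3); listdir_converted and its fs_to_tree parameter
are hypothetical names.  The same treatment would be needed for
every other call that passes a path to or from the filesystem,
which is where the intrusiveness comes from.

```python
import os
from typing import Callable, List

def listdir_converted(path: str, fs_to_tree: Callable[[str], str]) -> List[str]:
    """List a directory, converting each name to its in-tree form.

    fs_to_tree is the repository's (hypothetical) filesystem-to-tree
    conversion function; the result is sorted for stable output.
    """
    return sorted(fs_to_tree(name) for name in os.listdir(path))
```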

The difference between your version and my 1. and 2. is very
subtle, but comes primarily from my desire not to have to use
the word "canonical".  Yours defines "this canonical encoding is
used in the repository, and we convert back and forth to the
local encoding", as opposed to my saying "here are the to and
from conversion functions".  The latter is more in line with how we
define smudge/clean filters for blob contents conversion, in
that the "encoding" used in in-repository blob does not have to
even have a name.
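
A rough sketch of the "to and from conversion functions" view,
by analogy with a clean/smudge pair: the repository configures
two opaque functions and nothing ever names the in-tree
encoding.  PathFilter, to_tree, to_fs and the Latin-1 example
are assumptions for illustration only, not existing git
configuration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class PathFilter:
    """A hypothetical pair of path conversions, analogous to clean/smudge."""
    to_tree: Callable[[bytes], bytes]   # filesystem name -> name in tree object
    to_fs: Callable[[bytes], bytes]     # name in tree object -> filesystem name

    def round_trips(self, fs_name: bytes) -> bool:
        """Check that converting to the tree and back preserves the name."""
        return self.to_fs(self.to_tree(fs_name)) == fs_name

# Example pair for a clone whose filesystem stores Latin-1 names,
# in a project whose participants happen to use UTF-8 in trees.
latin1_fs = PathFilter(
    to_tree=lambda b: b.decode("latin-1").encode("utf-8"),
    to_fs=lambda b: b.decode("utf-8").encode("latin-1"),
)
```

The round_trips check is the only property the pair has to
satisfy from the repository's point of view; what the result
"is called" stays a matter of project convention.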


