Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> But as you said, this still doesn't make the Apple normal form
> any easier.  Though if we know we are on such a strange filesystem
> we might be able to assume the paths in the repository are equally
> damaged.  Or not.

Well, if the git-core folks could standardize on, e.g., composed UTF-8
(rather then just UTF-8), for storing file names in the repository, then
everything should be clear, isn't it?

--
Best regards,
Thomas Singer
=============
syntevo GmbH
http://www.syntevo.com
http://blog.syntevo.com


Shawn O. Pearce wrote:
> Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote:
>> onsdag 25 november 2009 14:47:25 skrev  Marc Strapetz:
>>> I have noticed that jgit converts file paths to UTF-8 when querying the
>>> repository.
> ...
>>> Is this a bug or a misconfiguration of my repository? I'm using jgit
>>> (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
>> A bug. 
>>
>> The problem here is that we need to allow multiple encodings since there
>> is no reliable encoding specified anywhere.
> 
> This is a design fault of both Linux and git.  git gets a byte
> sequence from readdir and stores that as-is into the repository.
> We have no way of knowing what that encoding is.  So now everyone
> touching a Git repository is screwed.
> 
>> The approach I advocate is
>> the one we use for handling encoding in general. I.e. if it looks like UTF-8,
>> treat it like that else fallback. This is expensive however
> 
> We should try to work harder with the git-core folks to get character
> set encoding for file names worked out.  We might be able to use a
> configuration setting in the repository to tell us what the proper
> encoding should be, and if not set, assume UTF-8.
> 
>> and then we have
>> all the other issues with case insensitive name and the funny property that
>> unicode has when it allows characters to be encoding using multiple sequences
>> of code points as empoloyed by Apple.
> 
> But as you said, this still doesn't make the Apple normal form
> any easier.  Though if we know we are on such a strange filesystem
> we might be able to assume the paths in the repository are equally
> damaged.  Or not.
> 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]