Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters

Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote:
> On Wednesday 25 November 2009 14:47:25, Marc Strapetz wrote:
> > I have noticed that jgit converts file paths to UTF-8 when querying the
> > repository.
...
> > Is this a bug or a misconfiguration of my repository? I'm using jgit
> > (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
> 
> A bug. 
> 
> The problem here is that we need to allow multiple encodings since there
> is no reliable encoding specified anywhere.

This is a design fault of both Linux and git.  git gets a byte
sequence from readdir and stores that as-is into the repository.
We have no way of knowing what that encoding is.  So now everyone
touching a Git repository is screwed.
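
To make the ambiguity concrete, here is a minimal Java sketch. The class and
method names are made up for illustration and are not part of jgit. It shows
one file name turning into two different byte sequences depending on the
charset the creating system used. Git stores whichever sequence it was handed,
with no record of which charset produced it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not jgit code.
final class PathBytesDemo {
    /** Bytes a system using ISO-8859-1 file names would hand to git. */
    static byte[] latin1(String name) {
        return name.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Bytes a system using UTF-8 file names would hand to git. */
    static byte[] utf8(String name) {
        return name.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u00fc.txt"; // u-umlaut followed by ".txt"
        System.out.println(Arrays.toString(latin1(name))); // [-4, 46, 116, 120, 116]
        System.out.println(Arrays.toString(utf8(name)));   // [-61, -68, 46, 116, 120, 116]
    }
}
```

A reader of the repository sees only the bytes, so it cannot tell from the
data alone which of the two was intended.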

> The approach I advocate is
> the one we use for handling encoding in general, i.e. if it looks like UTF-8,
> treat it as such, else fall back. This is expensive, however,

We should try to work harder with the git-core folks to get character
set encoding for file names worked out.  We might be able to use a
configuration setting in the repository to tell us what the proper
encoding should be, and if not set, assume UTF-8.
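
A rough sketch of what "try UTF-8, else fall back" could look like in Java.
This is not jgit's actual code: the class name, the method name, and the
`fallback` parameter are assumptions. `fallback` stands in for whatever
charset a repository setting (if one ever exists) or the platform default
would supply.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not jgit code.
final class PathDecoder {
    /**
     * Decodes raw path bytes. Strict UTF-8 is tried first; if the bytes are
     * not well-formed UTF-8, they are decoded with the fallback charset.
     */
    static String decodePath(byte[] raw, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the configured or default charset.
            return new String(raw, fallback);
        }
    }
}
```

This is only a heuristic. Some byte sequences written in another charset also
happen to be well-formed UTF-8, and those will be decoded as UTF-8 regardless.
The strict decode has to scan every path, which is where the cost comes from.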

> and then we have
> all the other issues with case-insensitive names and the funny property that
> Unicode has when it allows characters to be encoded using multiple sequences
> of code points, as employed by Apple.

But as you said, this still doesn't make the Apple normal form
any easier.  Though if we know we are on such a strange filesystem
we might be able to assume the paths in the repository are equally
damaged.  Or not.
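
For illustration, the "multiple sequences of code points" problem can be
reproduced with the standard `java.text.Normalizer`. The sketch below uses
hypothetical names and is not a proposal for how jgit should compare paths. It
treats two names as the same path when their NFC forms match, so a precomposed
name and its decomposed counterpart (the style Apple filesystems hand back)
compare equal.

```java
import java.text.Normalizer;

// Hypothetical helper, not jgit code.
final class PathNormalization {
    /** Returns true if both names are equal after normalizing to NFC. */
    static boolean samePath(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "\u00fc.txt";    // single code point U+00FC
        String decomposed = "u\u0308.txt"; // 'u' plus combining diaeresis U+0308
        System.out.println(composed.equals(decomposed));    // false
        System.out.println(samePath(composed, decomposed)); // true
    }
}
```

Normalizing on comparison does not settle which form should be written into
the repository, which is the part that stays hard.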

-- 
Shawn.
