Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote:
> On Wednesday 25 November 2009 14:47:25, Marc Strapetz wrote:
> > I have noticed that jgit converts file paths to UTF-8 when querying the
> > repository.
> > ...
> > Is this a bug or a misconfiguration of my repository? I'm using jgit
> > (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
>
> A bug.
>
> The problem here is that we need to allow multiple encodings since there
> is no reliable encoding specified anywhere.

This is a design fault of both Linux and git. git gets a byte sequence
from readdir and stores it as-is in the repository. We have no way of
knowing what the encoding of that byte sequence is. So now everyone
touching a Git repository is screwed.

> The approach I advocate is
> the one we use for handling encoding in general. I.e. if it looks like UTF-8,
> treat it like that, else fall back. This is expensive, however

We should try to work harder with the git-core folks to get character
set encoding for file names worked out. We might be able to use a
configuration setting in the repository to tell us what the proper
encoding should be, and if it is not set, assume UTF-8.

> and then we have
> all the other issues with case-insensitive names and the funny property that
> Unicode has when it allows characters to be encoded using multiple sequences
> of code points, as employed by Apple.

But as you said, this still doesn't make the Apple normal form any
easier. Though if we know we are on such a strange filesystem, we might
be able to assume the paths in the repository are equally damaged.
Or not.

-- 
Shawn.
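
For illustration only, here is a minimal sketch of the "if it looks like UTF-8, treat it like that, else fall back" idea quoted above, together with a comparison that ignores the composed/decomposed difference mentioned for Apple filesystems. This is not jgit's actual code: the class name PathNameDecoder, the methods decode and sameName, and the idea of passing the fallback charset in as a parameter (for example from a repository configuration setting) are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper, not part of jgit.
final class PathNameDecoder {

    /**
     * Decode raw path bytes as strict UTF-8. If the bytes are not
     * well-formed UTF-8, decode them with the given fallback charset.
     */
    static String decode(byte[] raw, Charset fallback) {
        CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return utf8.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException notUtf8) {
            // Assumption: the caller knows a sensible fallback,
            // e.g. from configuration or the platform default.
            return new String(raw, fallback);
        }
    }

    /**
     * Compare two already-decoded names after normalizing both to NFC,
     * so a precomposed character and its decomposed form compare equal.
     */
    static boolean sameName(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }
}
```

The strict decoder has to scan every name before accepting it, which is where the cost mentioned above comes from. A name that happens to be valid UTF-8 but was written in another encoding is still misread, so this is a heuristic, not a fix for the missing encoding information.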