Torsten Bögershausen <tboegi@xxxxxx> writes:

> +core.precomposedunicode::
> + ...
> + When false, file names are handled fully transparent by git, which means
> + that file names are stored as decomposed unicode in the repository.

I do not think it means any such thing.  We just take whatever the
platform throws at us and shove that in the repository.  On MacOS X with
HFS+, it may be decomposed UTF-8, but we do not even try to ensure that
everything (like the path added by somebody else on a BSD system in a
commit that you fetched) is in a particular encoding.

> diff --git a/Makefile b/Makefile
> index f62ca2a..55ceb10 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -607,6 +607,7 @@ LIB_H += compat/bswap.h
>  LIB_H += compat/cygwin.h
>  LIB_H += compat/mingw.h
>  LIB_H += compat/obstack.h
> +LIB_H += compat/precomposed_utf8.h

Micronit.  Shouldn't these all be called "precompose_utf8" throughout the
patch?  We are asking Git "please normalize by precomposing any UTF-8
pathnames" when we give the -DPRECOMPOSE_UNICODE C-preprocessor macro, and
compat/precompose_utf8.[ch] are to implement the machinery to do so.

> diff --git a/compat/precomposed_utf8.c b/compat/precomposed_utf8.c
> new file mode 100644
> index 0000000..14bb0ce
> --- /dev/null
> +++ b/compat/precomposed_utf8.c
> @@ -0,0 +1,189 @@
> +/* Converts filenames from decomposed unicode into precomposed unicode.
> +   Used on MacOS X.
> +*/

Micronit.

    /*
     * Multi-line comments begin with slash asterisk newline,
     * and end with a run of SP to align asterisk, asterisk
     * and then newline, like this.
     */

> +#define __PRECOMPOSED_UNICODE_C__
> +
> +#include "cache.h"
> +#include "utf8.h"
> +#include "precomposed_utf8.h"
> +#include "stdio.h"

You shouldn't need "stdio.h" as you are including "git-compat-util.h" via
"cache.h".

> diff --git a/compat/precomposed_utf8.h b/compat/precomposed_utf8.h
> new file mode 100644
> index 0000000..708a1c6
> --- /dev/null
> +++ b/compat/precomposed_utf8.h
> ...
> +#ifndef __PRECOMPOSED_UNICODE_C__
> +#define dirent dirent_prec_psx
> +#define opendir(n) precomposed_utf8_opendir(n)
> +#define readdir(d) precomposed_utf8_readdir(d)
> +#define closedir(d) precomposed_utf8_closedir(d)
> +#define DIR PREC_DIR
> +#endif /* __PRECOMPOSED_UNICODE_C__ */

Hrm, this is not wrong per-se, but looks somewhat unwieldy.

> +#define __PRECOMPOSED_UNICODE_H__
> +#endif /* __PRECOMPOSED_UNICODE_H__ */
> diff --git a/utf8.c b/utf8.c
> index 8acbc66..a544f15 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -433,19 +433,12 @@ int is_encoding_utf8(const char *name)
> ...
> @@ -478,6 +470,20 @@ char *reencode_string(const char *in, const char *out_encoding, const char *in_e
>  			break;
>  		}
>  	}
> +	return out;
> +}
> +
> +char *reencode_string(const char *in, const char *out_encoding, const char *in_encoding)
> +{
> +	iconv_t conv;
> +	char *out;
> +
> +	if (!in_encoding)
> +		return NULL;
> +	conv = iconv_open(out_encoding, in_encoding);
> +	if (conv == (iconv_t) -1)
> +		return NULL;
> +	out = reencode_string_iconv(in, strlen(in), conv);
>  	iconv_close(conv);
>  	return out;
>  }

Much nicer ;-).
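For readers who want to see the shape of the split in the utf8.c hunk
(a helper that works on an already-opened iconv_t, and a thin wrapper that
opens and closes the descriptor), here is a minimal standalone sketch.  The
names reencode_with_conv() and reencode_sketch() are made up for this
illustration, and the body of the helper is my assumption of a typical
iconv(3) grow-on-E2BIG loop; the quoted hunk elides the real one, so do not
read this as the patch's code.  It also assumes a platform where iconv()
takes a "char **" input pointer.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert insz bytes of "in" using a descriptor
 * the caller has already opened.  Returns a malloc'd NUL-terminated
 * string, or NULL on failure.  The caller keeps ownership of conv.
 */
char *reencode_with_conv(const char *in, size_t insz, iconv_t conv)
{
	size_t outalloc = insz + 1;	/* first guess, plus room for NUL */
	size_t outsz = insz;
	char *out = malloc(outalloc);
	char *outpos = out;
	char *cp = (char *)in;		/* iconv() wants a non-const pointer */

	if (!out)
		return NULL;

	for (;;) {
		size_t cnt = iconv(conv, &cp, &insz, &outpos, &outsz);

		if (cnt == (size_t)-1) {
			size_t sofar;
			char *bigger;

			if (errno != E2BIG) {
				free(out);
				return NULL;
			}
			/* Output buffer is full: grow it and carry on. */
			sofar = outpos - out;
			outalloc = sofar + insz * 2 + 32;
			bigger = realloc(out, outalloc);
			if (!bigger) {
				free(out);
				return NULL;
			}
			out = bigger;
			outpos = out + sofar;
			outsz = outalloc - sofar - 1;	/* keep one byte for NUL */
		} else {
			*outpos = '\0';
			break;
		}
	}
	return out;
}

/* Hypothetical wrapper: open a descriptor, convert, close it again. */
char *reencode_sketch(const char *in, const char *out_encoding,
		      const char *in_encoding)
{
	iconv_t conv;
	char *out;

	if (!in_encoding)
		return NULL;
	conv = iconv_open(out_encoding, in_encoding);
	if (conv == (iconv_t)-1)
		return NULL;
	out = reencode_with_conv(in, strlen(in), conv);
	iconv_close(conv);
	return out;
}
```

The point of the split, as far as I can tell, is that a caller converting
many names in a row (a readdir() wrapper, say) can open one descriptor and
call the helper for each name, instead of paying for iconv_open() and
iconv_close() every time.  The caller must free() the returned string.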