On 2014-04-28 22.03, Jeff King wrote:
> On Mon, Apr 28, 2014 at 09:52:07PM +0200, Torsten Bögershausen wrote:
>
>> To my knowledge repos with decomposed unicode should be rare in
>> practice. I only can speak for european (or latin based) or cyrillic
>> languages myself:
> I've run across several cases in the past few months, but only just
> figured out what was going on. Most were tickets to GitHub support, but
> we actually have such a case in our github/github repository. In most
> cases, I think they were created on older versions of git on OS X,
> either before core.precomposeunicode existed, or before it was turned on
> by default. The decomposed form got baked into the tree (whatever the
> user originally typed, git probably found out about it via "git add .").
>
> I think reports are just coming in now because we didn't start turning
> on core.precomposeunicode by default until v1.8.5, shipped in November.
> And then, a person working on the repository would not notice anything,
> since we only set the flag during clone. So it took time for people to
> upgrade _and_ to make fresh clones.

OK, thanks for the description.
In theory we can make Git "composition ignoring" by changing
index_file_exists() in name-hash.c.
(Both names must be precomposed first and then compared.)

I don't know how many people are using Git older than 1.7.12
(the first version supporting precomposed unicode).
Could we simply ask them to upgrade?

The next problem is that people need to agree whether the repo should
store names in pre- or decomposed form. (My vote is for precomposed.)
Unfortunately, core.precomposeunicode is repo-local,
so everybody needs to "agree globally" and "configure locally".

Side note:
I wish we had this config variable travelling with the repo,
like .gitattributes does for text dealing with CRLF-LF.
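
To illustrate what "composition ignoring" would mean for the comparison
mentioned above, here is a minimal sketch in Python (not Git's C code).
It assumes that Unicode NFC normalization is a reasonable stand-in for
the precompose step; the function names are hypothetical, and nothing
here reflects how index_file_exists() is actually implemented.

```python
import unicodedata


def precompose(name: str) -> str:
    # Assumption: NFC stands in for Git's precompose step.
    return unicodedata.normalize("NFC", name)


def same_name_ignoring_composition(a: str, b: str) -> bool:
    # Precompose both names first, then compare them.
    return precompose(a) == precompose(b)


# "A" followed by a combining diaeresis (decomposed form)
decomposed = "A\u0308.txt"
# The single code point for A with diaeresis (precomposed form)
precomposed = "\u00c4.txt"
```

A plain string comparison treats the two names as different, while the
normalized comparison treats them as the same file name.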

I don't know how many reports you have, but reading all this, it feels
as if the affected users could "normalize" their repos and run
"git config core.precomposeunicode true", followed by
"git config --global core.precomposeunicode true".

Does that sound like a possible way forward?

>> So for me the test case could sense, even if I think that nobody (TM)
>> uses an old Git version under Mac OS X which is not able to handle
>> precomposed unicode.
> Even when they do not, the decomposed values are baked into history from
> those old versions. So it is a matter of history created with older
> versions not interacting well with newer versions.

I'm not sure if I understood all the details here,
but I would be happy to help with suggestions/tests/reviews.

> -Peff