On Tue, Nov 28, 2017 at 04:15:18PM +0000, Hin-Tak Leung wrote:
>
> --------------------------------------------
> On Tue, 28/11/17, Ernesto A. Fernández <ernesto.mnd.fernandez@xxxxxxxxx> wrote:
>
> > The algorithm is very simple, the best way to understand it is just
> > looking at the code. I don't know the first thing about Korean
> > writing, so I don't think I should attempt to explain why the
> > decomposition is done this way. If somebody else is interested in
> > the details, they can follow the citation in the header comment of
> > the decompose_hangul function.
>
> Apologies for coming into this a bit late.
>
> A couple of points:
>
> 1. Hangul canonical composition and decomposition is a separate topic
> from compositions of latin characters with accents. It is described in
>
> http://www.unicode.org/reports/tr15/tr15-18.html#Hangul
>
> among other sources.

I am aware of that; that's the reason this patch exists. If you check
the commit message, you will see it reads: "This happens because the
normalization of Hangul characters is a special case".

> 2. I think the mount option is a bit of a red-herring. I think we
> should just do what Mac OS X does - I think in the tech note it says
> something about storing things always in the decomposed form or
> composed form. Ideally we should make the differing mount options
> no-ops. Mac OS X does not need extra mount options, we shouldn't
> either.

macOS stores the filenames in the NFD form, that is, decomposed. The
problem here was that Linux was forgetting to decompose the Hangul.

The mount option has nothing to do with this patch, other than the fact
that it could later be used to access Hangul filenames that were
(mistakenly) stored without decomposition. That is a very unlikely
situation.

As to why the mount option exists, I couldn't tell you. It was here
before the git tree. It is disabled by default anyway, so it's not
bothering anyone.

Thanks,
Ernest
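
P.S. For anyone who would rather not dig through the patch, here is a
rough sketch of the arithmetic described in the Unicode report linked
above. It is Python rather than the kernel's C, the function names are
made up for illustration, and it is not the code from the patch. The
constants are the ones the Unicode standard gives for Hangul syllable
decomposition.

```python
# Constants from the Unicode standard's Hangul syllable arithmetic.
S_BASE = 0xAC00   # first precomposed syllable
L_BASE = 0x1100   # first leading consonant jamo
V_BASE = 0x1161   # first vowel jamo
T_BASE = 0x11A7   # one before the first trailing consonant jamo
L_COUNT = 19
V_COUNT = 21
T_COUNT = 28      # includes the "no trailing consonant" case
N_COUNT = V_COUNT * T_COUNT   # 588
S_COUNT = L_COUNT * N_COUNT   # 11172


def decompose_hangul_syllable(code_point):
    """Return a tuple of jamo code points for one precomposed syllable.

    Code points outside the Hangul syllable block come back unchanged
    as a 1-tuple. (Hypothetical helper, named for illustration only.)
    """
    s_index = code_point - S_BASE
    if not 0 <= s_index < S_COUNT:
        return (code_point,)
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    t_index = s_index % T_COUNT
    if t_index == 0:
        # No trailing consonant: the syllable decomposes into two jamo.
        return (lead, vowel)
    return (lead, vowel, T_BASE + t_index)


def decompose_hangul(text):
    """Decompose every precomposed Hangul syllable in a string."""
    return "".join(
        chr(cp) for ch in text for cp in decompose_hangul_syllable(ord(ch))
    )
```

Note that this only covers the Hangul special case. Everything else in
a filename still has to go through the ordinary table-driven
decomposition.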