On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote: > Ernesto A. Fernández 於 2017-11-23 19:32 寫到: > >Hi: > > > >your issue seems to be in the decomposition of hangul characters, not in > >the recomposition before printing. The hfsplus module on linux is saving > >the name of your actor as AC F5 C7 20, without performing any > >decomposition at all. > > > >The reason your patch hides the bug is because it causes linux to present > >filenames as decomposed utf8, so it is not necessary to decompose again > >before working with them. But the issue is still there, and you will most > >likely run into trouble if you make a hangul filename in linux and try > >to work with it in MacOS. > > > >Reviewing the code it would seem that the developers completely forgot > >the hangul characters had their own rules for decomposition. It's weird > >because they did the composition part correctly. > > > >I've made a quick draft of a patch, mostly by copying the code provided > >in the unicode web. I don't think we can actually use it on a release, > >but it should be enough to check if I'm right. It works fine on linux, > >but I don't have a mac, so it would be great if you could test it for me. > > > >Thanks, > >Ernest > > > >(By the way, there is no reason you should have to use the nodecompose > >mount option, as the other reviewer suggested. Using that option will > >have a similar effect to that of your patch. It will hide the problem, > >but if you create a hangul filename on linux with that option you > >probably won't be able to use it on a mac.) > > > >--- > >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c > >index dfa90c2..9006c61 100644 > >--- a/fs/hfsplus/unicode.c > >+++ b/fs/hfsplus/unicode.c > >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block > >*sb, const char *astr, int len, > > return size; > > } > > > >-/* Decomposes a single unicode character. */ > >+/* Decomposes a single non-Hangul unicode character. */ > > static inline u16 *decompose_unichar(wchar_t uc, int *size) > > { > > int off; > >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int > >*size) > > return hfsplus_decompose_table + (off / 4); > > } > > > >+/* Decomposes a Hangul unicode character. */ > >+int decompose_hangul(wchar_t uc, u16 *result) > >+{ > >+ int index; > >+ int l, v, t; > >+ > >+ index = uc - Hangul_SBase; > >+ if (index < 0 || index >= Hangul_SCount) > >+ return 0; > >+ > >+ l = Hangul_LBase + index / Hangul_NCount; > >+ v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount; > >+ t = Hangul_TBase + index % Hangul_TCount; > >+ > >+ result[0] = l; > >+ result[1] = v; > >+ if (t != Hangul_TBase) { > >+ result[2] = t; > >+ return 3; > >+ } > >+ return 2; > >+} > >+ > > int hfsplus_asc2uni(struct super_block *sb, > > struct hfsplus_unistr *ustr, int max_unistr_len, > > const char *astr, int len) > >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb, > > int size, dsize, decompose; > > u16 *dstr, outlen = 0; > > wchar_t c; > >+ u16 hangul_buf[3]; > > > > decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags); > > while (outlen < max_unistr_len && len > 0) { > > size = asc2unichar(sb, astr, len, &c); > > > >- if (decompose) > >- dstr = decompose_unichar(c, &dsize); > >- else > >+ if (decompose) { > >+ /* Hangul is handled separately */ > >+ dstr = &hangul_buf[0]; > >+ dsize = decompose_hangul(c, dstr); > >+ if (dsize == 0) > >+ /* not Hangul */ > >+ dstr = decompose_unichar(c, &dsize); > >+ } else { > > dstr = NULL; > >+ } > >+ > > if (dstr) { > > if (outlen + dsize > max_unistr_len) > > break; > > Hi, > > Thank you for your explanation and your draft. > I test four versions of hfsplus module: > -------------------------------- > 1. origin linux > 2. apply my patch > 3. mount with nodecompose option > 4. apply your patch > -------------------------------- It seems you were very thorough, thank you. > There are the results of the test: > --------------------------------------------------------------------- > Linux can ls and cp the file from mac correctly with module 2,3,4. > Mac cannot cp the files correctly which touched by module 1,2,3. > Mac can cp the file correctly which touched by module with your patch. > Mac can ls the files which touch by all modules. > --------------------------------------------------------------------- > > It seems that everything goes well with your patch. The result is just > like you say, my patch or mount with nodecompose option cannot perform > normally. That's good to hear. I will submit a patch as soon as I can figure out the licensing issues. If you want I can credit you as reporter and tester. > > By the way, are the decompose procedures of function hfsplus_hash_dentry() > and hfsplus_compare_dentry() need to modify as well? You are correct, good thing you noticed. I'll add that when I resend the patch. > Thanks, > TCHou