Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
> Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
> >Hi:
> >
> >your issue seems to be in the decomposition of hangul characters, not in
> >the recomposition before printing. The hfsplus module on linux is saving
> >the name of your actor as AC F5 C7 20, without performing any
> >decomposition at all.
> >
> >The reason your patch hides the bug is because it causes linux to present
> >filenames as decomposed utf8, so it is not necessary to decompose again
> >before working with them. But the issue is still there, and you will most
> >likely run into trouble if you make a hangul filename in linux and try
> >to work with it in MacOS.
> >
> >Reviewing the code it would seem that the developers completely forgot
> >the hangul characters had their own rules for decomposition. It's weird
> >because they did the composition part correctly.
> >
> >I've made a quick draft of a patch, mostly by copying the code provided
> >in the unicode web. I don't think we can actually use it on a release,
> >but it should be enough to check if I'm right. It works fine on linux,
> >but I don't have a mac, so it would be great if you could test it for me.
> >
> >Thanks,
> >Ernest
> >
> >(By the way, there is no reason you should have to use the nodecompose
> >mount option, as the other reviewer suggested. Using that option will
> >have a similar effect to that of your patch. It will hide the problem,
> >but if you create a hangul filename on linux with that option you
> >probably won't be able to use it on a mac.)
> >
> >---
> >diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> >index dfa90c2..9006c61 100644
> >--- a/fs/hfsplus/unicode.c
> >+++ b/fs/hfsplus/unicode.c
> >@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
> >*sb, const char *astr, int len,
> > 	return size;
> > }
> >
> >-/* Decomposes a single unicode character. */
> >+/* Decomposes a single non-Hangul unicode character. */
> > static inline u16 *decompose_unichar(wchar_t uc, int *size)
> > {
> > 	int off;
> >@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
> >*size)
> > 	return hfsplus_decompose_table + (off / 4);
> > }
> >
> >+/* Decomposes a Hangul unicode character. */
> >+int decompose_hangul(wchar_t uc, u16 *result)
> >+{
> >+	int index;
> >+	int l, v, t;
> >+
> >+	index = uc - Hangul_SBase;
> >+	if (index < 0 || index >= Hangul_SCount)
> >+		return 0;
> >+
> >+	l = Hangul_LBase + index / Hangul_NCount;
> >+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> >+	t = Hangul_TBase + index % Hangul_TCount;
> >+
> >+	result[0] = l;
> >+	result[1] = v;
> >+	if (t != Hangul_TBase) {
> >+		result[2] = t;
> >+		return 3;
> >+	}
> >+	return 2;
> >+}
> >+
> > int hfsplus_asc2uni(struct super_block *sb,
> > 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> > 		    const char *astr, int len)
> >@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> > 	int size, dsize, decompose;
> > 	u16 *dstr, outlen = 0;
> > 	wchar_t c;
> >+	u16 hangul_buf[3];
> >
> > 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> > 	while (outlen < max_unistr_len && len > 0) {
> > 		size = asc2unichar(sb, astr, len, &c);
> >
> >-		if (decompose)
> >-			dstr = decompose_unichar(c, &dsize);
> >-		else
> >+		if (decompose) {
> >+			/* Hangul is handled separately */
> >+			dstr = &hangul_buf[0];
> >+			dsize = decompose_hangul(c, dstr);
> >+			if (dsize == 0)
> >+				/* not Hangul */
> >+				dstr = decompose_unichar(c, &dsize);
> >+		} else {
> > 			dstr = NULL;
> >+		}
> >+
> > 		if (dstr) {
> > 			if (outlen + dsize > max_unistr_len)
> > 				break;
> 
> Hi,
> 
> Thank you for your explanation and your draft.
> I test four versions of hfsplus module:
> --------------------------------
> 1. origin linux
> 2. apply my patch
> 3. mount with nodecompose option
> 4. apply your patch
> --------------------------------

It seems you were very thorough, thank you.
 
> There are the results of the test:
> ---------------------------------------------------------------------
> Linux can ls and cp the file from mac correctly with module 2,3,4.
> Mac cannot cp the files correctly which touched by module 1,2,3.
> Mac can cp the file correctly which touched by module with your patch.
> Mac can ls the files which touch by all modules.
> ---------------------------------------------------------------------
> 
> It seems that everything goes well with your patch. The result is just
> like you say, my patch or mount with nodecompose option cannot perform
> normally.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want I can credit you as reporter and tester.

> 
> By the way, are the decompose procedures of function hfsplus_hash_dentry()
> and hfsplus_compare_dentry() need to modify as well?

You are correct, good thing you noticed. I'll add that when I resend the
patch.

> Thanks,
> TCHou



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux