Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ernesto A. Fernández 於 2017-11-24 19:45 寫到:
On Fri, Nov 24, 2017 at 03:25:40PM +0800, tchou wrote:
Ernesto A. Fernández 於 2017-11-23 19:32 寫到:
>Hi:
>
>your issue seems to be in the decomposition of hangul characters, not in
>the recomposition before printing. The hfsplus module on linux is saving
>the name of your actor as AC F5 C7 20, without performing any
>decomposition at all.
>
>The reason your patch hides the bug is because it causes linux to present
>filenames as decomposed utf8, so it is not necessary to decompose again
>before working with them. But the issue is still there, and you will most
>likely run into trouble if you make a hangul filename in linux and try
>to work with it in MacOS.
>
>Reviewing the code it would seem that the developers completely forgot
>the hangul characters had their own rules for decomposition. It's weird
>because they did the composition part correctly.
>
>I've made a quick draft of a patch, mostly by copying the code provided
>in the unicode web. I don't think we can actually use it on a release,
>but it should be enough to check if I'm right. It works fine on linux,
>but I don't have a mac, so it would be great if you could test it for me.
>
>Thanks,
>Ernest
>
>(By the way, there is no reason you should have to use the nodecompose
>mount option, as the other reviewer suggested. Using that option will
>have a similar effect to that of your patch. It will hide the problem,
>but if you create a hangul filename on linux with that option you
>probably won't be able to use it on a mac.)
>
>---
>diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
>index dfa90c2..9006c61 100644
>--- a/fs/hfsplus/unicode.c
>+++ b/fs/hfsplus/unicode.c
>@@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block
>*sb, const char *astr, int len,
> 	return size;
> }
>
>-/* Decomposes a single unicode character. */
>+/* Decomposes a single non-Hangul unicode character. */
> static inline u16 *decompose_unichar(wchar_t uc, int *size)
> {
> 	int off;
>@@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int
>*size)
> 	return hfsplus_decompose_table + (off / 4);
> }
>
>+/* Decomposes a Hangul unicode character. */
>+int decompose_hangul(wchar_t uc, u16 *result)
>+{
>+	int index;
>+	int l, v, t;
>+
>+	index = uc - Hangul_SBase;
>+	if (index < 0 || index >= Hangul_SCount)
>+		return 0;
>+
>+	l = Hangul_LBase + index / Hangul_NCount;
>+	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
>+	t = Hangul_TBase + index % Hangul_TCount;
>+
>+	result[0] = l;
>+	result[1] = v;
>+	if (t != Hangul_TBase) {
>+		result[2] = t;
>+		return 3;
>+	}
>+	return 2;
>+}
>+
> int hfsplus_asc2uni(struct super_block *sb,
> 		    struct hfsplus_unistr *ustr, int max_unistr_len,
> 		    const char *astr, int len)
>@@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
> 	int size, dsize, decompose;
> 	u16 *dstr, outlen = 0;
> 	wchar_t c;
>+	u16 hangul_buf[3];
>
> 	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> 	while (outlen < max_unistr_len && len > 0) {
> 		size = asc2unichar(sb, astr, len, &c);
>
>-		if (decompose)
>-			dstr = decompose_unichar(c, &dsize);
>-		else
>+		if (decompose) {
>+			/* Hangul is handled separately */
>+			dstr = &hangul_buf[0];
>+			dsize = decompose_hangul(c, dstr);
>+			if (dsize == 0)
>+				/* not Hangul */
>+				dstr = decompose_unichar(c, &dsize);
>+		} else {
> 			dstr = NULL;
>+		}
>+
> 		if (dstr) {
> 			if (outlen + dsize > max_unistr_len)
> 				break;

Hi,

Thank you for your explanation and your draft.
I test four versions of hfsplus module:
--------------------------------
1. origin linux
2. apply my patch
3. mount with nodecompose option
4. apply your patch
--------------------------------

It seems you were very thorough, thank you.

There are the results of the test:
---------------------------------------------------------------------
Linux can ls and cp the file from mac correctly with module 2,3,4.
Mac cannot cp the files correctly which touched by module 1,2,3.
Mac can cp the file correctly which touched by module with your patch.
Mac can ls the files which touch by all modules.
---------------------------------------------------------------------

It seems that everything goes well with your patch. The result is just
like you say, my patch or mount with nodecompose option cannot perform
normally.

That's good to hear. I will submit a patch as soon as I can figure out
the licensing issues. If you want I can credit you as reporter and tester.

OK, thanks a lot.

By the way, are the decompose procedures of function hfsplus_hash_dentry()
and hfsplus_compare_dentry() need to modify as well?

You are correct, good thing you noticed. I'll add that when I resend the
patch.

Thanks,
TCHou




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux