Re: [PATCH] hfsplus: fix the bug that cannot recognize files with hangul file name

On Thu, 2017-11-23 at 08:32 -0300, Ernesto A. Fernández wrote:
> Hi:
> 
> Your issue seems to be in the decomposition of Hangul characters, not in
> the recomposition before printing. The hfsplus module on Linux is saving
> the name of your actor as AC F5 C7 20, without performing any
> decomposition at all.
> 
> Your patch hides the bug because it causes Linux to present filenames as
> decomposed UTF-8, so it is not necessary to decompose again before
> working with them. But the issue is still there, and you will most likely
> run into trouble if you make a Hangul filename in Linux and try to work
> with it in macOS.
> 
> Reviewing the code, it would seem that the developers completely forgot
> that Hangul characters have their own rules for decomposition. It's weird
> because they did the composition part correctly.
> 
> I've made a quick draft of a patch, mostly by copying the code provided
> in the unicode web. I don't think we can actually use it on a


Could you please share the link for "the unicode web"?

Thanks,
Vyacheslav Dubeyko.


> release, but it should be enough to check if I'm right. It works fine on
> Linux, but I don't have a Mac, so it would be great if you could test it
> for me.
> 
> Thanks,
> Ernest
> 
> (By the way, there is no reason you should have to use the nodecompose
> mount option, as the other reviewer suggested. Using that option will
> have a similar effect to that of your patch. It will hide the problem,
> but if you create a Hangul filename on Linux with that option you
> probably won't be able to use it on a Mac.)
> 
> ---
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..9006c61 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -272,7 +272,7 @@ static inline int asc2unichar(struct super_block *sb, const char *astr, int len,
>  	return size;
>  }
>  
> -/* Decomposes a single unicode character. */
> +/* Decomposes a single non-Hangul unicode character. */
>  static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  {
>  	int off;
> @@ -296,6 +296,29 @@ static inline u16 *decompose_unichar(wchar_t uc, int *size)
>  	return hfsplus_decompose_table + (off / 4);
>  }
>  
> +/* Decomposes a Hangul unicode character. */
> +int decompose_hangul(wchar_t uc, u16 *result)
> +{
> +	int index;
> +	int l, v, t;
> +
> +	index = uc - Hangul_SBase;
> +	if (index < 0 || index >= Hangul_SCount)
> +		return 0;
> +
> +	l = Hangul_LBase + index / Hangul_NCount;
> +	v = Hangul_VBase + (index % Hangul_NCount) / Hangul_TCount;
> +	t = Hangul_TBase + index % Hangul_TCount;
> +
> +	result[0] = l;
> +	result[1] = v;
> +	if (t != Hangul_TBase) {
> +		result[2] = t;
> +		return 3;
> +	}
> +	return 2;
> +}
> +
>  int hfsplus_asc2uni(struct super_block *sb,
>  		    struct hfsplus_unistr *ustr, int max_unistr_len,
>  		    const char *astr, int len)
> @@ -303,15 +326,23 @@ int hfsplus_asc2uni(struct super_block *sb,
>  	int size, dsize, decompose;
>  	u16 *dstr, outlen = 0;
>  	wchar_t c;
> +	u16 hangul_buf[3];
>  
>  	decompose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
>  	while (outlen < max_unistr_len && len > 0) {
>  		size = asc2unichar(sb, astr, len, &c);
>  
> -		if (decompose)
> -			dstr = decompose_unichar(c, &dsize);
> -		else
> +		if (decompose) {
> +			/* Hangul is handled separately */
> +			dstr = &hangul_buf[0];
> +			dsize = decompose_hangul(c, dstr);
> +			if (dsize == 0)
> +				/* not Hangul */
> +				dstr = decompose_unichar(c, &dsize);
> +		} else {
>  			dstr = NULL;
> +		}
> +
>  		if (dstr) {
>  			if (outlen + dsize > max_unistr_len)
>  				break;
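
For anyone who wants to check the arithmetic outside the kernel, here is a
minimal userspace sketch of the same Hangul syllable decomposition. It is
not the patch itself: hangul_decompose() and the upper-case constant names
are hypothetical, chosen for this example, and the numeric values are the
ones the Unicode Standard gives for the Hangul syllable algorithm rather
than anything taken from the hfsplus sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable algorithm. */
enum {
	SBASE = 0xAC00,			/* first precomposed syllable */
	LBASE = 0x1100,			/* first leading consonant */
	VBASE = 0x1161,			/* first vowel */
	TBASE = 0x11A7,			/* one before the first trailing consonant */
	LCOUNT = 19,
	VCOUNT = 21,
	TCOUNT = 28,
	NCOUNT = VCOUNT * TCOUNT,	/* 588 */
	SCOUNT = LCOUNT * NCOUNT	/* 11172 */
};

/*
 * Hypothetical helper: decompose one precomposed Hangul syllable into
 * its conjoining jamo. Returns the number of code units written to
 * out[] (2 or 3), or 0 if s is not a precomposed Hangul syllable.
 */
static int hangul_decompose(uint32_t s, uint16_t out[3])
{
	uint32_t index;

	assert(out != NULL);
	if (s < SBASE || s >= SBASE + SCOUNT)
		return 0;

	index = s - SBASE;
	out[0] = (uint16_t)(LBASE + index / NCOUNT);
	out[1] = (uint16_t)(VBASE + (index % NCOUNT) / TCOUNT);
	if (index % TCOUNT == 0)
		return 2;	/* no trailing consonant */
	out[2] = (uint16_t)(TBASE + index % TCOUNT);
	return 3;
}
```

Applied to the two code units quoted above, U+ACF5 decomposes to U+1100
U+1169 U+11BC and U+C720 decomposes to U+110B U+1172 (no trailing
consonant). Any code point outside the precomposed syllable block returns 0,
which is the case where the draft falls back to decompose_unichar().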


