On Fri, Jul 27, 2012 at 2:50 PM, Scott Lovenberg
<scott.lovenberg@xxxxxxxxx> wrote:
> On Fri, Jul 27, 2012 at 3:42 PM, Scott Lovenberg
> <scott.lovenberg@xxxxxxxxx> wrote:
>>
>>
>> On Fri, Jul 27, 2012 at 3:13 PM, Frediano Ziglio
>> <frediano.ziglio@xxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>   I'm currently trying to support utf-16 with characters not in plane 0.
>>>
>>> I've currently ended up with this patch. It is not against the latest
>>> kernel, but the problem is still present in the latest git kernel.
>>>
>>> wchar_t is currently 16 bits, so converting a utf8-encoded character not
>>> in plane 0 (>= 0x10000) to wchar_t (that is, calling char2uni) leads to a
>>> -EINVAL return. This patch detects utf8 in cifs_strtoUCS and adds special
>>> code that calls utf8_to_utf32 directly.
>>>
>>> Does it sound like a good patch or just a bad hack? Perhaps it would be
>>> better to change char2uni to convert to unicode_t (32 bits) instead of
>>> wchar_t, but a lot of code would probably have to be checked to make sure
>>> it does not lead to wrong conversions, overflows or other bad stuff.
>>>
>>> Is it worth working in this hacking way? I'd like to upstream this
>>> patch.

Terminology is confusing. Refreshing my memory by looking at
http://en.wikipedia.org/wiki/Universal_Character_Set
are we talking about UTF-16 vs. UCS-2 (i.e. cases where a pair of
16-bit Unicode characters is interpreted as one)? IIRC there are
a few languages where this helps, at least since Windows XP, when
apparently it became more common, and we have to support this in
the kernel.

> Just my $0.02, but there are a lot of magic numbers in this patch

Agreed. The checks against 3F and against the maximum Unicode value
should use #defined values, which are easier to read.

-- 
Thanks,

Steve
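
For illustration only, the UTF-16 vs. UCS-2 distinction and the named-constant suggestion discussed above can be sketched in plain C as follows. This is not the patch under review and not the kernel's NLS code: the macro names and the helper utf32_to_utf16_units() are hypothetical, chosen just to show how a code point outside plane 0 becomes a pair of 16-bit units without bare hex literals in the logic.

```c
#include <stdint.h>

/* Hypothetical names for illustration; none of these come from the patch. */
#define UNI_PLANE0_MAX      0xFFFFu    /* last code point in plane 0 */
#define UNI_SUPPL_BASE      0x10000u   /* first code point outside plane 0 */
#define UNI_CODEPOINT_MAX   0x10FFFFu  /* largest valid Unicode code point */
#define UNI_SUR_HIGH_START  0xD800u    /* first high (leading) surrogate */
#define UNI_SUR_LOW_START   0xDC00u    /* first low (trailing) surrogate */
#define UNI_SUR_LOW_END     0xDFFFu    /* last low (trailing) surrogate */
#define UNI_SUR_SHIFT       10         /* payload bits per surrogate */
#define UNI_SUR_MASK        0x3FFu     /* mask for the low 10 payload bits */

/*
 * Encode one UTF-32 code point as UTF-16.
 * Returns the number of 16-bit units written to out (1 or 2),
 * or -1 if cp is not a valid Unicode scalar value.
 */
static int utf32_to_utf16_units(uint32_t cp, uint16_t out[2])
{
	if (cp > UNI_CODEPOINT_MAX)
		return -1;
	/* Surrogate values are reserved and cannot be encoded themselves. */
	if (cp >= UNI_SUR_HIGH_START && cp <= UNI_SUR_LOW_END)
		return -1;

	if (cp <= UNI_PLANE0_MAX) {
		/* Plane 0: one unit, identical to UCS-2. */
		out[0] = (uint16_t)cp;
		return 1;
	}

	/* Outside plane 0: split the 20-bit offset into two surrogates. */
	cp -= UNI_SUPPL_BASE;
	out[0] = (uint16_t)(UNI_SUR_HIGH_START + (cp >> UNI_SUR_SHIFT));
	out[1] = (uint16_t)(UNI_SUR_LOW_START + (cp & UNI_SUR_MASK));
	return 2;
}
```

With this sketch, U+0041 yields the single unit 0x0041, while U+1F600 yields the pair 0xD83D 0xDE00. A 16-bit wchar_t can hold only the first case, which is why the conversion described in the quoted message fails for characters outside plane 0.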