On Fri, Apr 24, 2009 at 10:10:46AM +0200, Clemens Ladisch wrote:
> Alan Stern wrote:
> > It is feasible, but there are a couple of things to watch out for:
> >
> >     With latin-1 encoding we know that each character occupies
> >     only one byte; therefore any descriptor string will fit into a
> >     128-byte buffer (since the total descriptor length can't be
> >     larger than 255).  But with UTF-8 encoding, a character can
> >     occupy more than one byte.  Hence the callers may need to
> >     allocate larger buffers than they do now.  For instance, you
> >     would definitely want to change usb_cache_string().
>
> That one is the only caller of usb_string() in the kernel that uses a
> buffer larger than 64 bytes, so I didn't bother about the others.
>
> >     Translation from UTF-16LE to latin-1 is easy.  Translation
> >     to UTF-8 is harder because it requires you to check for
> >     invalid code points.  Furthermore, if you write your own code
> >     to do the translation then you are almost certainly duplicating
> >     code that already exists somewhere else in the kernel, which is
> >     a bad idea.
>
> The only existing code I've found is utf8_wcstombs(), and it doesn't
> bother about invalid code points.
>
> I've included the NLS patches here because there doesn't seem to be an
> NLS maintainer, and you wouldn't want to use the USB patch without those
> fixes.
>
> Not much tested, because I don't have a USB device with non-ASCII
> strings.  And I'm not quite sure how applications will handle the
> encoding change ...

Hm, I have a device with an extended ascii string:

    $ cat /sys/kernel/debug/usb/devices | grep Track
    S:  Product=Microsoft Trackball Optical�

so I'll try them out.

thanks,

greg k-h
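
For readers following the thread, the two concerns Alan raises (buffer size and invalid code points) can be illustrated with a small userspace sketch. This is not the code from the patches under discussion and not a kernel API; the function name utf16le_to_utf8() and its error convention are assumptions made for illustration only. On sizing: a descriptor of at most 255 bytes has a 2-byte header, leaving at most 126 UTF-16 code units. In latin-1 that is 126 bytes plus a terminator, which fits in 128 bytes; in UTF-8 a code unit can expand to 3 bytes, so the worst case is 378 bytes plus a terminator.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not a kernel function):
 * convert a UTF-16LE byte sequence to NUL-terminated UTF-8.
 *
 * Returns the number of UTF-8 bytes written, excluding the NUL,
 * or -1 if the input contains an unpaired surrogate or the output
 * buffer is too small. A trailing odd input byte is ignored.
 */
static int utf16le_to_utf8(const uint8_t *in, size_t in_len,
                           char *out, size_t out_size)
{
    size_t i = 0, o = 0;

    if (out_size == 0)
        return -1;

    while (i + 1 < in_len) {
        uint32_t cp = in[i] | ((uint32_t)in[i + 1] << 8);
        size_t need;

        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo;

            if (i + 1 >= in_len)
                return -1;
            lo = in[i] | ((uint32_t)in[i + 1] << 8);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            return -1;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_size)       /* keep one byte for the NUL */
            return -1;

        if (cp < 0x80) {
            out[o++] = (char)cp;
        } else if (cp < 0x800) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    out[o] = '\0';
    return (int)o;
}
```

Under these assumptions, a single latin-1 range character such as U+00AE becomes two UTF-8 bytes, which is why a caller that sized its buffer for one byte per character may need more room after the encoding change.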