Alan Stern wrote:
On Mon, 20 Apr 2009, Eugen Dedu wrote:
I now see the corresponding function at
http://lxr.linux.no/linux+v2.6.29/drivers/usb/core/message.c#L761. I
looked at several places where it is used, and so far it seems that
changing it to UTF-8 has no negative impact on the other code. Do you
think it is feasible (easy) to actually make this change?
It is feasible, but there are a couple of things to watch out for:

    With latin-1 encoding we know that each character occupies
    only one byte; therefore any descriptor string will fit into a
    128-byte buffer (since the total descriptor length can't be
    larger than 255). But with UTF-8 encoding, a character can
    occupy more than one byte. Hence the callers may need to
    allocate larger buffers than they do now. For instance, you
    would definitely want to change usb_cache_string().

    Translation from UTF-16LE to latin-1 is easy. Translation
    to UTF-8 is harder because it requires you to check for
    invalid code points. Furthermore, if you write your own code
    to do the translation then you are almost certainly duplicating
    code that already exists somewhere else in the kernel, which is
    a bad idea.

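To make the two points above concrete, here is a rough userspace sketch in C. It is an illustration only, not kernel code and not a proposed patch: the helper name utf16le_to_utf8() and both macros are hypothetical, and replacing unpaired surrogates with '?' is an assumed policy. A descriptor of at most 255 bytes carries a 2-byte header, so at most 126 UTF-16 code units; each unit expands to at most 3 UTF-8 bytes (a surrogate pair is 2 units and only 4 bytes), giving a worst case of 379 bytes including the terminating NUL, compared with 127 for latin-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: descriptor length <= 255 bytes, of which 2 are header. */
#define MAX_UTF16_UNITS ((255 - 2) / 2)            /* 126 code units */
/* Worst case is 3 UTF-8 bytes per code unit, plus the NUL. */
#define MAX_UTF8_BYTES  (MAX_UTF16_UNITS * 3 + 1)  /* 379 bytes */

/* Read the i-th little-endian 16-bit code unit from a byte buffer. */
static uint32_t get_unit(const uint8_t *src, size_t i)
{
	return (uint32_t)src[2 * i] | ((uint32_t)src[2 * i + 1] << 8);
}

/*
 * Hypothetical helper (not a kernel API): convert 'units' UTF-16LE code
 * units to NUL-terminated UTF-8. Unpaired surrogates become '?'.
 * Output is truncated on a character boundary if dst is too small.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t utf16le_to_utf8(const uint8_t *src, size_t units,
			      char *dst, size_t dstsize)
{
	size_t i = 0, out = 0;

	assert(src != NULL && dst != NULL && dstsize > 0);

	while (i < units) {
		uint32_t cp = get_unit(src, i++);
		size_t need;

		if (cp >= 0xD800 && cp <= 0xDBFF) {
			/* High surrogate: must be followed by a low one. */
			uint32_t lo = (i < units) ? get_unit(src, i) : 0;

			if (lo >= 0xDC00 && lo <= 0xDFFF) {
				cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
				i++;
			} else {
				cp = '?';
			}
		} else if (cp >= 0xDC00 && cp <= 0xDFFF) {
			cp = '?';	/* low surrogate with no high one */
		}

		need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
		if (out + need >= dstsize)	/* keep room for the NUL */
			break;

		if (need == 1) {
			dst[out++] = (char)cp;
		} else if (need == 2) {
			dst[out++] = (char)(0xC0 | (cp >> 6));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		} else if (need == 3) {
			dst[out++] = (char)(0xE0 | (cp >> 12));
			dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		} else {
			dst[out++] = (char)(0xF0 | (cp >> 18));
			dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
			dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		}
	}
	dst[out] = '\0';
	return out;
}
```

A caller would declare something like "char buf[MAX_UTF8_BYTES];" and pass sizeof(buf) as dstsize. As noted above, a real change should reuse whatever conversion code the kernel already provides rather than add a copy like this one.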
Thanks for this useful information. Unfortunately, after reading the
code more carefully, I think this would be very difficult for me, as I
have never modified the kernel before (and I am short of time right
now). Maybe later I will work on a patch.
I haven't seen code for this translation. For what it's worth, I noticed
two places of interest:
http://lxr.linux.no/linux+v2.6.29/fs/nls/nls_base.c#L35
and
http://lxr.linux.no/linux+v2.6.29/fs/nls/nls_base.c#L106
This modification sounds simple, but the function is called, directly
or indirectly, from a great many places.
--
Eugen