Re: latin-1 encoding

On Mon, 20 Apr 2009, Eugen Dedu wrote:

> I now see the corresponding function, at
> http://lxr.linux.no/linux+v2.6.29/drivers/usb/core/message.c#L761.  I
> looked at several places where it is used, and so far it seems that
> changing it to UTF-8 has no negative impact on the other code.  Do you
> think that it is feasible (easy) to actually make this change?

It is feasible, but there are a couple of things to watch out for:

	With latin-1 encoding we know that each character occupies
	only one byte; therefore any descriptor string will fit into a
	128-byte buffer (the total descriptor length can't be larger
	than 255 bytes, two of which are the header, so there are at
	most 126 UTF-16 code units).  But with UTF-8 encoding, a
	character can occupy more than one byte: up to three bytes per
	UTF-16 code unit.  Hence the callers may need to allocate
	larger buffers than they do now.  For instance, you would
	definitely want to change usb_cache_string().
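
To put rough numbers on the buffer-size point, here is a small sketch in
plain userspace C (not kernel code).  The constant and function names are
made up for illustration.  It assumes only that the descriptor length field
is a single byte, that a string descriptor starts with a 2-byte header, and
that one UTF-16 code unit never needs more than 3 bytes in UTF-8 (a
surrogate pair is two code units and encodes to 4 bytes).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names only; these are not kernel identifiers. */
enum {
	DESC_MAX_LEN    = 255,	/* length field is one byte */
	DESC_HEADER_LEN = 2,	/* length byte + descriptor type byte */
	/* Remaining payload, counted in 16-bit code units: 253 / 2 = 126 */
	MAX_UTF16_UNITS = (DESC_MAX_LEN - DESC_HEADER_LEN) / 2
};

/* Latin-1 output: one byte per code unit, plus the terminating NUL. */
static size_t latin1_buf_size(size_t units)
{
	return units + 1;
}

/* UTF-8 output, worst case: three bytes per code unit, plus the NUL. */
static size_t utf8_buf_size(size_t units)
{
	return 3 * units + 1;
}

/* The existing 128-byte buffers are enough for latin-1 ... */
static_assert(MAX_UTF16_UNITS + 1 <= 128, "latin-1 fits in 128 bytes");
/* ... but not for the UTF-8 worst case. */
static_assert(3 * MAX_UTF16_UNITS + 1 > 128, "UTF-8 may need more");
```

Under these assumptions the worst case grows from 127 bytes to 379 bytes,
so any caller with a fixed 128-byte buffer would need to be revisited.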

	Translation from UTF-16LE to latin-1 is easy.  Translation
	to UTF-8 is harder because it requires you to check for
	invalid code points, such as unpaired surrogates.  Furthermore,
	if you write your own code to do the translation then you are
	almost certainly duplicating code that already exists somewhere
	else in the kernel, which is a bad idea.
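
To show what the extra checking amounts to, here is a minimal userspace
sketch of a UTF-16LE to UTF-8 converter.  It is an illustration only: the
function name, signature and return convention are hypothetical, and in the
kernel you would call an existing helper rather than add something like
this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one little-endian 16-bit code unit. */
static uint32_t get_le16(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

/*
 * Hypothetical helper: convert src_len bytes of UTF-16LE into a
 * NUL-terminated UTF-8 string in dst.  Returns the number of bytes
 * written (not counting the NUL), or -1 on an odd input length, an
 * unpaired surrogate, or insufficient space in dst.
 */
static int utf16le_to_utf8(const uint8_t *src, size_t src_len,
			   char *dst, size_t dst_size)
{
	size_t i = 0, out = 0;

	assert(src != NULL && dst != NULL);
	if (src_len % 2 != 0 || dst_size == 0)
		return -1;

	while (i < src_len) {
		uint32_t cp = get_le16(src + i);
		size_t need;

		i += 2;
		if (cp >= 0xD800 && cp <= 0xDBFF) {
			/* High surrogate: a low surrogate must follow. */
			uint32_t lo;

			if (i >= src_len)
				return -1;
			lo = get_le16(src + i);
			if (lo < 0xDC00 || lo > 0xDFFF)
				return -1;
			i += 2;
			cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
		} else if (cp >= 0xDC00 && cp <= 0xDFFF) {
			/* Low surrogate with no preceding high surrogate. */
			return -1;
		}

		need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
		/* Always leave room for the terminating NUL. */
		if (need + 1 > dst_size - out)
			return -1;

		if (need == 1) {
			dst[out++] = (char)cp;
		} else if (need == 2) {
			dst[out++] = (char)(0xC0 | (cp >> 6));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		} else if (need == 3) {
			dst[out++] = (char)(0xE0 | (cp >> 12));
			dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		} else {
			dst[out++] = (char)(0xF0 | (cp >> 18));
			dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
			dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
			dst[out++] = (char)(0x80 | (cp & 0x3F));
		}
	}

	dst[out] = '\0';
	return (int)out;
}
```

The latin-1 case needs none of the surrogate handling or the
variable-length output, which is why it is so much simpler.  This sketch
rejects malformed input outright; substituting a replacement character
would be another reasonable policy.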

Alan Stern
