vdr-1.3.27 and UTF-8

Klaus.Schmidinger at cadsoft.de (Klaus Schmidinger) · Wed Jul 20 18:44:39 2005

Sergei Haller wrote:
> On Wed, 20 Jul 2005, Ludwig Nussel (LN) wrote:
> 
> 
>>Klaus Schmidinger wrote:
>>
>>>[...]
>>>To me, a character is an entity that's always the same size (preferably
>>>one byte). UTF-8 breaks with this, so if you have a string that has,
>>>e.g. a strlen() of 10, you can't be sure that this will be really 10
>>>printing characters because there might be some "escaped" characters.
> 
> 
> I think the confusion comes from the assumption that a character is 
> exactly one byte long.
> 
> strlen counts bytes not characters. 
> 
> in utf-8 a character can be up to 4 (or was it 8) bytes long.
> 
> IIRC, there are new functions to count characters (wstrlen, wstrcmp, etc.)

Aren't you confusing this with "wide character" functions?

Klaus
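
To illustrate the distinction discussed in this thread: strlen() reports the
number of bytes in a string, while the standard wide-character functions
(wcslen(), wcscmp(), and so on) operate on wchar_t arrays rather than on
UTF-8 byte strings. The following is only a minimal sketch of one way to
count characters directly in a UTF-8 string. The helper name utf8_count() is
hypothetical (it is not part of VDR or the C library), and the code assumes
the input is valid UTF-8. It also counts code points, which is not always the
same as the number of printing characters on screen (combining marks, for
instance, are counted separately).

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of VDR or libc.
 * Counts UTF-8 code points by skipping continuation bytes, which always
 * have the bit pattern 10xxxxxx. Assumes the input is valid UTF-8.
 */
static size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Every byte that is not a continuation byte starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the German word "Grüße" encoded as UTF-8 (written in C as
"Gr\xc3\xbc\xc3\x9f" "e"), strlen() returns 7 because "ü" and "ß" each occupy
two bytes, while utf8_count() returns 5.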