vdr-1.3.27 and UTF-8

On Wed, 20 Jul 2005, Ludwig Nussel (LN) wrote:

> Klaus Schmidinger wrote:
> > [...]
> > To me, a character is an entity that's always the same size (preferably
> > one byte). UTF-8 breaks with this, so if you have a string that has,
> > e.g. a strlen() of 10, you can't be sure that this will be really 10
> > printing characters because there might be some "escaped" characters.

I think the confusion comes from the assumption that a character is 
exactly one byte long.

strlen() counts bytes, not characters.

In UTF-8 a character can be up to 4 bytes long (the original design
allowed up to 6, but RFC 3629 limits it to 4).

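IIRC, there are wide-character functions to count and compare characters
(wcslen, wcscmp, etc.), but they work on wchar_t strings, so a UTF-8
string has to be converted first (e.g. with mbstowcs).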
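
To illustrate the byte/character difference, here is a minimal sketch that
counts characters (code points) in a UTF-8 string without any locale setup.
The name utf8_count is made up for this example and is not part of vdr or
libc, and the function assumes its input is valid UTF-8. It relies on the
fact that every byte of a multi-byte sequence except the first has the bit
pattern 10xxxxxx.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count UTF-8 code points in a NUL-terminated
 * string. Assumes the input is valid UTF-8; continuation bytes
 * (10xxxxxx) are skipped, every other byte starts a new character. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For "\xC3\xA4" "bc" (a-umlaut followed by "bc"), strlen() gives 4 while
utf8_count() gives 3. Note that this counts code points, which is still not
necessarily the number of columns a string takes up on screen.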

c ya
        Sergei
-- 
--------------------------------------------------------------------  -?)
         eMail:       Sergei.Haller@xxxxxxxxxxxxxxxxxxx               /\\
-------------------------------------------------------------------- _\_V
Be careful of reading health books, you might die of a misprint.
                -- Mark Twain

