Re: Checking how many letters are in a string.

tedd <tedd.sperling@xxxxxxxxx> · Wed, 19 Mar 2008 19:28:33 -0400

At 9:29 PM +0200 3/19/08, Dotan Cohen wrote:
> I am asking the second question: how many Hebrew characters in a
> string that _very_likely_ contains other characters as well. The array
> suggestion sounds about what I am doing: checking if each letter is a
> Hebrew character.
>
> I will also look into the mb_ functions. I did not know about them
> before. Thanks.
>
> Dotan Cohen

Dotan:

It really doesn't make any difference.

If even one character in the string is not ASCII, then you're beyond 
ASCII and you'll need to use the mb_ functions.

Unicode assigns a code point to every character it covers, and its 
first 128 code points are identical to ASCII -- so there's no 
conflict between Unicode and ASCII.

The beyond-ASCII string problem is basically: what is a character? We 
all know what an "a" is, but what about "a" with a "~" above it? Is 
it one character or two? It can be stored as one precomposed code 
point or as two code points (the "a" plus a combining tilde), but 
either way the reader sees one grapheme.
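
As a rough sketch of the "a" with a "~" case, the snippet below
compares the two ways of storing it. It assumes PHP 7 or later (for
the "\u{...}" escape) and that the mbstring extension is loaded; the
variable names are only for illustration.

```php
<?php
// One precomposed code point: U+00E3 LATIN SMALL LETTER A WITH TILDE.
$precomposed = "\u{00E3}";
// Two code points: "a" followed by U+0303 COMBINING TILDE.
$combined = "a\u{0303}";

$counts = [
    // strlen() counts bytes of the UTF-8 encoding.
    'precomposed_bytes'  => strlen($precomposed),
    'combined_bytes'     => strlen($combined),
    // mb_strlen() counts code points, not graphemes.
    'precomposed_points' => mb_strlen($precomposed, 'UTF-8'),
    'combined_points'    => mb_strlen($combined, 'UTF-8'),
];

print_r($counts);
```

Both strings look the same on screen, yet the byte counts (2 and 3)
and the code point counts (1 and 2) differ.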

What about the character "fi" when it's combined? Is it one character 
or two? In this case, it's a ligature and is a single code point.
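
The ligature case can be checked the same way. This is only a sketch,
again assuming PHP 7 or later and the mbstring extension; U+FB01 is
the code point of the "fi" ligature.

```php
<?php
$ligature = "\u{FB01}"; // the single-code-point "fi" ligature
$plain    = "fi";       // two separate ASCII letters

// mb_strlen() counts code points in each string.
$ligaturePoints = mb_strlen($ligature, 'UTF-8');
$plainPoints    = mb_strlen($plain, 'UTF-8');

// strlen() counts bytes; the ligature takes three bytes in UTF-8.
$ligatureBytes = strlen($ligature);

echo "$ligaturePoints $plainPoints $ligatureBytes\n";
```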

So, when you are trying to count characters in a string, the standard 
byte-based string functions won't work because they might count one 
character as two or more and break the character into parts. The 
mb_ functions count code points instead of bytes, which takes care of 
multibyte characters. Note that a character made of two code points 
is still counted as two, even though it should be counted as one.
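
For the original question (how many Hebrew characters are in a mixed
string), one possible approach is sketched below. It assumes the
string is UTF-8, PHP 7 or later, the mbstring extension, and a PCRE
build with Unicode property support; the sample text is made up.

```php
<?php
// Hypothetical sample: four Hebrew letters followed by ASCII text.
$text = "\u{05E9}\u{05DC}\u{05D5}\u{05DD} world 123";

$bytes  = strlen($text);             // bytes in the UTF-8 encoding
$points = mb_strlen($text, 'UTF-8'); // code points

// \p{Hebrew} matches code points in the Hebrew script; the /u
// modifier makes PCRE treat pattern and subject as UTF-8.
$hebrew = preg_match_all('/\p{Hebrew}/u', $text, $matches);

echo "bytes=$bytes points=$points hebrew=$hebrew\n";
```

As far as I know, \p{Hebrew} also matches Hebrew vowel points, so to
count letters only a range such as [\x{05D0}-\x{05EA}] may be a
better fit.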

The easy way to tell IF you should use the mb_ functions: if all the 
characters you're working with appear in the ASCII table, then the 
standard string functions apply. However, if any of the characters 
are not found in ASCII, then you need to go another route.
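
That test could be sketched as follows. The function name
is_ascii_only() is hypothetical, not a built-in, and the type
declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: true when every byte is in the 7-bit ASCII
// range (0x00-0x7F). Without the /u modifier the pattern is matched
// byte by byte, so any byte of a multibyte character triggers it.
function is_ascii_only(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

var_dump(is_ascii_only("hello world"));
var_dump(is_ascii_only("hello \u{05E9}"));
```

If it returns false, that is the cue to switch to the mb_ functions.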

At least, that's my understanding.

Cheers,

tedd

--
-------
http://sperling.com  http://ancientstones.com  http://earthstones.com

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php