tedd wrote:
> For example, the Unicode issue was raised during this discussion -- if
> php doesn't consider the numeric relationship of characters, then I see
> a big problem waiting in the wings. Because if we're having these types
> of discussions with just considering 00-7F characters, then I can only
> guess at what's going to happen when we start considering 000000-FFFFFF
> code-points.
>
> Now, was that enough said? :-)

I don't think you really understand this. < and > are collation
operators when they operate on strings. They have absolutely nothing to
do with the numeric values of the characters. It just so happens that
in English iso-8859-1 there is a 1:1 relationship between the numeric
values and the collation order, but you can think of that as dumb luck.
To better understand this, I suggest you start reading here:
http://icu.sourceforge.net/userguide/Collate_Intro.html
Note one of the points on that page: in Lithuanian, 'y' falls between
'i' and 'k'. So even without going into Unicode and just using
low ASCII, you have these issues.
Now, until we get to PHP 6, we don't have decent Unicode support and we
don't have locale-aware operators. For now you have to call strcoll()
manually to get locale-aware comparison. That is going to change: you
will have the ICU collation algorithms available, and for Unicode
strings collation will be automatic. You can still use binary strings
if you don't want locale-aware collation, of course.
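
A minimal sketch of the manual strcoll() route, under some assumptions:
the locale names 'lt_LT.UTF-8' and 'lt_LT' are guesses that vary by
platform and may not be installed on your system, and the word list is
made up for illustration. It sorts the same words once by byte value
and once with the current locale's collation:

```php
<?php
// Hypothetical sample words, chosen to start with 'i', 'k' and 'y'.
$words = array('kiwi', 'yak', 'ibis');

// Byte-value ordering: strcmp() compares the raw bytes.
$bytewise = $words;
usort($bytewise, 'strcmp');

// Locale-aware ordering: strcoll() uses the LC_COLLATE locale.
// The locale names are assumptions; setlocale() returns false
// if none of them is available.
$locale = setlocale(LC_COLLATE, 'lt_LT.UTF-8', 'lt_LT');
$collated = $words;
usort($collated, 'strcoll');

echo "bytewise: ", implode(' ', $bytewise), "\n";
if ($locale === false) {
    echo "Lithuanian locale not available; strcoll() used the current locale\n";
}
echo "collated: ", implode(' ', $collated), "\n";
```

The bytewise line always comes out as "ibis kiwi yak". What the collated
line shows depends on which locale setlocale() managed to select, which
is why the return value is checked.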
-Rasmus
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php