Re: Re: Help: Validate Domain Name by Regular Express

At 12:23 PM -0500 1/9/11, Daniel Brown wrote:
On Sun, Jan 9, 2011 at 11:58, tedd <tedd.sperling@xxxxxxxxx> wrote:

 For example --

 http://xn--19g.com

 -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
 in the address bar, but in Safari it's shown as ?.com

    Not sure if that's a typo or an issue in translation while the
email was being relayed through the tubes, but ?.com directs to
xn--wqa.com here.

--
</Daniel P. Brown>

Daniel et al:

Translation of Unicode characters by various software programs is unpredictable -- this includes email applications.

While I can send/receive √ (square root) through my email program (Eudora), what your email program displays to you can be (as shown) something completely different. The mapping of code points (e.g., the one for square root) to what your program displays (much like a web site) depends upon how your email program works. If your email program uses the correct character set and maps the code point to what was actually received, then the character will be displayed correctly. If not, then things like ?.com happen.
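To make the mapping problem concrete, here is a small Python sketch (not PHP, only because it keeps the example short). The choice of UTF-8 on the sending side and Windows-1252 on the receiving side is purely an assumption for illustration; nothing in this thread establishes what any particular mail program actually did.

```python
# Illustration only: the charsets below are assumed, not taken from any real mail client.
original = "\u221a.com"  # SQUARE ROOT sign followed by ".com"

# Sender encodes the text as UTF-8 (assumed).
wire_bytes = original.encode("utf-8")

# A receiver that applies the same charset recovers the character.
decoded_correctly = wire_bytes.decode("utf-8")

# A receiver that assumes a single-byte charset shows three unrelated characters.
decoded_wrongly = wire_bytes.decode("cp1252")

# A receiver (or archive) limited to ASCII substitutes a question mark.
ascii_only = original.encode("ascii", errors="replace").decode("ascii")

print(repr(decoded_correctly))
print(repr(decoded_wrongly))
print(repr(ascii_only))  # '?.com'
```

The bytes on the wire are identical in all three cases; only the receiver's assumption about the character set differs.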

Unfortunately, this mapping problem has not been of great importance for most applications. As it is now, most applications work for English-speaking people and that seems good enough, or so many manufacturers think. However, as the "rest of the world" starts using applications (and logging on to the net), it will obviously become more advantageous for manufacturers to make their software work correctly for languages other than English. Apple is doing that, and last year the majority of their income came from overseas (i.e., from outside the USA).

The mapping of non-English characters was the problem addressed by the IDNS WG, where I added my minor contribution circa 2000. Unfortunately, homograph issues were not resolved by the WG. However, a solution was proposed (which I called the "Fruit-loop" solution): color-code (flag) the characters in the address bar of a browser IF the URL contained a mixed character set. Unfortunately, that solution was not pursued, and instead browser manufacturers chose to show raw PUNYCODE, which was never intended to be seen by end users. A giant step backwards IMO.
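For anyone who wants to experiment, below is a rough Python sketch of the two ideas above: flagging a host name whose label mixes scripts, and converting a label to and from its PUNYCODE form. The function names are hypothetical, and the script check is only a heuristic of mine (it takes the first word of each letter's Unicode name); it is not how any browser decides, and it skips the normalization steps that full IDNA processing requires.

```python
import unicodedata


def scripts_in_label(label):
    """Guess the scripts used in a label from the first word of each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # ignore digits, hyphens and symbols
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC"
    return scripts


def is_mixed(hostname):
    """True if any dot-separated label appears to mix more than one script."""
    return any(len(scripts_in_label(part)) > 1 for part in hostname.split("."))


def label_to_ace(label):
    """Encode one label to its xn-- form (no nameprep or other IDNA mapping is applied)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def ace_to_label(label):
    """Decode one xn-- label back to Unicode; other labels are returned unchanged."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


print(label_to_ace("\u221a"))      # xn--19g
print(is_mixed("paypal.com"))      # False
print(is_mixed("p\u0430ypal.com"))  # True: the second letter is CYRILLIC SMALL LETTER A
```

A browser following the color-coding idea could call something like is_mixed() on the host and highlight the odd characters, rather than falling back to the xn-- form for everything.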

Cheers,

tedd

--
-------
http://sperling.com/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



