Who is responsible for NFD or NFC formatted UTF-8 text? PHP, my application or the system administrator?

Hi, all

Yesterday I ran into a big issue I didn't know about before:

Unicode allows more than one way to store the same character, so the
same visible text can end up as different UTF-8 byte sequences. This
applies to every character that can be composed from other characters.
An example is the German umlaut ö. It can be stored as the single
precomposed code point ö (U+00F6), or as a plain o (U+006F) followed by
the combining diaeresis ¨ (U+0308).
I raised a question on Stack Overflow about that and got tons of helpful
information.
http://stackoverflow.com/questions/12147410/different-utf-8-signature-for-same-diacritics-umlauts-2-binary-ways-to-write
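
To make the two ways of storing ö visible, here is a minimal sketch. It
is written in Python purely for illustration, using only the standard
unicodedata module; the variable names are my own. As far as I know, the
Normalizer class from PHP's intl extension does the same job in PHP.

```python
import unicodedata

composed = "\u00f6"      # ö as one precomposed code point (the NFC form)
decomposed = "o\u0308"   # o followed by COMBINING DIAERESIS (the NFD form)

# Both strings render as ö, but they are not equal as code point sequences,
# and their UTF-8 encodings differ in content and in length.
composed_bytes = composed.encode("utf-8")      # 2 bytes: c3 b6
decomposed_bytes = decomposed.encode("utf-8")  # 3 bytes: 6f cc 88
print(composed == decomposed)
print(composed_bytes.hex(), decomposed_bytes.hex())

# Normalizing converts one form into the other.
to_nfc = unicodedata.normalize("NFC", decomposed)
to_nfd = unicodedata.normalize("NFD", composed)
print(to_nfc == composed, to_nfd == decomposed)
```

So a plain byte-wise or string comparison treats the two spellings as
different text unless both sides are normalized to the same form first.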

If you don't know what NFD, NFC and the other normalization forms are,
take the time to read this article http://www.unicode.org/reports/tr15/
or at least take a look at figures 3-6.

As you can read there, I moved a website from a Mac OS X server to a
Linux server. During the move the filenames got converted from NFD to NFC.
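
A mismatch like this could be checked for after a move. The following is
only a rough sketch under my own assumptions: the function names and the
example filenames are made up, it works on a plain list of names instead
of a real directory, and Python stands in for whatever language the
application uses.

```python
import unicodedata

def names_not_in_form(names, form="NFC"):
    """Return the names that change when normalized to the given form."""
    return [n for n in names if unicodedata.normalize(form, n) != n]

def same_name(a, b):
    """Compare two names while ignoring their normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Hypothetical example: the same filename as stored in NFC and in NFD.
nfc_name = "sch\u00f6n.jpg"
nfd_name = "scho\u0308n.jpg"

print(nfc_name == nfd_name)           # plain comparison sees two names
print(same_name(nfc_name, nfd_name))  # normalized comparison sees one
print(names_not_in_form([nfc_name, nfd_name, "plain.txt"]) == [nfd_name])
```

The idea is to normalize both the stored reference (for example a
filename kept in a database) and the name on disk to one form before
comparing them, so it no longer matters which form the filesystem uses.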

Now my questions are:
Is this a common issue?
What can I do to prevent it in the future?
Who is responsible for taking care of that?
I myself, WordPress, the system I use, or I as the
system administrator moving the website?

For example, I don't know whether Windows converts every filename to
NFC, but Mac OS X (using HFS+) forces filenames to be NFD compliant.

Bye
Simon

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



