Re: Smart Quotes not so smart

On Monday 20 November 2006 18:19, Richard Lynch wrote:

> You are correct.  They are not "real" UTF-8 nor UTF-16 characters.

> Catch the data somewhere in PHP and use the functions from the User
> Contributed code on http://php.net/str_replace to replace the MS Word
> chars with their ASCII or HTML equivalents -- Both versions are in the
> User Contributed notes, plus variations on this theme.
>
> You may have trouble with REAL UTF-8 and UTF-16 charsets, however, as
> I suspect that MS Word smart quotes may "collide" with those charsets
> (codepages?) in a way that makes one indistinguishable from the other.

Actually, I couldn't get any of the string-replacement techniques to work.
None of them seemed to catch the characters involved, either in this PHP app
or in a Perl app I happened to be working on personally at the same time.

However, I discovered that at least part of the problem is at the HTTP level.
It seems the data was being corrupted before it even got to the server.
Although we already had the charset in the HTTP Content-Type header set to
UTF-8, the browsers (IE, Firefox, and Konqueror) were still defaulting to
Latin1/Western, and I believe then *sending* data in that encoding.  When we
added a <meta> tag that also sets the content type and charset, however, all
of the browsers switched into UTF-8, submitted the data, and then displayed
the smart quotes correctly (that is, without funky accented characters).  It
only seemed to work if the browser was set to UTF-8 both to submit the data
and to read it.  The existing pages remained borked.
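
The "funky accented characters" are consistent with a byte-level mismatch
between encodings.  Here is a minimal sketch, in Python purely for
illustration (the bytes are the same in PHP or Perl), assuming the text pasted
from Word contains the Unicode curly quotes U+201C and U+201D:

```python
# Curly double quotes as Unicode code points (assumed to be what Word pastes).
left, right = "\u201c", "\u201d"
text = left + "hi" + right

# The same characters have different byte forms in each encoding.
as_cp1252 = text.encode("cp1252")  # one byte per quote: 0x93 and 0x94
as_utf8 = text.encode("utf-8")     # three bytes per quote: E2 80 9C / E2 80 9D

# Reading UTF-8 bytes as Windows-1252 produces the accented-character debris.
# Byte 0x9D has no Windows-1252 mapping, so it becomes a replacement character.
mojibake = as_utf8.decode("cp1252", errors="replace")

print(as_cp1252.hex(" "))
print(as_utf8.hex(" "))
print(mojibake)
```

If this is what is happening, old pages saved in one encoding would stay
broken when served as the other, which would match what we saw.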

For the time being the meta tag seems to be working, but I'm quite curious as
to why the browsers would listen to that and NOT to the HTTP header.  It also
doesn't explain why the string-replace method still fails even when
everything is set to UTF-8.
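
One possibility worth checking, offered as a guess rather than a diagnosis:
byte-oriented replacement (PHP's str_replace works on bytes) only matches when
the search string is in the same encoding as the incoming data.  A search for
the single Windows-1252 byte will never match the three-byte UTF-8 form, and
vice versa.  A rough sketch in Python, for illustration only, decodes first
and then replaces; the name replace_smart_quotes and the fallback to
Windows-1252 are my own assumptions:

```python
# Map curly quotes to plain ASCII quotes (keys are Unicode code points).
QUOTE_TABLE = str.maketrans({
    "\u2018": "'", "\u2019": "'",  # single curly quotes
    "\u201c": '"', "\u201d": '"',  # double curly quotes
})

def replace_smart_quotes(raw: bytes) -> str:
    """Hypothetical helper: decode the submitted bytes, then straighten quotes."""
    try:
        # Strict UTF-8 decoding fails on lone Windows-1252 quote bytes (0x91-0x94).
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        text = raw.decode("cp1252", errors="replace")
    return text.translate(QUOTE_TABLE)

print(replace_smart_quotes(b"\x93hi\x94"))                   # Windows-1252 input
print(replace_smart_quotes("\u201chi\u201d".encode("utf-8")))  # UTF-8 input
```

Decoding before replacing also avoids clobbering genuine UTF-8 characters
that happen to contain the bytes 0x91-0x94 as continuation bytes.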

If anyone has an idea in that regard, please share. :-)

-- 
Larry Garfield			AIM: LOLG42
larry@xxxxxxxxxxxxxxxx		ICQ: 6817012

"If nature has made any one thing less susceptible than all others of 
exclusive property, it is the action of the thinking power called an idea, 
which an individual may exclusively possess as long as he keeps it to 
himself; but the moment it is divulged, it forces itself into the possession 
of every one, and the receiver cannot dispossess himself of it."  -- Thomas 
Jefferson

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

