Re: Unicode Problem

On Fri, 6 Oct 2006 10:44:55 -0500 (CDT), "Richard Lynch" wrote:

> I don't think MS Word quotes are Unicode, really...
> 
> I think they're just made-up character sets that Microsoft felt like
> using to be incompatible with everybody else...
> 
> Though the %u#### is almost-for-sure an ATTEMPT to apply Unicode
> conversion, that doesn't mean that the original content was really
> Unicode to start with.
> 
> So after you "undo" the Unicode conversion, you've still potentially
> got data on your hands from a proprietary non-standards-based made-up
> software application.
> 
> Apologies in advance if MS Word actually *is* using a standard Unicode
> charset... But I sure doubt it.

   I think you're missing the point. MS Word DOES use
proprietary encodings, but when text is copied from
MS Word and pasted into the browser, it goes through a
conversion. E.g., the bullet (byte 0x95 in cp1250) is
converted to whatever encoding the web page is in
(code point U+2022 if the page uses a Unicode encoding).
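
   To make the bullet example concrete, here is a rough
sketch in Python (not PHP, and not what the browser or OS
actually runs; it only mimics the mapping). The variable
names are made up for illustration, and it assumes the
page is served as UTF-8.

```python
# Sketch only: the byte value and code page come from the example above;
# the variable names are hypothetical.
word_byte = b"\x95"                  # the bullet as a single code-page byte

bullet = word_byte.decode("cp1250")  # code-page byte -> Unicode character
code_point = f"U+{ord(bullet):04X}"  # the character's Unicode code point
utf8_bytes = bullet.encode("utf-8")  # the bytes a UTF-8 page would carry

print(code_point, utf8_bytes.hex())  # prints "U+2022 e280a2"
```

   So the single byte 0x95 ends up as three bytes in a
UTF-8 page, which is why the receiving script has to know
which encoding it is being handed.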

   Whether the conversion is performed by the browser,
some OS glue or some other trickery, witchery or devilry,
is at the moment beyond my scant knowledge.

   How to solve the original poster's problem is also
beyond me, as I haven't used AJAX. I tend to prefer the
ol' form submission for my bits and bobs. That way I can
use UTF-8 all the way around, and everything just magically
works. It even works fine for JavaScript-challenged
browsers, would you believe.


  --nfe

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

