Re: set cookie with non-english

On Tue, 3 Oct 2006 01:15:59 +0300, "Ahmad Al-Twaijiry" wrote:

> Hi everyone
> 
> in my PHP code I use the following command to set a cookie with
> non-english word (UTF-8) :
> 
> @setcookie ("UserName",$Check[1]);
> 
> and in my html page I get this cookie using javascript :

[Snipped]

> but the result from writing the cookie using javascript is garbage, I
> don't get the right word !!

   The problem is that JavaScript strings are UTF-16,
while the cookie holds UTF-8, so you either have to
store the cookie as UTF-16 or decode the UTF-8 in
JavaScript.

   For example, consider the string "åäö", containing
the three funny characters in the Swedish language.
These characters are encoded as the octets
<c3 a5 c3 a4 c3 b6> in UTF-8, and PHP percent-encodes
these octets in the cookie as:

  %C3%A5%C3%A4%C3%B6
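
   As a rough illustration of where that value comes from,
the same encoding can be reproduced in JavaScript with
encodeURIComponent(), which percent-encodes the UTF-8
octets of a string. The variable names here are only for
illustration and are not part of the examples in this mail.

```javascript
// Percent-encode the UTF-8 octets of a string, giving the same
// form as the cookie value shown above.
var name = "\u00e5\u00e4\u00f6";          // the string "åäö"
var encoded = encodeURIComponent(name);   // one %XX per UTF-8 octet

console.log(encoded);
```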

Example:

------------------------------------------------------
<?php

  setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
//  setcookie ('UserName', "åäö");
  header ('Content-Type: text/html; charset=utf-8');

?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">

<title>UTF-8 flavoured cookies</title>
<p>

<script type="text/javascript">

  document.write(document.cookie);

</script>
------------------------------------------------------

   The unescape() function in JavaScript converts each
percent-encoded octet to the Unicode code point with the
same value, giving <00c3 00a5 00c3 00a4 00c3 00b6>,
which, of course, is not what you want.

Example:

------------------------------------------------------
<?php

  setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
//  setcookie ('UserName', "åäö");
  header ('Content-Type: text/html; charset=utf-8');

?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">

<title>UTF-8 flavoured cookies</title>
<p>

<script type="text/javascript">

  var s = unescape(document.cookie);
  var t = "";
  for (var i = 0; i < s.length; i++) {
    var c = s.charCodeAt(i);
    t += c < 128 ? String.fromCharCode(c) : c.toString(16) + " ";
  }
  document.write(t);

</script>
------------------------------------------------------

   While there are no doubt better ways to solve this,
you /could/ use the unescape() function to convert the
percent-encoded octets to Unicode code points, and
then write your own UTF-8 decoder to do the rest.

Example:

(This is an old C function hammered into JavaScript
 shape. It is likely to be a horrible implementation
 in JavaScript. The error checking adds a bit of bloat.
 Note that the utf_8_decode function accepts sequences
 for the full Unicode range, while String.fromCharCode
 only takes 16-bit values, so anything above U+FFFF
 will not come out right.)
------------------------------------------------------
<?php

  setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
  header ('Content-Type: text/html; charset=utf-8');

?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">

<title>UTF-8 flavoured cookies</title>
<p>

<script type="text/javascript">

function utf_8_decode (sin)
{
  function octet_count (c)
  {
    var octet_counts = [
    /* c0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    /* d0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    /* e0 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    /* f0 */ 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
    ];

    return c < 128 ? 1 :
           c < 192 ? 0 : octet_counts [(c&255)-192];
  }

  var octet0_masks = [ 0x00,0x7f,0x1f,0x0f,0x07,0x03,0x01 ];
  var sout = "";
  var add;
  for (var si = 0; si < sin.length; si += add) {
    var c = sin.charCodeAt(si);
    add = octet_count(c);
    if (si+add <= sin.length) {
      var u = c & octet0_masks[add];
      var ci;
      // Fold in continuation octets (10xxxxxx), six bits at a time.
      for (ci = 1;
           (ci < add) && ((sin.charCodeAt(si+ci)&0xc0) == 0x80);
           ci++)
        u = (u<<6) | (sin.charCodeAt(si+ci) & 0x3f);
      if (ci == add) {
        sout += String.fromCharCode (u);
      } else {
        // Invalid UTF-8 sequence. Should probably throw() instead.
        sout += "\ufffd"; // Replacement character.
        add = 1;
      }
    } else {
      // Invalid UTF-8 sequence. Should probably throw() instead.
      sout += "\ufffd"; // Replacement character.
      add = 1;
    }
  }

  return sout;
}

document.write (utf_8_decode(unescape(document.cookie)));

</script>
------------------------------------------------------
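
   One of the "better ways" hinted at above, where the
browser provides it, is probably decodeURIComponent(),
which decodes percent-encoded UTF-8 in one step. The
sketch below is only that: read_cookie is a hypothetical
helper, not something from the code above, and the "+"
handling assumes PHP encoded spaces in urlencode() style.

```javascript
// Hypothetical helper: look up one cookie by name in a string shaped
// like document.cookie ("a=1; b=2") and decode its value as UTF-8.
function read_cookie (cookies, name)
{
  var parts = cookies.split("; ");
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf("=");
    if (eq > 0 && parts[i].substring(0, eq) == name) {
      // Assumption: a "+" in the raw value stands for a space.
      var raw = parts[i].substring(eq + 1).replace(/\+/g, " ");
      try {
        return decodeURIComponent(raw);
      } catch (e) {
        return null; // Malformed percent-encoding or invalid UTF-8.
      }
    }
  }
  return null; // No such cookie.
}

// In a browser this would be read_cookie(document.cookie, "UserName").
var user = read_cookie("UserName=%C3%A5%C3%A4%C3%B6; other=1", "UserName");
console.log(user);
```

   Unlike the hand-written decoder, this returns null for
invalid input instead of inserting replacement characters.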

> BTW:
> * I also tried the php function setrawcookie and I get the same problem
> * I use <META http-equiv=Content-Type  content="text/html;
> charset=utf-8"> in my page

   The <META> thing might be good for storing pages
on disk, but on the web you should use real HTTP
headers.


  --nfe

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

