Re: ampersand in dom with utf-8

Maybe I should have said: `&egrave;' is not an _XML_ entity.
I'm not entirely sure about this, sorry.

`&egrave;' is an HTML entity.
It represents the letter `è', which is the single byte 0xE8 in the
iso-8859-1 charset (it is not an ASCII character; in Unicode it is U+00E8).
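A quick way to see the byte values involved is to decode the entity into each charset and dump the result as hex. This is only a sketch; the variable names are made up for illustration.

```php
<?php
// Decode the same HTML entity into two different target charsets.
$latin1 = html_entity_decode('&egrave;', ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');
$utf8   = html_entity_decode('&egrave;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

// One byte in iso-8859-1, two bytes in utf-8.
echo bin2hex($latin1), "\n"; // e8
echo bin2hex($utf8), "\n";   // c3a8
```

So the same letter is one byte in iso-8859-1 and two bytes in utf-8, which is why the declared encoding has to match how the file was saved.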

To have it recognized by libxml, there are 3 ways to do this:
1, <?xml version="1.0"?><item_name>&#xE8;</item_name>
2, <?xml version="1.0" encoding="iso-8859-1"?><item_name>è</item_name>
3, <?xml version="1.0"?><item_name>è</item_name>

1 can be saved using either utf-8 or iso-8859-1 encoding, since the
  character reference itself is plain ASCII;
2 must be saved using iso-8859-1 encoding;
3 must be saved using utf-8 encoding, the default when no encoding is
  declared (so that `è' is read correctly).
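To check that the three variants really give the same character, something like the following should work. It is a sketch only: the documents are written as byte strings ("\xE8" for iso-8859-1, "\xC3\xA8" for utf-8) so the encoding of the PHP file itself does not matter, and the array keys are just labels I made up.

```php
<?php
// The three variants as explicit byte strings.
$variants = [
    'numeric reference' => '<?xml version="1.0"?><item_name>&#xE8;</item_name>',
    'iso-8859-1 bytes'  => "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><item_name>\xE8</item_name>",
    'utf-8 bytes'       => "<?xml version=\"1.0\"?><item_name>\xC3\xA8</item_name>",
];

$results = [];
foreach ($variants as $label => $xml) {
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    // DOM returns text as UTF-8, whatever encoding the input declared.
    $results[$label] = bin2hex($dom->documentElement->textContent);
}
print_r($results);
```

Each entry should come out as c3a8, i.e. `è' in utf-8, because the DOM extension works in utf-8 internally regardless of the input encoding.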


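In PHP, we can do this:

   // Decode to iso-8859-1 bytes, matching the encoding declared below.
   // Note: this also decodes &amp; and &lt;, which would break the XML.
   $html = html_entity_decode('<item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>',
       ENT_COMPAT, 'ISO-8859-1');
   $dom = new DOMDocument();
   $dom->loadXML("<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>$html");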
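If you would rather stay in utf-8 (as Marcus suggests below), a variation is to decode only the HTML-specific entities and leave XML's own five entities alone, so a literal `&amp;' in the data does not break the parse. This is a rough sketch, not something from the original thread; decode_html_entities_for_xml() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: decode HTML named entities to UTF-8, but keep the
// five entities that XML itself predefines exactly as they are.
function decode_html_entities_for_xml(string $s): string
{
    $xmlEntities = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;'];
    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',
        function (array $m) use ($xmlEntities): string {
            return in_array($m[0], $xmlEntities, true)
                ? $m[0]
                : html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
        },
        $s
    );
}

$xml = decode_html_entities_for_xml(
    '<item_name>cr&egrave;me fra&icirc;che &amp; radish</item_name>'
);

$dom = new DOMDocument();
$dom->loadXML($xml); // no encoding declaration, so libxml assumes UTF-8
echo $dom->documentElement->textContent, "\n";
```

Here the accented letters reach libxml as real utf-8 characters, while `&amp;' is left for the XML parser to resolve itself.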



On 10/13/05, Marcus Bointon <marcus@xxxxxxxxxxxxxxxxxx> wrote:
> On 13 Oct 2005, at 07:24, cc wrote:
>
> > both `&egrave;' and `&icirc;' are not entities in charset utf-8, use
> > `&amp;egrave;' and `&amp;icirc;' instead.
>
> I would expect that to result in unconverted entities in the output.
> If you're intending to send that content as HTML, then I guess that
> would be OK. However, if you're using UTF-8 anyway, why not just use
> the real characters?
>
> Marcus
> --
> Marcus Bointon
> Synchromedia Limited: Putting you in the picture
> marcus@xxxxxxxxxxxxxxxxxx | http://www.synchromedia.co.uk
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>



