Maybe I should have said: &egrave; is not an _XML_ entity. I'm not entirely sure, sorry.

`&egrave;' is an HTML entity. It represents the letter `è', which has the code point 0xE8 in the ISO-8859-1 charset (it is outside the ASCII range). To have it recognized by libxml, there are 3 ways to write it:

1. <?xml version="1.0"?><item_name>&#xe8;</item_name>
2. <?xml version="1.0" encoding="iso-8859-1"?><item_name>è</item_name>
3. <?xml version="1.0"?><item_name>è</item_name>

1 can be saved using either UTF-8 or ISO-8859-1 encoding;
2 must be saved using ISO-8859-1 encoding;
3 must be saved using UTF-8 encoding (so that `è' is converted properly).

In PHP, we can do this:

    // Decode the HTML entities into ISO-8859-1 bytes, to match the XML declaration below.
    $html = html_entity_decode('<item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>', ENT_QUOTES, 'ISO-8859-1');
    // loadXML() is called on an instance; the declaration must end with "?>".
    $dom = new DOMDocument();
    $dom->loadXML("<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>$html");

On 10/13/05, Marcus Bointon <marcus@xxxxxxxxxxxxxxxxxx> wrote:
> On 13 Oct 2005, at 07:24, cc wrote:
>
> > both `è' and `î' are not entities in charset utf-8, use
> > `&egrave;' and `&icirc;' instead.
>
> I would expect that to result in unconverted entities in the output.
> If you're intending to send that content as HTML, then I guess that
> would be OK. However, if you're using UTF-8 anyway, why not just use
> the real characters?
>
> Marcus
> --
> Marcus Bointon
> Synchromedia Limited: Putting you in the picture
> marcus@xxxxxxxxxxxxxxxxxx | http://www.synchromedia.co.uk

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
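
For anyone who would rather keep everything in UTF-8, here is a rough sketch of the same html_entity_decode() idea from this message. It assumes a current PHP with the DOM extension enabled; the variable names ($raw, $xml, $name) and the shortened sample string are only for illustration. One caveat: html_entity_decode() also decodes &amp; and &lt;, so this is only safe when the text contains no escaped markup.

```php
<?php
// Illustrative input: an XML fragment containing HTML-only named entities.
$raw = '<item_name>cr&egrave;me fra&icirc;che</item_name>';

// Turn the named HTML entities into real UTF-8 characters.
// The charset is passed explicitly so the result does not depend on PHP defaults.
$xml = html_entity_decode($raw, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Parse the result as XML, declaring the same encoding we decoded into.
$dom = new DOMDocument();
$dom->loadXML('<?xml version="1.0" encoding="UTF-8"?>' . $xml);

// DOM always hands text back as UTF-8.
$name = $dom->documentElement->textContent;
echo $name, "\n";

// A numeric character reference needs no decoding step at all.
$dom2 = new DOMDocument();
$dom2->loadXML('<?xml version="1.0"?><item_name>cr&#xe8;me</item_name>');
echo $dom2->documentElement->textContent, "\n";
```

The second document corresponds to option 1 in the list: the numeric reference &#xe8; is understood by any XML parser regardless of how the file itself is encoded.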