Hi Michael,

You might want to check out the Raxan PDI (Programmable Document Interface) framework. It works like a charm with HTML snippets:

example:

$page['body']->append('<p>שלום</p>'); // this will append the <p> to the html body

Here's the link: http://raxanpdi.com

For online examples, check out: http://raxanpdi.com/examples.html

__
Raymond Irving

--- On Mon, 4/13/09, Michael Shadle <mike503@xxxxxxxxx> wrote:

> From: Michael Shadle <mike503@xxxxxxxxx>
> Subject: Re: Generate XHTML (HTML compatible) Code using DOMDocument
> To: "Raymond Irving" <xwisdom@xxxxxxxxx>
> Cc: "php-general@xxxxxxxxxxxxx" <php-general@xxxxxxxxxxxxx>
> Date: Monday, April 13, 2009, 11:34 AM
>
> I will say though this negates the reason I chose to use domdocument
> to begin with. I am feeding it snippets of HTML that usually do not
> validate and I am not sure I want to run it through tidy first to
> convert from HTML to XHTML to run the domdocument and then convert
> it back... I am essentially using this to traverse the DOM and
> process all a href and img src attributes for a link remapping job.
> (also realizing the power of php's DOM for other things I used to
> try tidy and then use simplexml when doing HTML scraping ...) but
> php's dom allows me to give it absolutely crappy HTML and it still
> works.
>
> However if someone has a nice regular expression or chunk of code
> that allows you to scan a doc for a href and then replaces them in
> the proper context (not just globally) that would work too. I can't
> just blindly find urls and then replace them (although the reason
> for this escapes me right now)
>
> On Apr 13, 2009, at 8:01 AM, Raymond Irving <xwisdom@xxxxxxxxx> wrote:
>
> > Michael,
> >
> > You are absolutely right! It's loadHTML() that's causing the problems.
> >
> > Best regards,
> > __
> > Raymond Irving
> >
> > --- On Mon, 4/13/09, Michael A. Peters <mpeters@xxxxxxx> wrote:
> >
> >> From: Michael A. Peters <mpeters@xxxxxxx>
> >> Subject: Re: Generate XHTML (HTML compatible) Code using DOMDocument
> >> To: "Michael Shadle" <mike503@xxxxxxxxx>
> >> Cc: "Raymond Irving" <xwisdom@xxxxxxxxx>, "php-general@xxxxxxxxxxxxx" <php-general@xxxxxxxxxxxxx>
> >> Date: Monday, April 13, 2009, 5:36 AM
> >>
> >> Michael A. Peters wrote:
> >>
> >>> function makeHTML($document) {
> >>>    $buffer = $document->saveHTML();
> >>>    $output = html_entity_decode($buffer,ENT_QUOTES,"UTF-8");
> >>>    return $output;
> >>> }
> >>>
> >>> I'll try it and see what it does.
> >>
> >> Huh - not tried above yet - but with
> >>
> >> $test = $myxhtml->createElement('p','שלום');
> >> $xmlBody->appendChild($test);
> >>
> >> both saveXML() and saveHTML() do the right thing.
> >>
> >> However if I have the string
> >>
> >> <p>שלום</p>
> >>
> >> and load it into a DOM -
> >>
> >> With loadHTML() the utf8 is lost regardless of whether I
> >> use saveXML() or saveHTML()
> >>
> >> With loadXML() the utf8 is preserved regardless of whether
> >> or not I use saveXML() or saveHTML()
> >>
> >> php 5.2.9
> >> libxml2 2.6.26-2.1.2.7 (CentOS 5.3)
> >>
> >> I wonder if the real utf8 problem people experience is
> >> really with loadHTML() and not with saveHTML() ??
> >
> > --
> > PHP General Mailing List (http://www.php.net/)
> > To unsubscribe, visit: http://www.php.net/unsub.php
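
For anyone who wants to stay with plain DOMDocument for the link remapping job described above, here is a rough sketch that combines the two points from the thread: loadHTML() being the step where the UTF-8 gets lost, and rewriting only a href and img src in context. It is not from any of the posters. The function name remapLinks() and the $map array are made up for illustration, and the meta-tag prefix is a commonly suggested workaround that I am assuming makes loadHTML() read the input as UTF-8. Passing a node to saveHTML() needs PHP 5.3.6 or later, so this will not run as-is on the 5.2.9 setup mentioned above.

```php
<?php
// Sketch only: remapLinks() and $map are hypothetical names, not part of
// DOMDocument or of anything posted in this thread.

function remapLinks($html, array $map)
{
    $doc = new DOMDocument();

    // Assumption: declaring the charset up front lets loadHTML() treat the
    // bytes as UTF-8 instead of falling back to its default encoding.
    $prefix = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

    // Non-validating snippets trigger libxml warnings; collect them quietly
    // and restore the previous setting afterwards.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($prefix . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Rewrite only the attributes of interest, in their proper context.
    $targets = array('a' => 'href', 'img' => 'src');
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $old = $node->getAttribute($attr);
            if (isset($map[$old])) {
                $node->setAttribute($attr, $map[$old]);
            }
        }
    }

    // Serialize only the children of <body>, i.e. the original snippet,
    // without the <html>/<head> wrapper the parser adds.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child); // node argument: PHP >= 5.3.6
        }
    }
    return $out;
}

// Example call with made-up URLs.
$snippet = '<p>שלום <a href="/old">link</a><img src="/old.png"></p>';
echo remapLinks($snippet, array('/old' => '/new', '/old.png' => '/new.png'));
```

Depending on the PHP and libxml2 versions, the Hebrew text may come back either as raw UTF-8 or as numeric entities; running the result through html_entity_decode() with "UTF-8", as in the makeHTML() function quoted above, should normalise that.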