Hi Adam,

Thanks for the update, but I'm thinking that it would be much easier if the
DOM parser could just ignore the contents of the <script> tags when parsing
HTML content. This way we would not have to rip out JavaScript or force users
to put JavaScript in a separate file.

What do you think?

__
Raymond Irving

On Sun, Jun 6, 2010 at 11:22 PM, Adam Richardson <simpleshot@xxxxxxxxx> wrote:

> On Sun, Jun 6, 2010 at 10:39 PM, Raymond Irving <xwisdom@xxxxxxxxx> wrote:
>
>> Hello,
>>
>> I'm experiencing another issue when attempting to use
>> DOMDocument::loadXML() to load the following HTML code:
>>
>> <?php
>> $html = '
>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html>
>> <body>
>> <script type="text/javascript">
>> <!--
>> var i = 0, html = "<strong>Bold Text</strong>,Normal Text";
>> document.write(html);
>> i--; // this line causes the parser to fail
>> alert(html);
>> -->
>> </script>
>> </body>
>> </html>';
>> $dom = new DOMDocument();
>> $dom->loadXML($html);
>> echo $dom->saveHTML();
>> ?>
>>
>> The parser throws the following error when it encounters "i--" inside
>> the <script> tag:
>>
>> Warning: DOMDocument::loadXML() [domdocument.loadxml]: Comment not
>> terminated <!-- var i = 0, html = "<strong>Bold Text< in Entity
>>
>> If I remove the line "i--" it will load the HTML code just fine.
>>
>> Any ideas as to why this throws an error?
>>
>> __
>> Raymond
>>
>
> A comment declaration starts with "<!", and ends with ">", with any number
> of comments following the form --comment-- in between:
> http://htmlhelp.com/reference/wilbur/misc/comment.html
>
> You'll see at the bottom of the article that they advocate a simple rule in
> comments:
> An HTML comment begins with "<!--", ends with "-->" and does not contain
> "--" or ">" anywhere in the comment.
>
> The occurrence of "i--" breaks that rule.
>
> In your case, if you're maintaining the pages, you can place the javascript
> in a separate file or place the javascript in a CDATA section. If you're
> parsing pages you don't maintain, you can rip out the javascript before
> performing DOM tasks and parse it separately as needed to avoid potential
> issues.
>
> Adam
>
> --
> Nephtali: PHP web framework that functions beautifully
> http://nephtaliproject.com
>
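
For reference, here is a minimal sketch of the two workarounds Adam describes
in the quoted message: wrapping the inline JavaScript in a CDATA section, and
ripping out the <script> blocks before calling DOMDocument::loadXML(). The
helper name strip_scripts() and its regular expression are assumptions made
for illustration only; they are not part of the DOM extension. The DOCTYPE is
left out to keep the sample short.

```php
<?php
// Hypothetical helper (not part of DOMDocument): removes <script>...</script>
// blocks with a simple regex. This is a heuristic and would be fooled by a
// literal "</script>" inside a JavaScript string.
function strip_scripts(string $markup): string
{
    $result = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $markup);
    return $result ?? $markup; // fall back to the input if the regex fails
}

// Workaround 1: put the JavaScript in a CDATA section so the XML parser
// treats "<strong>" and "i--" as plain character data, not markup.
$withCdata = <<<'MARKUP'
<html>
<body>
<script type="text/javascript">
//<![CDATA[
var i = 0, html = "<strong>Bold Text</strong>,Normal Text";
document.write(html);
i--;
alert(html);
//]]>
</script>
</body>
</html>
MARKUP;

$domCdata = new DOMDocument();
$cdataLoaded = $domCdata->loadXML($withCdata);

// Workaround 2: the markup from the original report, with the script
// removed before parsing so the "--" inside the comment never reaches
// the XML parser.
$original = <<<'MARKUP'
<html>
<body>
<script type="text/javascript">
<!--
var i = 0, html = "<strong>Bold Text</strong>,Normal Text";
document.write(html);
i--; // this line causes the parser to fail
alert(html);
-->
</script>
</body>
</html>
MARKUP;

$domStripped = new DOMDocument();
$strippedLoaded = $domStripped->loadXML(strip_scripts($original));
```

With either approach the "Comment not terminated" warning should no longer be
triggered, since the parser never sees "--" inside a comment. The CDATA
variant keeps the script in the document; the stripping variant discards it,
so it only suits cases where the JavaScript is not needed after parsing.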