M5 wrote:
> OK, I already knew that making it valid doesn't change the result. But
> the question remains, how to parse the HTML as it arrives (which I have
> no control over anyway), besides doing a str_replace on <br> and
> inserting a token, which I later replace (which I shouldn't have to,
> right?)

For creating valid XHTML you can run the input through tidy
(http://php.net/tidy).

Does echo $table->saveHTML() show the BR tags?

Also try:

<?php
foreach ($table->getElementsByTagName("div") as $div) {
    var_dump($div->hasAttributes(), $div->hasChildNodes());
    echo htmlentities($div->C14N()); // undocumented, found in manual, no idea if it works
}

>
> ...Rene
>
>
> On 24-Dec-07, at 7:19 PM, Casey wrote:
>
>> Actually, never mind. It does not have to be valid to work.
>>
>>
>>
>> On Dec 24, 2007, at 6:15 PM, Casey <heavyccasey@xxxxxxxxx> wrote:
>>
>>> That's because it's not proper XHTML: "<br>" should be "<br />".
>>>
>>>
>>>
>>> On Dec 24, 2007, at 6:03 PM, M5 <m5@xxxxxxxxxxxxxxxx> wrote:
>>>
>>>> Just getting into DOMDocument()... I'm loading an HTML page and
>>>> trying to extract certain bits of text. Just one problem: loadHTML()
>>>> seems to ignore orphan tags like '<br>'. For example, in the
>>>> following HTML:
>>>>
>>>> <div class="text">Some text is here. <br> New line. <br> Another new
>>>> line. </div>
>>>> <div class="text">Some text is here. <br> New line. <br> Another new
>>>> line. </div>
>>>> <div class="text">Some text is here. <br> New line. <br> Another new
>>>> line. </div>
>>>>
>>>> If I run the above HTML through:
>>>>
>>>> $nodes = $table->getElementsByTagName("*");
>>>>
>>>> I only get three nodes that I can iterate through (<div>). What I
>>>> want to do is split/explode the three lines within each div, but
>>>> when I look at the nodeValue of each node, it only shows something
>>>> like "Some text is here. New line. Another new line."
>>>>
>>>> Any ideas?
>>>>
>>>> ...Rene
>>>>
>>>> --
>>>> PHP General Mailing List (http://www.php.net/)
>>>> To unsubscribe, visit: http://www.php.net/unsub.php
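
For what it's worth, here is a rough sketch of one way to split each div on
its <br> tags without the str_replace token trick. It assumes the HTML parser
keeps each <br> as an element child of the div, and that nodeValue simply
concatenates the text nodes around them (which would explain the output in
the original question). splitOnBr() is a hypothetical helper name, not part
of the DOM extension, and the sample markup is copied from the question and
shortened to two divs. The type declarations need PHP 7 or later.

```php
<?php
// Sample markup from the original question (two divs for brevity).
$html = '<div class="text">Some text is here. <br> New line. <br> Another new line. </div>'
      . '<div class="text">Some text is here. <br> New line. <br> Another new line. </div>';

/**
 * Hypothetical helper: collect the text of $node's direct children,
 * starting a new array entry every time a <br> element is met.
 */
function splitOnBr(DOMNode $node): array
{
    $lines = [''];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement && strtolower($child->nodeName) === 'br') {
            $lines[] = '';                                    // <br> starts a new line
        } else {
            $lines[count($lines) - 1] .= $child->textContent; // append text to the current line
        }
    }
    return array_map('trim', $lines);
}

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);

$result = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    $result[] = splitOnBr($div);
}

print_r($result);
```

With the sample input, each entry of $result should hold the three strings
"Some text is here.", "New line." and "Another new line.". Note that this only
looks at direct children of the div; a <br> nested inside another element
(say a <span>) would not split the line, so a recursive walk would be needed
for that case.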