Ryan S wrote:
$website_data = file_get_contents('dom_test.html');//load the website data,
$dom = new DomDocument; //make a new DOM container in PHP
$dom->loadHTML($website_data); //load all the fetched data into the DOM container
I'm not sure what the answer to your issue is, but mind if I make a
couple of off-topic recommendations?
1) Use loadXML() instead of loadHTML()
The reason is that loadHTML() assumes ISO-8859-1 unless the document
declares its own charset, so multibyte UTF-8 characters tend to get
mangled and come back out as entities.
You can still use $dom->saveHTML() to present the data if HTML is your
target output.
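A minimal sketch of what I mean, assuming the input is already well-formed. The sample markup and the $xhtml variable are made up for illustration; they are not from your dom_test.html:

```php
<?php
// Hypothetical sample: a small, well-formed XHTML document with multibyte UTF-8 text.
$xhtml = '<?xml version="1.0" encoding="UTF-8"?>'
       . '<html><head><title>t</title></head>'
       . '<body><p>Grüße, 日本語</p></body></html>';

$dom = new DOMDocument();
$dom->loadXML($xhtml); // the XML parser honours the declared UTF-8 encoding

// Text nodes come back as plain UTF-8 strings.
$p = $dom->getElementsByTagName('p')->item(0);
echo $p->textContent, "\n";

// Serialise as HTML if that is the target output.
echo $dom->saveHTML();
```

The exact serialisation from saveHTML() can vary between libxml versions, so check it against your own data.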
2) loadXML() is less forgiving of malformed content, but you can work
around that by running your data through tidy first:
$website_data = new tidy('dom_test.html', $tidy_config);
$website_data->cleanRepair();
$dom->loadXML(tidy_get_output($website_data)); // pass the repaired markup as a string
where
$tidy_config is the tidy configuration array.
Make sure you set
$tidy_config['output-xhtml'] = true;
so that the output of tidy is clean X(ht)ML for loadXML().
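Putting the pieces together, here is a rough, self-contained sketch. It assumes the tidy extension is installed. It parses an inline string (the made-up $html sample) rather than dom_test.html so it can be run as-is. Only 'output-xhtml' comes from my advice above; the 'numeric-entities' and 'wrap' settings and the 'utf8' encoding argument are my own assumptions:

```php
<?php
// Hypothetical malformed input: missing </head>, unclosed <p>, bare <br>.
$html = '<html><head><title>Test</title><body><p>Grüße<br>unclosed paragraph';

$tidy_config = [
    'output-xhtml'     => true, // as recommended above
    'numeric-entities' => true, // assumption: avoids named entities such as &nbsp;,
                                // which the XML parser does not know without the DTD
    'wrap'             => 0,    // assumption: disable line wrapping
];

// Parse the string as UTF-8 and let tidy repair it.
$tidy = new tidy();
$tidy->parseString($html, $tidy_config, 'utf8');
$tidy->cleanRepair();

// Hand the repaired markup to the XML parser as a string.
$dom = new DOMDocument();
if (!$dom->loadXML(tidy_get_output($tidy))) {
    exit("tidy output was still not well-formed XML\n");
}

echo $dom->saveHTML();
```

To read from a file instead, pass the file name, config and encoding to the tidy constructor as in the snippet above.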