Hi,

I've been trying (unsuccessfully) to parse and process HTML files. I'm almost certain the issue is that the pages are not valid HTML: running them through tools such as tidy and an HTML validator produces warnings. I'm not sure why the LibXML functions (Perl) and tidy complain about the structure of the page, and I can't see a way to get LibXML to ignore the warnings.

Has anybody else run into this, and if so, how did you resolve it in an automated manner?

Using Firefox with the XPath plugin and the DOM Inspector, I'm able to traverse the DOM for the web page. I can also write an XPath query in the plugin's XPath window that extracts and displays the correct elements/section of the page.

So I'm curious whether it might be possible to use the Firefox engine, together with the DOM/XPath plugin functionality, to parse the file from a Perl or command-line app. Has anyone ever done anything like this, or heard of anyone who has? Is it even possible to call the Firefox app programmatically?

Thoughts/comments/etc. welcome.

-bruce

And yes, I've also posted to the Firefox email list, but thought it might be useful to post here as well.
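
For reference, a minimal sketch of one way the warnings might be sidestepped without leaving Perl. It assumes that "LibXML functions" means the XML::LibXML module, and it relies on that module's HTML parser with the `recover`, `suppress_errors` and `suppress_warnings` options. The sample markup, the XPath expression and the helper name `extract_texts` are made up for illustration; whether recovery copes with a particular real page is not guaranteed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse possibly invalid HTML in recovery mode and
# return the trimmed text of every node matching the given XPath query.
sub extract_texts {
    my ($html, $xpath) = @_;

    # load_html uses libxml2's HTML parser rather than the strict XML one.
    # recover => 2 asks it to carry on past errors; the suppress_* options
    # ask it not to report them.
    my $doc = XML::LibXML->load_html(
        string            => $html,
        recover           => 2,
        suppress_errors   => 1,
        suppress_warnings => 1,
    );

    my @texts;
    for my $node ($doc->findnodes($xpath)) {
        my $text = $node->textContent;
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @texts, $text;
    }
    return @texts;
}

# Made-up malformed input: unquoted attribute, unclosed <p> tags,
# and a stray </span> with no matching opening tag.
my $html = <<'HTML';
<html><body>
<div class=content>
<p>first
<p>second
</span>
</div>
</body></html>
HTML

print "$_\n" for extract_texts($html, '//div[@class="content"]/p');
```

The same XPath query worked out in the Firefox plugin can often be passed to `findnodes` unchanged, although the tree built by libxml2 may differ from the DOM Firefox builds for the same broken page (Firefox inserts elements such as `tbody`, for example), so the query may need adjusting.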