RE: developing using the firefox engine

hi antoine...

i had already tried tidy with no success... its output still seems to
generate warnings...

 initFile -> tidy -> cleanFile -> perl app (using xpath/libxml)

the xpath/libxml functions in the perl app complain about the cleaned file.
my thought is that either tidy isn't cleaning enough, or the perl
xpath/libxml functions are too strict!
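
To illustrate the "too strict" possibility, here is a minimal sketch in Python (standard library only, not the Perl/libxml stack from this thread). The SOUP string is a made-up example of tag soup, and the helper names are hypothetical. It only shows that a strict XML parser rejects markup that a lenient HTML parser accepts; it does not reproduce the actual warnings.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical tag soup: unquoted attribute, unclosed <li> and <br>.
SOUP = "<html><body><ul class=items><li>one<li>two<br></ul></body></html>"


def strict_parse_error(text):
    """Return the strict XML parser's error message, or None if it parsed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


class TagCollector(HTMLParser):
    """Lenient HTML parser that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(text):
    """Feed text to the lenient parser and return the start tags in order."""
    parser = TagCollector()
    parser.feed(text)
    parser.close()
    return parser.tags


if __name__ == "__main__":
    print("strict :", strict_parse_error(SOUP))
    print("lenient:", lenient_tags(SOUP))
```

If the cleaned file still trips a strict XML parser like this, the markup is still not well-formed XML, whatever tidy reported.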

but the weird thing is that i can use Firefox with the DOM/Xpath plugin, and
I can create an XPath Query that I can use within the Firefox/Plugin to
generate the correct resulting list of items/elements based on the XPath
Query. However, when i then use the same XPath Query, and the same web page
in my test app, i get the warnings/errors from the perl xpath/libxml
functions....
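
One likely reason for the difference is that a browser repairs broken markup into a tree before any XPath query runs, while a strict XML parser refuses the input outright. The sketch below mimics that idea in Python with the standard library only. It is an assumption-laden stand-in, not Firefox's parser and not libxml: the SoupTreeBuilder class, the VOID and AUTOCLOSE sets, and the SOUP string are all made up for illustration, and ElementTree supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical tag soup: unquoted attribute, unclosed <li> and <br>.
SOUP = "<html><body><ul class=items><li>one<li>two<br></ul></body></html>"

# Assumed, incomplete lists: tags that never have content, and tags that
# implicitly close a still-open tag of the same name.
VOID = {"br", "hr", "img", "input", "meta", "link"}
AUTOCLOSE = {"li", "p", "td", "tr", "option"}


class SoupTreeBuilder(HTMLParser):
    """Build an ElementTree from sloppy HTML, repairing as it goes."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("document")  # synthetic wrapper element
        self.stack = [self.root]            # currently open elements

    def handle_starttag(self, tag, attrs):
        # A new <li> closes a previous <li> that was left open, and so on.
        if tag in AUTOCLOSE and self.stack[-1].tag == tag:
            self.stack.pop()
        elem = ET.SubElement(self.stack[-1], tag,
                             {k: v or "" for k, v in attrs})
        if tag not in VOID:
            self.stack.append(elem)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: add it but never leave it open.
        ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):  # text after a child element goes in its tail
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def query(html_text, path):
    """Parse html_text leniently and run an ElementTree path query on it."""
    builder = SoupTreeBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.root.findall(path)


if __name__ == "__main__":
    for item in query(SOUP, ".//ul[@class='items']/li"):
        print(item.text)
```

The point of the design is the order of operations: repair first, query second. libxml2 also has its own HTML parser with a recovery mode, so it may be worth checking whether the perl app is parsing the file as XML rather than as HTML.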

i'm wondering if there's a way that i can call the Firefox Engine (using the
Plugins), have it do all the processing/parsing, and let it return the
list of items/elements to me.....

-bruce



-----Original Message-----
From: Antoine [mailto:melser.anton@xxxxxxxxx]
Sent: Saturday, July 01, 2006 12:30 AM
To: bedouglas@xxxxxxxxxxxxx; For users of Fedora Core releases
Subject: Re: developing using the firefox engine


On 01/07/06, bruce <bedouglas@xxxxxxxxxxxxx> wrote:
> hi...
>
> i've been trying (unsuccessfully) to parse/process html files. i'm almost
> certain that the issue has to do with the fact that the html is not valid
> html.. running the html through various apps (tidy/html validator/etc...)
> produces warnings.

I have been having a similar problem with html, though this time the
guilty party is mshtml. That piece of dog vomit *can not be made* to
produce xhtml!!! I get the pseudo-html from mshtml, run it through
sgmlreader+converter class and get (x)html out. I can then parse +
process the file with standard xml/xsl tools. It took me an age to
find good things for .net though - you shouldn't have nearly as many
problems on linux/fedora.
I suggest you pass it through tidy and get xhtml out. It may give you
some junk but you don't really have many other options...
Cheers
Antoine

--
This is where I should put some witty comment.

-- 
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list