Re: Parsing HTML

Jay Blanchard wrote:
>> I need to extract news items from several news sites.
>> ...
>> Can anybody please give me some pointers?
>
> Can you be more specific here? This is awfully broad.

I'll give an example:

Let's say I want to extract some news items from the www.CNN.com web page (if you visit CNN's page, you can see the 'MORE NEWS' block on the right side).

I know how to extract the news items (or any other data on the page) using regular expressions, but I wonder whether there are other ways. The code I'm writing will be maintained by other people in the future, and regular expressions may not be easy for them to update when the site changes its format.
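
To make the regular-expression approach concrete, here is a minimal sketch of the kind of code I mean. The markup, the URLs, and the function name extract_news_items_regex() are all made up for illustration; CNN's real HTML is different and would need its own pattern.

```php
<?php
// Hypothetical markup standing in for a 'MORE NEWS' block.
// The real page's HTML will differ; this is only an assumption.
$html = <<<HTML
<div class="more-news">
  <ul>
    <li><a href="/2004/world/story1.html">First headline</a></li>
    <li><a href="/2004/tech/story2.html">Second headline</a></li>
  </ul>
</div>
HTML;

/**
 * Extract url/title pairs with a regular expression.
 * Hypothetical helper: the pattern is tied to the sample markup above.
 */
function extract_news_items_regex(string $html): array
{
    $items = [];
    // One <li> holding a single <a href="...">title</a>.
    $pattern = '~<li>\s*<a href="([^"]+)">([^<]+)</a>\s*</li>~i';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $items[] = ['url' => $m[1], 'title' => trim($m[2])];
        }
    }
    return $items;
}

$items = extract_news_items_regex($html);
print_r($items);
```

The pattern is bound to the exact tag layout, so an added attribute or an extra wrapper tag makes it stop matching. That is the maintenance problem I am worried about.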

Can somebody please give me a short overview of the different ways of extracting data from HTML?

I hope my question is clear enough now.

TIA.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

