Alan Milnes wrote:
I'm interested in extracting a series of web pages from a Yahoo forum
and storing them in a MySQL database so I can generate things like most
number of posts etc. I've searched on Google but most of the links seem
to be for email harvesters!
Anyone have any tips for where to look?
Well, you'd need something like cURL, HTTP_Client or Snoopy to get
started fetching pages from the web.
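
A rough sketch of the fetching step with the cURL extension. The
function name fetch_page and the timeout value are made up for
illustration, and a real forum may need a login or cookies first, which
this does not handle:

```php
<?php
// Hypothetical helper (not from any library): fetch one URL with the
// cURL extension and return the body, or null if the request failed.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 30,   // assumed limit, in seconds
    ]);
    $body = curl_exec($ch);

    return $body === false ? null : $body;
}
```

You would call it once per page, e.g. $html = fetch_page($url);, and
skip the page when it returns null.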
Then it's a matter of extracting the information (parsing out the tags).
Snoopy has a function, fetchtext(), that returns the plain-text version
of a page, or you can use regular expressions or strip_tags().
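
A minimal sketch of the regular expression plus strip_tags() route. The
HTML sample, the class name "author" and the function extract_posts are
all invented for illustration; you would have to look at the real page
source and adjust the pattern to match it:

```php
<?php
// Made-up markup; the real forum pages will look different.
$html = '<div class="post"><span class="author">alice</span><p>Hello <b>world</b></p></div>'
      . '<div class="post"><span class="author">bob</span><p>Hi there</p></div>';

// Hypothetical helper: pull (author, body) pairs out of the markup above.
function extract_posts(string $html): array
{
    $posts = [];
    // Capture the author span and the paragraph that follows it.
    $pattern = '~<span class="author">(.*?)</span>\s*<p>(.*?)</p>~s';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $posts[] = [
            'author' => trim(strip_tags($m[1])),
            'body'   => trim(strip_tags($m[2])), // drop inline tags such as <b>
        ];
    }

    return $posts;
}

$posts = extract_posts($html);
```

Regular expressions like this break easily when the page layout changes,
so expect to revisit the pattern from time to time.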
From here on out, it's just normal PHP tasks of connecting to the
database, running your queries, etc.
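
For the database side, something along these lines could work. It uses
PDO, which is my choice here and only one of several ways to talk to
MySQL. The table "posts", its columns and both function names are
assumptions made for the example:

```php
<?php
// Assumed schema: CREATE TABLE posts (author VARCHAR(255), body TEXT)
// Connecting would look roughly like this (placeholder credentials):
//   $db = new PDO('mysql:host=localhost;dbname=forum', 'user', 'password');

// Hypothetical helper: insert rows shaped like ['author' => ..., 'body' => ...].
function store_posts(PDO $db, array $posts): void
{
    // A prepared statement keeps post text from breaking the SQL.
    $stmt = $db->prepare('INSERT INTO posts (author, body) VALUES (?, ?)');
    foreach ($posts as $post) {
        $stmt->execute([$post['author'], $post['body']]);
    }
}

// Hypothetical helper: authors ordered by number of posts, highest first.
function top_posters(PDO $db, int $limit = 10): array
{
    $sql = 'SELECT author, COUNT(*) AS post_count FROM posts'
         . ' GROUP BY author ORDER BY post_count DESC LIMIT ' . (int) $limit;

    return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}
```

The GROUP BY query is what gives you the "most number of posts" style of
report once the rows are in.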
Oh, Snoopy is at snoopy.sf.net, HTTP_Client is in PEAR, and cURL is
documented at php.net/curl
Regards,
Burhan
--
PHP General Mailing List (http://www.php.net/)