Re: saving outside website content via php...

Boyd, Todd M. wrote:
-----Original Message-----
From: blackwater dev [mailto:blackwaterdev@xxxxxxxxx]
Sent: Sunday, June 01, 2008 9:26 PM
To: Shawn McKenzie
Cc: php-general@xxxxxxxxxxxxx
Subject: Re:  saving outside website content via php...

Yes, but file_get_contents() will get me the code, which I could then echo back out to the browser; that wouldn't give me any external images, CSS files, or JS.

Use the RegEx examples for tag-grabbing that appeared in this mailing
list last week. Parse whatever text is returned from cURL and find any
*.css links, *.js links, *.jpg/gif/png/etc. images, and then use cURL
once again to download and save them.

I'm sorry if you were hoping for some "magic function" that will do all
of that for you, but there is none. There may very well be some
pre-packaged solutions to your problem, but I don't know of any
off-hand.

Seriously, though: think like a programmer! You can get the text, and
the links to the elements you want to save are in the text. Parse it!
Parse it for all you're worth!
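
Something like this rough, untested sketch shows the idea (fetch_url() and save_page_assets() are just names I made up, not anything from last week's thread) -- fetch the page with cURL, regex out the asset URLs, fetch each one:

<?php
// Fetch a URL with cURL and return the body, or false on failure.
function fetch_url($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// Save a page plus any .css/.js/image files it links to.
// Regex parsing only -- it will miss @import rules, relative
// paths that need real URL resolution, and so on.
function save_page_assets($page_url, $dir) {
    $html = fetch_url($page_url);
    if ($html === false) {
        return false;
    }
    file_put_contents("$dir/index.html", $html);

    // Match src="..." or href="..." ending in a known asset extension.
    preg_match_all(
        '/(?:src|href)=["\']([^"\'>]+\.(?:css|js|jpe?g|gif|png))["\']/i',
        $html,
        $matches
    );

    foreach (array_unique($matches[1]) as $asset) {
        // Crude absolute-URL fixup; good enough for a sketch.
        $asset_url = (strpos($asset, 'http') === 0)
            ? $asset
            : rtrim($page_url, '/') . '/' . ltrim($asset, '/');
        $data = fetch_url($asset_url);
        if ($data !== false) {
            file_put_contents($dir . '/' . basename($asset), $data);
        }
    }
    return true;
}

save_page_assets('http://www.example.com/', '/tmp/mirror');
?>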


Todd Boyd
Web Programmer

That is one way if you're using all PHP. If you have access to wget (most Linux systems do), then you can do it in one shell command.

'wget --convert-links -r http://www.example.com/' will get all files recursively, save them and convert the links to point to the local files.

Also look at the --mirror option. Lots of options/possibilities with wget. There are bound to be some PHP classes that wrap wget somewhere.
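
If you do want to drive it from PHP, a bare-bones wrapper is just an escaped shell call. wget_mirror() below is my own name for it, not an existing class:

<?php
// Mirror a site by shelling out to wget; assumes the wget
// binary is available on the PATH.
function wget_mirror($url, $dir) {
    $cmd = sprintf(
        'wget --convert-links -r -P %s %s 2>&1',
        escapeshellarg($dir),   // -P sets the download directory
        escapeshellarg($url)
    );
    exec($cmd, $output, $status);
    return $status === 0;       // wget exits 0 on success
}

wget_mirror('http://www.example.com/', '/tmp/mirror');
?>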

-Shawn

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

