Re: creating of html-archive

"Richard Lynch" <ceo@xxxxxxxxx> · Fri, 26 Aug 2005 19:16:43 -0500 (CDT)

On Thu, August 25, 2005 1:05 pm, Rory Browne wrote:
> At the risk of making a complete and utter ass of myself, I'm going to
> disagree with Richard.
>
> I'm going to justify this by the fact that the file_get_contents
> function is written in C, and performs the function required, which is
> currently performed by wget.

I may be reading more into it than was there, but I *THOUGHT* Michelle
was using wget to not just snag one URL, but all the crap (images,
JavaScript, CSS) required to render that URL, but not the links to
external sites/resources.

That URL computation (absolute versus relative URLs, domain names and
sub-domains, working out which resources are needed, and snarfing all
of the ones you need but none of the others) is what I wouldn't try to
re-invent in PHP, when wget does it *SO* well.

If you only need the contents of ONE URL, file_get_contents is a beauty.
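
For the one-URL case, a minimal sketch might look like the following.
The helper name fetch_url and the 10-second timeout are my own
assumptions, not anything from the thread; it also assumes
allow_url_fopen is enabled and a PHP version with nullable return types
(7.1 or later).

```php
<?php
// Hypothetical helper: return the body of a single URL, or null on failure.
function fetch_url(string $url, int $timeoutSeconds = 10): ?string
{
    // The 'http' options apply to http:// and https:// URLs only;
    // they are ignored for other stream wrappers such as local files.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // file_get_contents() returns false on failure and emits a warning,
    // which the @ operator silences here.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}

// Usage (example.com is a placeholder):
// $html = fetch_url('http://example.com/');
```

Note that this fetches the HTML only; none of the images, CSS or
JavaScript the page refers to come along with it.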

If those contents require SSL or Cookie access, http://php.net/curl is
your answer.
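
As a rough sketch of the curl route, assuming the curl extension is
loaded: the function names, the cookie-file path and the 15-second
timeout below are illustrative choices of mine, not part of the
original discussion.

```php
<?php
// Hypothetical helper: cURL options for an HTTPS request that keeps cookies.
function curl_options_with_cookies(string $url, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow HTTP redirects
        CURLOPT_COOKIEJAR      => $cookieFile, // file cookies are saved to afterwards
        CURLOPT_COOKIEFILE     => $cookieFile, // file cookies are loaded from
        CURLOPT_SSL_VERIFYPEER => true,        // verify the server certificate
        CURLOPT_TIMEOUT        => 15,          // assumed limit, in seconds
    ];
}

// Hypothetical helper: fetch one URL with the options above, or null on failure.
function fetch_with_curl(string $url, string $cookieFile): ?string
{
    $handle = curl_init();
    curl_setopt_array($handle, curl_options_with_cookies($url, $cookieFile));

    // With CURLOPT_RETURNTRANSFER set, curl_exec() returns the body
    // as a string, or false on failure.
    $body = curl_exec($handle);

    return $body === false ? null : $body;
}

// Usage (URL and path are placeholders):
// $html = fetch_with_curl('https://example.com/members/', '/tmp/cookies.txt');
```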

If you need, as I believe Michelle does, to suck down all the relevant
junk from a whole page of HTML, I still believe wget +
http://php.net/exec is going to out-perform, and be MUCH easier to
maintain than, a PHP solution that does all that.
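
A hedged sketch of the wget + exec combination: the function names and
the target directory are hypothetical, and it assumes wget is on the
PATH of the user PHP runs as. As far as I know, --page-requisites
without --span-hosts stays on the original host, which matches the "not
the external stuff" requirement, but check that against your wget
version.

```php
<?php
// Hypothetical helper: build a wget command that grabs one page plus the
// images, CSS and JavaScript needed to render it.
function build_wget_command(string $url, string $targetDir): string
{
    return 'wget --no-verbose'
        . ' --page-requisites'   // also fetch images, CSS, scripts the page needs
        . ' --convert-links'     // rewrite links so the local copy works offline
        . ' --directory-prefix=' . escapeshellarg($targetDir)
        . ' ' . escapeshellarg($url); // always escape anything user-supplied
}

// Hypothetical helper: run the command; true if wget exited with status 0.
function archive_page(string $url, string $targetDir): bool
{
    exec(build_wget_command($url, $targetDir), $outputLines, $exitCode);

    // wget exits non-zero if any download failed, so one missing image
    // makes this return false even when the page itself was saved.
    return $exitCode === 0;
}

// Usage (URL and directory are placeholders):
// archive_page('http://example.com/page.html', '/tmp/archive');
```

All the absolute/relative URL juggling stays inside wget; the PHP side
is only responsible for escaping its arguments and checking the exit
code.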

-- 
Like Music?
http://l-i-e.com/artists.htm