Re: capture a webpage to later process it

John Hicks wrote:
J. Alejandro Ceballos Z. -JOAL- wrote:

I want to read the output of a URL so that I can process it later and insert parts of it into my own page.

If I use include or require, they insert ALL of the resulting code, but I want to do something like:


blah, blah, blah....
<?php
// somephpfunc() is a placeholder for whatever function fetches the page
$result_webpage = somephpfunc('http://other.sit/externalpage.html');
if (preg_match('/result:([[:alnum:]]+).*?([[:alnum:]]+\.jpg)/is', $result_webpage, $array_match)) {
	echo "<h2>External status:".$array_match[1]."<br>image: <img src=\"".$array_match[2]."\"></h2>";
}
?>
.... blah, blah, blah

If you have fopen wrappers enabled (see http://us2.php.net/manual/en/ref.filesystem.php#ini.allow-url-fopen) then you can simply use file_get_contents() to read the web page into a string. You can then manipulate it with regexes like so:

$Url = 'http://www.php.net';
$ThePageContents = file_get_contents($Url);
$TheNewPageContents = preg_replace('/PHP/', 'Ruby :)', $ThePageContents);
echo $TheNewPageContents;
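
Applied to the original question, the same approach might look like the sketch below. It is not tested against any real page: the helper names extract_status() and fetch_page() and the sample page body are made up for illustration, and the pattern assumes the remote page contains "result:" followed by a word and, later on, a .jpg file name.

```php
<?php
// Hypothetical helper: pull the status word and the image name out of page text.
// Returns null when the expected "result:" pattern is not present.
function extract_status(string $html): ?array
{
    // [[:alnum:]] is the POSIX class; .*? is lazy so the whole file name is captured.
    $pattern = '/result:([[:alnum:]]+).*?([[:alnum:]]+\.jpg)/is';
    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }
    return ['status' => $m[1], 'image' => $m[2]];
}

// Hypothetical helper: fetch a page, returning null instead of false on failure.
// Needs allow_url_fopen to be enabled, as noted above.
function fetch_page(string $url): ?string
{
    $contents = @file_get_contents($url);
    return $contents === false ? null : $contents;
}

// Offline demonstration with a made-up page body.
$sample = "<p>result:OK</p>\n<img src=\"chart42.jpg\">";
$found = extract_status($sample);
if ($found !== null) {
    echo "<h2>External status: " . htmlspecialchars($found['status'])
        . "<br>image: <img src=\"" . htmlspecialchars($found['image']) . "\"></h2>\n";
}
```

In a real page you would pass the result of fetch_page('http://other.sit/externalpage.html') to extract_status() and check both return values for null before echoing anything.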

--J


Here's a more useful application of the same idea:

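<?php
// Read the target from the query string; register_globals can't be relied on.
$Url = isset($_GET['Url']) ? $_GET['Url'] : null;
if ($Url !== null) {
	$ThePageContents = file_get_contents($Url);
	// Insert a <base> tag right after the opening <head> tag.
	// '$1' is single-quoted on purpose: "\1" in double quotes is an octal escape.
	$TheNewPageContents =
	preg_replace(
		'/(<head\b[^>]*>)/i',
		'$1<base href="' . htmlspecialchars($Url, ENT_QUOTES) . '/" />',
		$ThePageContents,
		1);
	echo $TheNewPageContents;
} else {
	$Self = htmlspecialchars($_SERVER['PHP_SELF']);
	$Host = htmlspecialchars($_SERVER['SERVER_NAME']);
	echo "Enter a URL as a query string in this URL, e.g.:
	<br /><a href=\"$Self?Url=http://www.yahoo.com\">
	http://$Host$Self?Url=http://www.yahoo.com</a>";
}
?>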

This allows you to run your own rather sloppy proxy. Just plug the URL you want into the query string for your page (or, better still, make a form to submit it):

https://mydomain.com/mypage.php?Url=http://DomainIWantToView.com/PageIWantToView.html

The regex adds a <base> tag to the remote web page to make the images and links work.
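
If you want the <base> insertion to be a little less sloppy, something like the following sketch might do. The function names is_http_url() and add_base_tag() are made up for illustration. It makes two choices of its own that the script above does not: it only accepts http:// and https:// URLs, since file_get_contents() will happily read local files too, and it uses the page URL itself as the base rather than appending a slash.

```php
<?php
// Hypothetical helper: accept only http:// and https:// URLs, so that a value
// such as "/etc/passwd" or "file:///..." never reaches file_get_contents().
function is_http_url(string $url): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true);
}

// Hypothetical helper: insert <base href="..."> right after the first <head> tag.
// A callback is used so that "$1" or "\1" inside the URL is never treated
// as a backreference in the replacement.
function add_base_tag(string $html, string $url): string
{
    $base = '<base href="' . htmlspecialchars($url, ENT_QUOTES) . '" />';
    return preg_replace_callback(
        '/<head\b[^>]*>/i',   // \b keeps <header> from matching
        function (array $m) use ($base): string {
            return $m[0] . $base;
        },
        $html,
        1
    ) ?? $html;               // fall back to the untouched page on a regex error
}

// Offline demonstration with a made-up page.
$page = '<html><head><title>t</title></head><body><img src="a.png"></body></html>';
echo add_base_tag($page, 'http://example.com/dir/page.html'), "\n";
```

In the proxy script you would call is_http_url($Url) before fetching anything, and echo add_base_tag($ThePageContents, $Url) in place of the preg_replace() result.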

But of course, that means your browser fetches the images, CSS, JS, etc. directly, so those GET requests all show up with your workstation's IP in the remote server's log (and in your boss's log of your browsing). You haven't really accomplished much :(

But it's kind of fun, huh?

--
PHP Database Mailing List (http://www.php.net/)