Re: creating of html-archive


At the risk of making a complete and utter ass of myself, I'm going to
disagree with Richard.

My justification is that the file_get_contents() function is written
in C and does the job that wget is currently doing for you.

On 8/25/05, Michelle Konzack <linux4michelle@xxxxxxxxxx> wrote:
> Hello,
> 
> Currently I do it with wget and by hand using a bash script,
> but I would like to integrate it into my php4 web interface.
> 
> What I need is:
> 
>    1)  INPUT-Form where I can type the URL of
>        a html/php (or something like this) page.

I assume you know the HTML needed to create a web form, and how to use
the $_GET and $_POST variables. If not, go learn PHP, and then read the
rest of my reply.
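
If a starting point for reading the submitted URL helps, here is a rough
sketch. It assumes PHP 5.2 or later (for filter_var) and a form field
called "url"; the function name get_submitted_url is made up for this
example and is not part of PHP.

```php
<?php
// Hypothetical helper: return a validated http(s) URL from form data, or null.
function get_submitted_url(array $form)
{
    // 'url' is an assumed form field name.
    if (!isset($form['url']) || !is_string($form['url'])) {
        return null;
    }
    $url = trim($form['url']);
    // Reject anything that does not parse as a URL at all.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    // Only allow http and https, so nobody can ask for file:// or ftp://.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return ($scheme === 'http' || $scheme === 'https') ? $url : null;
}
```

You would call it as get_submitted_url($_POST) and stop with an error
message if it returns null.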

> 
> when submitted,
> 
>    2)  the php script downloads the page and creates an md5sum

Assuming that allow_url_fopen is enabled, you can do:

$content = file_get_contents($url);  // returns false on failure
if ($content === false) die("Could not fetch $url");
$md5hash = md5($content);



>    3)  look in a database where it checks the whole URL, whether
>        it is already there, and if
>        YES check the md5sum

What DB are you using?

>            3.1)    if equal drop the URL and stop here
>            3.2)    if different calculate original md5sum
>                    and insert it into database
>        NO  calculate original md5sum and insert it into database
> 
> up to here it is working fine.
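
Whatever database it turns out to be, the decision in step 3 boils down
to something like the sketch below. The function name archive_action
and its return values are invented for illustration; $storedHash is
assumed to be whatever your lookup returns for the URL, or null when
the URL is not in the database yet.

```php
<?php
// Hypothetical decision helper for step 3.
// $storedHash: md5 already stored for this URL, or null if the URL is unknown.
// $newHash:    md5 of the freshly downloaded page.
function archive_action($storedHash, $newHash)
{
    if ($storedHash === null) {
        return 'insert';   // NO branch: URL not seen before
    }
    if ($storedHash === $newHash) {
        return 'skip';     // 3.1: content unchanged, drop the URL
    }
    return 'update';       // 3.2: content changed, store the new hash
}
```

Keeping this separate from the SQL means the same logic works no matter
which database layer you end up using.
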
> 
>    4)  now get all FULL URIs from the page requisites
> 
> *PAFF*
> 
> How can this be done ?
> 
> Please note that the files should be renamed to md5 hashes and
> reinserted into the original page. Then all files are saved into ONE
> directory with names as md5 hashes.
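
For step 4, one possible approach (only a sketch, not a tested
archiver) is to pull the src/href attributes out of the page, resolve
them against the page URL, and swap each reference for the md5 of the
downloaded file. Everything named here (resolve_url,
extract_requisites, rewrite_page) is made up for the example. It
assumes PHP 5, uses a regex where a real HTML parser would be more
robust, only looks at img, script and link tags, and does not handle
"../" segments. The plain str_replace will also replace a reference
wherever else the same string appears in the page.

```php
<?php
// Resolve $ref against the page URL $base (simplified: no "../" handling).
function resolve_url($base, $ref)
{
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $ref)) {
        return $ref;                                   // already a full URI
    }
    $p = parse_url($base);
    $root = $p['scheme'] . '://' . $p['host']
          . (isset($p['port']) ? ':' . $p['port'] : '');
    if (substr($ref, 0, 2) === '//') {
        return $p['scheme'] . ':' . $ref;              // scheme-relative
    }
    if (substr($ref, 0, 1) === '/') {
        return $root . $ref;                           // root-relative
    }
    $path = isset($p['path']) ? $p['path'] : '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);  // directory of the page
    return $root . $dir . $ref;
}

// Collect src of <img>/<script> and href of <link> tags.
// Returns an array mapping the raw reference to its full URI.
function extract_requisites($html, $baseUrl)
{
    $pattern = '#<(?:img|script)\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']'
             . '|<link\b[^>]*?\bhref\s*=\s*["\']([^"\']+)["\']#i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    $urls = array();
    foreach ($matches as $match) {
        $ref = !empty($match[2]) ? $match[2] : $match[1];
        $urls[$ref] = resolve_url($baseUrl, $ref);
    }
    return $urls;
}

// Replace each reference in the page with the md5 of the fetched content.
// $fetch is any callable that takes a URL and returns its content or false.
function rewrite_page($html, array $urls, $fetch)
{
    foreach ($urls as $ref => $full) {
        $data = call_user_func($fetch, $full);
        if ($data === false) {
            continue;                                  // leave reference as-is
        }
        $html = str_replace($ref, md5($data), $html);
    }
    return $html;
}
```

With allow_url_fopen enabled you could pass 'file_get_contents' as
$fetch, and write each downloaded file to your one directory under its
md5 name inside the same loop.
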
> 
> Note:   I am talking about (currently) 127.000.000 files.
>        It is currently on a RAID-5 with 7 x 147 GByte, but because of
>        a major hardware upgrade to 15 x 300 GByte the number
>        of files will increase.
> 
>        Currently I do not know whether I should use ONE RAID with
>        15 HDDs, TWO with 7 HDDs, three with 5 HDDs or 5 with 3 HDDs.
> 
>        Maybe I will run into performance problems with the inodes,
>        which I already have... (I think)
> 
> Greetings
> Michelle
> 
> --
> Linux-User #280138 with the Linux Counter, http://counter.li.org/
> Michelle Konzack   Apt. 917                  ICQ #328449886
>                   50, rue de Soultz         MSM LinuxMichi
> 0033/3/88452356    67100 Strasbourg/France   IRC #Debian (irc.icq.com)
> 
> 
>

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


