Hello,

Currently I do this by hand with wget and a bash script, but I would
like to integrate it into my PHP4 web interface.

What I need is:

1) An input form where I can type the URL of an HTML/PHP (or similar)
   page.

   When submitted,

2) the PHP script downloads the page and creates an md5sum,

3) then looks in a database to check whether the whole URL is already
   there, and

   if YES, checks the md5sum:
       3.1) if equal, drop the URL and stop here
       3.2) if different, calculate the original md5sum and insert it
            into the database
   if NO, calculate the original md5sum and insert it into the database.

Up to here it is working fine.

4) Now get all FULL URIs of the page requisites.

*PAFF*

How can this be done?

Please note that the files should be renamed to their md5 hashes and
the new names reinserted into the original page. Then all files should
be saved into ONE directory, with the md5 hashes as file names.

Note: I am talking about (currently) 127,000,000 files. They are
      currently on a RAID-5 with 7 x 147 GByte, but because of a major
      hardware upgrade to 15 x 300 GByte the number of files will
      increase.

Currently I do not know whether I should use ONE RAID with 15 HDDs,
TWO with 7 HDDs, three with 5 HDDs or five with 3 HDDs. Maybe I will
run into performance problems with the inodes, which I already
have... (I think)

Greetings
Michelle

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack   Apt. 917                  ICQ #328449886
                   50, rue de Soultz         MSM LinuxMichi
0033/3/88452356    67100 Strasbourg/France   IRC #Debian (irc.icq.com)
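To make step 4 and the renaming requirement concrete, here is a minimal sketch of the idea. It is written in Python rather than PHP4, only to illustrate the logic. The names `REQUISITE_RE`, `md5_name`, `rewrite_page` and the `fetch` callback are made up for this example, and the regular expression is a simplification that only looks at src/href attributes of a few tags; a real HTML parser would be more robust.

```python
import hashlib
import re
from urllib.parse import urljoin

# Simplified pattern for page requisites: src/href attributes of tags that
# load something into the page. Plain <a href> links are deliberately skipped.
REQUISITE_RE = re.compile(
    r'''(?P<prefix><(?:img|script|link|embed|iframe)\b[^>]*?\b(?:src|href)\s*=\s*)'''
    r'''(?P<quote>["'])(?P<url>.*?)(?P=quote)''',
    re.IGNORECASE | re.DOTALL,
)


def md5_name(data):
    """Return the md5 hex digest of the content, used as the new file name."""
    return hashlib.md5(data).hexdigest()


def rewrite_page(html, base_url, fetch):
    """Replace every requisite reference in html by the md5 name of its content.

    fetch is a caller-supplied function (hypothetical) that takes a full URL
    and returns the content as bytes. Returns (new_html, files), where files
    maps each md5 name to the content to be saved under that name.
    """
    files = {}

    def replace(match):
        # Turn a relative reference into a FULL URI using the page URL.
        full_url = urljoin(base_url, match.group("url"))
        data = fetch(full_url)
        name = md5_name(data)
        files[name] = data
        quote = match.group("quote")
        return match.group("prefix") + quote + name + quote

    new_html = REQUISITE_RE.sub(replace, html)
    return new_html, files


if __name__ == "__main__":
    # Fake "download" table so the example runs without network access.
    store = {
        "http://example.org/dir/pics/a.png": b"AAA",
        "http://example.org/js/b.js": b"BBB",
    }
    page = '<img src="pics/a.png"><script src="/js/b.js"></script>'
    new_page, saved = rewrite_page(
        page, "http://example.org/dir/page.html", store.__getitem__
    )
    print(new_page)
    print(sorted(saved))
```

The rewritten page and every entry of `files` could then be written into the one target directory under their md5 names. Since the name is derived from the content, identical files referenced from different pages end up under the same name.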