Re: Handling (very) large files with PHP

> *Handling (very) large files with PHP*
>
> Hello, I am planning a project in PHP, and I have a few unsolved issues
> that I'd like your help with...
>
> The project will start by loading a file of about 50GB.
> The file has many objects that follow a pattern, for example,
>
>
> Name: Joe
> Joe likes to eat
> -------------------
> Name: Daniel
> Daniel likes to ask question on the PHP Mailing List
>
>
> Anyway, so, I am going to convert it into a database, and I insist on using
> PHP for this.
>
> So the questions are,
> How would I open the file? Will fopen with fread($file, 1024) work? If so,
> how would I find the separator, "------------------", without taking too
> many resources?
> I'll have a dedicated server for this project so I could use exec, so I am
> wondering if I should use exec to split the file?
> How many hours or days do you think it will take me to insert all of the
> data, if I have about 8,000,000,000 (8 billion/milliard) entries (objects)?
>
> After I insert all the data, I'll have to start working with it as well -
> for example, having a list of all people and what comes after the word
> "likes" in their entry.
>
> What do you suggest? I am concerned I might not be able to fully accomplish
> both high speed when working with the data (example above) and high speed
> when viewing the data and adding more "works" (as stated above) with PHP.
> What do you think?
> Inserting into the database, after considering it, will probably be done
> with C. But if I wish to work with the data afterwards - will PHP be good?
>
> What database should I use for so much info?
>
>
> Thanks, Daniel
>

Hi Daniel,

Interesting task you've got there.
The code below should handle the parsing. It reads the file one line at a
time, so memory use stays small no matter how big the file is.

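// fgets() streams one line at a time, so the 50GB file is never held in memory.
$fh = fopen('data.txt', 'r');
if ($fh === false) {
  die('Cannot open data.txt');
}
$row = array();
while (($line = fgets($fh)) !== false) {
  // fgets() keeps the trailing newline, so strip it before comparing.
  $line = rtrim($line, "\r\n");
  if ($line !== '' && trim($line, '-') === '') {
    // Separator line (dashes only): save $row into database
    $row = array();
    continue;
  }
  if (strpos($line, 'Name: ') === 0) {
    // Drop the 6-character 'Name: ' prefix.
    $row['name'] = substr($line, 6);
    continue;
  }
  if (isset($row['name']) && $line !== '') {
    $row['likes'] = str_replace($row['name'] . ' likes ', '', $line);
  }
}
if (!empty($row)) {
  // The last record has no separator after it: save $row here as well
}
fclose($fh);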

Your database fields should include:
id --> autoincrement, primary
name --> varchar(64)
likes --> text

Now you should be able to run a simple SELECT query that pulls a list of
names where likes is not empty.
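
Here is a rough sketch of the saving and querying side, in case it helps.
It is only an illustration under a few assumptions of mine: SQLite in memory
stands in for whatever database you end up choosing, the table name "people"
and the helper saveRows() are made up, and the batch size is arbitrary.
Wrapping many inserts in one transaction and reusing a prepared statement
usually helps a lot compared with committing row by row.

```php
<?php
// Assumption: in-memory SQLite is a stand-in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Table with the three fields listed above ("people" is a made-up name).
$pdo->exec('CREATE TABLE people (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name VARCHAR(64) NOT NULL,
    likes TEXT
)');

// Hypothetical helper: insert rows using one prepared statement,
// committing one transaction per batch instead of one per row.
function saveRows(PDO $pdo, iterable $rows, int $batchSize = 1000): int
{
    $stmt = $pdo->prepare('INSERT INTO people (name, likes) VALUES (?, ?)');
    $count = 0;
    $pdo->beginTransaction();
    foreach ($rows as $row) {
        $stmt->execute([$row['name'], $row['likes'] ?? '']);
        if (++$count % $batchSize === 0) {
            $pdo->commit();
            $pdo->beginTransaction();
        }
    }
    $pdo->commit(); // commit the final, possibly partial, batch
    return $count;
}

$saved = saveRows($pdo, [
    ['name' => 'Joe', 'likes' => 'to eat'],
    ['name' => 'Daniel', 'likes' => 'to ask question on the PHP Mailing List'],
    ['name' => 'Nobody', 'likes' => ''],
], 2);

// Names whose likes field is not empty.
$names = $pdo->query("SELECT name FROM people WHERE likes <> '' ORDER BY id")
             ->fetchAll(PDO::FETCH_COLUMN);
print_r($names);
```

In the parsing loop, the two "save $row into database" spots would hand the
row to something like saveRows(), or collect rows into a batch first.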
Please clock it for me.
I would like to know how long it takes, but the code should be fast.
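
If you want a number before committing to the full run, one way to clock it
is to time a sample and scale up. The sketch below is just that, a sketch:
timeSample() and estimateHours() are names I made up, the loop inside the
closure is a placeholder for the real parse-and-insert loop, and the rate in
the last line is invented for the arithmetic, not a measurement.

```php
<?php
// Hypothetical helper: run $work (which must return the number of
// records it processed) and report that count plus elapsed seconds.
function timeSample(callable $work): array
{
    $start = microtime(true);
    $count = $work();
    return [$count, microtime(true) - $start];
}

// Hypothetical helper: scale a timed sample up to the full data set.
function estimateHours(float $totalRecords, int $sampleRecords, float $sampleSeconds): float
{
    $recordsPerSecond = $sampleRecords / $sampleSeconds;
    return $totalRecords / $recordsPerSecond / 3600;
}

// Placeholder workload; replace the closure body with the real loop.
list($count, $seconds) = timeSample(function () {
    $n = 0;
    for ($i = 0; $i < 100000; $i++) {
        $n++;
    }
    return $n;
});
printf("%d records in %.4f s\n", $count, $seconds);

// Invented rate for illustration: 100,000 records in 10 seconds.
printf("%.1f hours\n", estimateHours(8000000000, 100000, 10.0));
```

With that invented rate of 10,000 records per second, 8 billion records work
out to about 222 hours, a little over 9 days. Your real rate may be very
different, which is why a timed sample is worth doing first.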
Thanks and good luck.

Virgil
http://www.jampmark.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

