Re: Parsing a large file

On Fri, Jan 13, 2006 at 04:47:11PM -0600, Jay Paulson wrote:
> > On Fri, January 13, 2006 3:33 pm, Jay Paulson wrote:
> >> $buf = "";
> > 
> > Probably better to initialize it to an empty array();...
> 
> Yep right.
>  
> >> while (!feof($fhandle)) {
> >>     $buf[] = fgets($fhandle);
> > 
> > ... since you are going to initialize it to an array here anyway.
> > 
> >>     if ($i++ % 10 == 0) {
> > 
> > Buffering 10 lines of text in PHP is probably not going to make a
> > significant difference...
> 
> This is true.  It's what I have written to start with.  Basically I'm just
> trying to make sure that I'm not hogging system memory with a huge file b/c
> there are other apps running at the same time that need system resources as
> well.  That's the main reason why I'm using a buffer to read the file in and
> parse it a little at a time.  By all means test it out on your hardware and
> see what that buffer needs to be.

I'd tend to go with Richard's suggestion. You say you are worried
about resources and memory? Well, when you load those 10 lines of
text, where do they go? Memory.

If resources and memory are an issue, there are a couple of options I
would suggest, given that the real bottleneck is disk I/O and CPU
usage.

  1) Inside the loop (while reading one line at a time), call
     usleep(). This prevents heavy disk access and lets the CPU
     catch up with processing.

  2) 'nice' the application: run php under nice so that it gets a
     lower CPU scheduling priority.
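
Option 1 could look roughly like the sketch below. The function name
processFileThrottled(), the $handleLine callback and the $pauseMicros
parameter are made up for illustration; the 500 microsecond default is
only a starting point to tune on your own hardware.

```php
<?php
// Sketch only: read a file one line at a time and pause between reads.
// processFileThrottled(), $handleLine and $pauseMicros are hypothetical names.
function processFileThrottled(string $path, callable $handleLine, int $pauseMicros = 500): int
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("cannot open $path");
    }

    $count = 0;
    // fgets() returns false at end of file, so only one line is held at a time.
    while (($line = fgets($fp)) !== false) {
        $handleLine($line);
        $count++;
        if ($pauseMicros > 0) {
            usleep($pauseMicros); // give up the CPU and the disk for a moment
        }
    }

    fclose($fp);
    return $count; // number of lines handed to the callback
}
```

For option 2 the script does not change at all; it would simply be
started as something like "nice -n 19 php yourscript.php", where
yourscript.php is a placeholder name.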

If you want to test how usleep() and nice behave, here are some
sample scripts to benchmark with:

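// These loops never exit on their own; watch them in top and stop
// them with Ctrl-C.

// CPU usage: try with and without nice
  while (1) {}
vs.
  while (1) { usleep(500); }  // sleep 500 microseconds per pass

// Disk I/O: try with and without nice
  $fp = fopen('/var/log/messages', 'r') or die('boo');
  while (1) {
    $line = fgets($fp);        // read the first line...
    fseek($fp, 0, SEEK_SET);   // ...then rewind and do it again
  }
vs.
  $fp = fopen('/var/log/messages', 'r') or die('boo');
  while (1) {
    $line = fgets($fp);
    fseek($fp, 0, SEEK_SET);
    usleep(500);               // pause between reads
  }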
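
If an endless loop is awkward to compare, a bounded variant along
these lines might be easier to work with. It is only a sketch:
benchReadLoop() and cpuSeconds() are made-up names, and it assumes
getrusage() is available on your platform.

```php
<?php
// Sketch only: run the read/rewind loop a fixed number of times and
// report wall-clock and CPU time. All names here are hypothetical.
function cpuSeconds(): float
{
    $ru = getrusage();
    // user time + system time consumed by this process so far
    return $ru['ru_utime.tv_sec'] + $ru['ru_utime.tv_usec'] / 1e6
         + $ru['ru_stime.tv_sec'] + $ru['ru_stime.tv_usec'] / 1e6;
}

function benchReadLoop(string $path, int $iterations, int $pauseMicros = 0): array
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("cannot open $path");
    }

    $wallStart = microtime(true);
    $cpuStart  = cpuSeconds();

    for ($i = 0; $i < $iterations; $i++) {
        fgets($fp);               // read the first line
        fseek($fp, 0, SEEK_SET);  // rewind, as in the loops above
        if ($pauseMicros > 0) {
            usleep($pauseMicros);
        }
    }

    $result = [
        'wall' => microtime(true) - $wallStart,
        'cpu'  => cpuSeconds() - $cpuStart,
    ];
    fclose($fp);
    return $result;
}
```

Calling it once with $pauseMicros = 0 and once with 500, with and
without nice, gives two pairs of numbers to compare instead of
eyeballing top.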

Like Richard said, there are much easier ways to make the app less
resource-intensive than trying to juggle I/O, memory, and CPU from
within PHP.

Curt.
-- 
cat .signature: No such file or directory

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

