On Sat, 2008-04-05 at 19:09 +0100, Steve McGill wrote:
> "Richard Heyes" <richardh@xxxxxxxxxxx> wrote in message
> news:47F75D2A.3020101@xxxxxxxxxxxxxx
> >> Thanks for the heads up on fgetc() incrementing by one. I hadn't actually
> >> tested that code yet, I was using the original fseek($handle,$pos).
> >>
> >> strpos would be ideal but it needs to work on a string and not a file - I
> >> don't want to load a 100Mb file into memory if I don't have to. Perhaps I
> >> should test how quick the fgets() and ftell() method is because at least
> >> it loads in one line at a time.
> >>
> >> Does anybody know any other ways to go about the problem?
> >
> > Haven't read the rest of the thread, and so going by the subject alone,
> > fgets() finishes when it encounters a newline, so you can use this
> > wondrous fact to seek to a specific line:
> >
> > <?php
> >     $fp = fopen('filename', 'r');
> >     $num = 18; // Desired line number
> >
> >     for ($i=0; $i<$num; $i++)
> >         $line = fgets($fp);
> >
> >     echo $line;
> > ?>
> >
> > It works because fgets() stops when it encounters a newline (\n). So it's
> > just a case of counting the calls to fgets().
>
> fgets() would work but as I'm constantly jumping around a 500,000 line file
> I thought it was better to maintain a cache of line number positions.
>
> As a final update to anybody following:
>
> - Taking away the unnecessary fseek() made the script execute in 63 seconds
> - Using a buffer system, (reading in 1Mb of the text file at a time and then
> looping through the string in memory) made the script execute in 36 seconds.
> Huge improvement, but...
> - Porting the code to C++, doing a shell_exec and reading the results back
> in to PHP, took less than 2 seconds.
>
> As fgetc() etc are all effectively C wrappers I was quite surprised at the
> speed increase....

It really depends on how you write your code... I ran the following
script on a 150 meg text log file containing 1905883 lines in 4 seconds
(note that it performs caching).
Here's the script:

<?php

$path = $argv[1];

if( ($fPtr = fopen( $path, 'r' )) === false )
{
    echo "Couldn't open for reading: $path\n";
    exit();
}

// Cache: line number (1-based) => byte offset where that line starts.
$line = 1;
$lines[$line] = 0;

// After each fgets() the file position is the start of the next line.
while( fgets( $fPtr ) !== false )
{
    $lines[++$line] = ftell( $fPtr );
}

fclose( $fPtr );

?>

Here are the run times on several iterations (Athlon 2400+):

real 0m4.065s
user 0m3.488s
sys  0m0.464s

real 0m4.005s
user 0m3.464s
sys  0m0.436s

real 0m5.816s
user 0m3.336s
sys  0m0.536s

real 0m3.994s
user 0m3.384s
sys  0m0.504s

real 0m4.069s
user 0m3.512s
sys  0m0.444s

real 0m4.009s
user 0m3.344s
sys  0m0.552s

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
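
For anyone following the thread: the script in this message only builds the cache of line offsets. A rough sketch of how such a cache could then be used to jump straight to a given line is below. The function names buildLineIndex() and readLine() are made up for illustration and do not appear in the thread; the sketch also drops the final cache entry, on the assumption that the end-of-file offset is not wanted as a line of its own.

```php
<?php

// Hypothetical helper: build a 1-based map of line number => byte offset,
// using the same fgets()/ftell() loop as the script in the post.
function buildLineIndex( $path )
{
    if( ($fPtr = fopen( $path, 'r' )) === false )
    {
        throw new RuntimeException( "Couldn't open for reading: $path" );
    }

    $line = 1;
    $lines = array( $line => 0 );

    while( fgets( $fPtr ) !== false )
    {
        $lines[++$line] = ftell( $fPtr );
    }

    fclose( $fPtr );

    // The last entry is the end-of-file position, not the start of a line.
    unset( $lines[$line] );

    return $lines;
}

// Hypothetical helper: seek to the cached offset of line $num and read it.
// Returns the line without its trailing newline, or null if it doesn't exist.
function readLine( $fPtr, $lines, $num )
{
    if( !isset( $lines[$num] ) )
    {
        return null;
    }

    fseek( $fPtr, $lines[$num] );
    $text = fgets( $fPtr );

    return $text === false ? null : rtrim( $text, "\r\n" );
}

?>
```

With the index built once, each lookup is a single fseek() followed by a single fgets(), so jumping around the file no longer means re-reading it from the top. The index itself costs one integer per line, which should be checked against available memory for very large files.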