I have been pondering whether it would be feasible to work with a 100,000 entry index file, and had put yesterday aside to do some timing tests.

I first generated some sample index files of various lengths. Each entry consists of a single line of the form ASDF;rhubarb, rhubarb, ...., where ASDF is a randomly generated four-character index and the rest of the line is filling, which varies slightly in length and content from line to line, just in case something tried to get smart and cache the line. The average length of a line is about 80 bytes.

Then I wrote another program which read the file into an array, using the four-character index as the key and the filling as the contents, sorted the array, and then rewrote it to another file, reporting the elapsed time after each step. My first version used fgets() to read the source file a line at a time and fwrite() to write the new file. This version performed quite consistently, taking approximately 1.3 seconds to read in a 100,000 entry 7.86 MB file, and another 5 seconds to write it out again.

I then read the discussion following fschnittke's post "File write operation slows to a crawl ..." and wondered whether the suggestions made there would help. First I used file() to read the entire file into memory, then processed each line into the form required to set up my array. This gave a useful improvement for small files, halving the time required to read and process a 10,000 entry 815 kB file, but for a 30,000 entry file the gain had dropped to about 15%, and it made little difference for a 300,000 entry file.

Then I tried writing my whole array into a single horrendous string and using file_put_contents() to write out the whole string in one bang. I started testing on a short file and thought I was onto a good thing, as it halved the time to write out a 10,000 entry 800 kB file. But as I increased the file size it began to fail dismally: with a 30,000 entry file it was 20% slower, and at 100,000 entries it was three times slower.

On Shawn McKenzie's suggestion, I also tried replacing fgets() with stream_get_line(). As I had anticipated, any difference was well below the timing noise level.

In conclusion: for short files (up to perhaps 1 MB), using file() to read the whole file into memory is substantially better than using fgets() to read the file a line at a time, but the advantage rapidly diminishes for longer files. Similarly, using file_put_contents() in place of fwrite() to write the file out again is better for short files (again up to perhaps 1 MB), but performance deteriorates rapidly above that.
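
For reference, here are stripped-down sketches of the pieces described above. The file names, the A-Z alphabet, and the exact filling are placeholders of my own choosing rather than the real test data, and the real generator varied the lines a little more; this is just the shape of it:

<?php
// Generate a sample index file: one line per entry, a random
// four-character key, a semicolon, then filling, roughly 80 bytes
// per line in total.
$entries = 100000;
$out = fopen('sample_index.txt', 'w');
for ($i = 0; $i < $entries; $i++) {
    $key = '';
    for ($j = 0; $j < 4; $j++) {
        $key .= chr(mt_rand(65, 90));   // random letter A-Z
    }
    // Vary the filling slightly so nothing can cache the line.
    $filling = str_repeat('rhubarb, ', mt_rand(7, 9)) . $i;
    fwrite($out, $key . ';' . $filling . "\n");
}
fclose($out);
?>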
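
The fgets()/fwrite() baseline boils down to something like this (elapsed time is reported after each step; in this sketch duplicate random keys simply overwrite one another):

<?php
// Baseline: fgets() a line at a time into an array keyed on the
// four-character index, ksort(), then fwrite() line by line.
$t = microtime(true);
$index = array();
$in = fopen('sample_index.txt', 'r');
while (($line = fgets($in)) !== false) {
    list($key, $rest) = explode(';', rtrim($line, "\n"), 2);
    $index[$key] = $rest;              // duplicate keys overwrite
}
fclose($in);
printf("read:  %.3f s\n", microtime(true) - $t);

$t = microtime(true);
ksort($index);
printf("sort:  %.3f s\n", microtime(true) - $t);

$t = microtime(true);
$out = fopen('sorted_index.txt', 'w');
foreach ($index as $key => $rest) {
    fwrite($out, $key . ';' . $rest . "\n");
}
fclose($out);
printf("write: %.3f s\n", microtime(true) - $t);
?>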
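
The variants suggested in the other thread amount, roughly, to replacing the read loop with file() and replacing the write loop with one big string handed to file_put_contents():

<?php
// Variant 1: slurp the whole file with file() and process the lines.
$index = array();
foreach (file('sample_index.txt', FILE_IGNORE_NEW_LINES) as $line) {
    list($key, $rest) = explode(';', $line, 2);
    $index[$key] = $rest;
}
ksort($index);

// Variant 2: build one big string and write it in a single call.
$big = '';
foreach ($index as $key => $rest) {
    $big .= $key . ';' . $rest . "\n";
}
file_put_contents('sorted_index.txt', $big);
?>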
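
And the stream_get_line() test is just a drop-in replacement for the fgets() loop, along these lines (the 4096 byte limit is an arbitrary choice of mine, comfortably longer than any line in the file):

<?php
// Same read loop with stream_get_line(); the delimiter is stripped for us.
$index = array();
$in = fopen('sample_index.txt', 'r');
while (($line = stream_get_line($in, 4096, "\n")) !== false) {
    list($key, $rest) = explode(';', $line, 2);
    $index[$key] = $rest;
}
fclose($in);
?>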