Danny Brow wrote:
On Mon, 2008-03-10 at 22:36 -0400, Wolf wrote:
Danny Brow wrote:
I have about 10 CSV files I need to open to access data. It takes a lot
of time to search each file for the values I need. Would it be best to
just dump all the CSV files into an SQL db and then just grab what I need
from there? I'm starting to think it would make a lot of sense. What do
you guys think?
Thanks,
Dan
Dan,
I can tell you that the size of your files is going to dictate the route
you want to go. I have a CSV with 568,000+ lines and 19 different pieces
to each line. The files are around 180M apiece, and it takes my server
about 2 seconds to run a system grep against the files. Running a
recursive call 7 times against a MySQL database with the same
information takes about 4 seconds.
If you have system call ability, a grep wouldn't be bad; otherwise I'd
suggest loading the CSV files into MySQL tables and checking them for
the information, then dropping the tables when you get the next files.
You can even back up the databases with a cron job overnight.
HTH,
Wolf
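
For what it's worth, a minimal sketch of that load-then-query-then-drop
approach might look something like this. The connection details, table
name, and columns are placeholders, and it assumes MySQL's LOAD DATA
LOCAL INFILE is enabled; otherwise you'd loop over fgetcsv() and INSERT
row by row:

<?php
// Sketch: stage a CSV file into a throwaway MySQL table, query it, then drop it.
// Connection details, table name, and columns are placeholders -- adjust to your data.
$pdo = new PDO('mysql:host=localhost;dbname=staging', 'user', 'pass', array(
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,  // needed for LOAD DATA LOCAL INFILE
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
));

$csv = '/path/to/file1.csv';

// Fresh table each time; drop it again once you have what you need.
$pdo->exec("DROP TABLE IF EXISTS csv_import");
$pdo->exec("CREATE TABLE csv_import (
    id      VARCHAR(32),
    value   VARCHAR(255),
    -- ...one column per field in the CSV (19 in the example above)
    INDEX (id)
)");

// Bulk-load the file; much faster than row-by-row INSERTs.
$pdo->exec("LOAD DATA LOCAL INFILE " . $pdo->quote($csv) . "
    INTO TABLE csv_import
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'");

// Grab just the rows you care about.
$stmt = $pdo->prepare("SELECT * FROM csv_import WHERE id = ?");
$stmt->execute(array('12345'));
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

// Done with this batch -- drop the table until the next set of files arrives.
$pdo->exec("DROP TABLE csv_import");
?>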
Thanks, that sounds like a good idea. I'm still plugging away with the
approach I started with; I want to see how much faster it will be going
with a db. I was actually thinking of running diff against each updated
file and uploading just those changes to the DB.
Dan
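
A rough sketch of that diff-and-load idea, assuming GNU diff is reachable
through a system call and each CSV line maps to one row; the file paths
and the loadRow() helper are hypothetical placeholders:

<?php
// Sketch: find lines added/changed since the last import and load only those.
$old = '/data/products.prev.csv';   // copy kept from the last import
$new = '/data/products.csv';        // freshly updated file

// Lines only present in the new file show up prefixed with "> " in diff output.
$changes = shell_exec('diff ' . escapeshellarg($old) . ' ' . escapeshellarg($new));

foreach (explode("\n", (string)$changes) as $line) {
    if (substr($line, 0, 2) !== '> ') {
        continue;                    // skip context and removed lines
    }
    $fields = str_getcsv(substr($line, 2));
    loadRow($fields);                // hypothetical: INSERT/UPDATE this row in MySQL
}

// Keep the new file around as the baseline for the next diff.
copy($new, $old);
?>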
Running a diff and loading the changes wouldn't be a bad way to go. One
thing to take into account is any tracking of the changes that you might
need to do; i.e., if you update an entry, write it to a changes table
with a date attached recording when it happened.
Wolf
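
One way that change-tracking idea might look in practice; the
item_changes table and its columns are made-up names for illustration:

<?php
// Sketch of the change-tracking idea: whenever a row is updated, append a copy
// to a history table with a timestamp. Table and column names are hypothetical.
$pdo = new PDO('mysql:host=localhost;dbname=staging', 'user', 'pass',
               array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

$pdo->exec("CREATE TABLE IF NOT EXISTS item_changes (
    item_id     VARCHAR(32),
    old_value   VARCHAR(255),
    new_value   VARCHAR(255),
    changed_at  DATETIME
)");

function recordChange(PDO $pdo, $id, $oldValue, $newValue)
{
    $stmt = $pdo->prepare("INSERT INTO item_changes
        (item_id, old_value, new_value, changed_at) VALUES (?, ?, ?, NOW())");
    $stmt->execute(array($id, $oldValue, $newValue));
}

// e.g. call recordChange() right before the UPDATE that applies a diffed row.
?>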
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php