Re: time foo




John R Pierce wrote:
On 12/1/2017 11:32 AM, hw wrote:
So this would mean that the database (running on a different server) takes
almost twice as long as foo --- which I would consider kinda excruciatingly
long because it's merely inserting rows into two different tables after they were
prepared by foo and then processing some queries to convert the data.

The queries after importing may take like 3 or 5 minutes.  About 4.5 million rows
are being imported.

so you're missing about 25 minutes, and maybe 5 minutes is spent post processing, so that's 20 minutes spent in the data insertion?

Yes, with the 15 minutes that foo itself takes being spent on converting
the fields and sending them to the server, which I think is pretty
good.

inserting one row at a time?  or in batches?    remember a database server is going to do commits after each transaction, which forces the data to be flushed to disk.   4.5 million separate row transactions, yeah, I could see that taking some time, plus add that many network round trips, etc.   if the db server just has a single SATA disk, you're doing 9 million committed writes combined to the two tables?    20 minutes for 9 million inserts, that's 7500 per second.
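
For reference, the estimate above can be checked with a few lines of Python. The variable names are only illustrative; the figures are the ones quoted in this thread (4.5 million rows, two tables, roughly 20 minutes of insert time).

```python
# Figures taken from the thread; names are illustrative only.
rows_imported = 4_500_000   # rows read from the CSV files
tables = 2                  # each row ends up in two tables
insert_minutes = 20         # estimated time spent on inserts

total_inserts = rows_imported * tables
inserts_per_second = total_inserts / (insert_minutes * 60)

print(total_inserts)        # 9000000
print(inserts_per_second)   # 7500.0
```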

They are inserted one row at a time, during one transaction
for each of the CSV files.  I'd have to figure out how to
insert them in batches, that might yet be faster.  I could
easily stack up 1000 rows or so and then insert them all at
once, if that's possible.
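
A rough sketch of what stacking up rows and inserting them in one statement could look like, assuming a database that accepts multi-row INSERT ... VALUES lists. The thread does not say which database or client library is in use, so this uses Python's built-in sqlite3 module purely as a stand-in; the table import_rows, its columns, and the helper insert_in_batches are made up for illustration. The batch size is kept at 200 here because older SQLite builds cap the number of bound parameters per statement; a networked server would likely tolerate 1000 rows or more per statement.

```python
import sqlite3
from itertools import islice

def insert_in_batches(conn, rows, batch_size=200):
    """Insert (id, payload) rows using one multi-row INSERT per batch.

    All batches run inside a single transaction, so there is one commit
    at the end instead of one per row.
    """
    it = iter(rows)
    total = 0
    with conn:  # commits on success, rolls back on an exception
        while True:
            batch = list(islice(it, batch_size))
            if not batch:
                break
            # One "(?, ?)" group per row, e.g. "(?, ?), (?, ?), ..."
            placeholders = ", ".join(["(?, ?)"] * len(batch))
            # Flatten [(id, payload), ...] into [id, payload, id, payload, ...]
            params = [value for row in batch for value in row]
            conn.execute(
                "INSERT INTO import_rows (id, payload) VALUES " + placeholders,
                params,
            )
            total += len(batch)
    return total

# Demo against an in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE import_rows (id INTEGER PRIMARY KEY, payload TEXT)")
inserted = insert_in_batches(conn, ((i, f"row {i}") for i in range(2500)))
count = conn.execute("SELECT COUNT(*) FROM import_rows").fetchone()[0]
print(inserted, count)
```

With a remote server the main saving would come from sending one statement per batch rather than one per row, which cuts the number of network round trips by roughly the batch size. How much that helps in practice depends on the server and the driver.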
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
https://lists.centos.org/mailman/listinfo/centos



