Re: Getting even more insert performance (250m+rows/day)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



If you can live with possible database corruption, you could try turning
Fsync off.  For example if you could just reinsert the data on the off
chance a hardware failure corrupts the database, you might get a decent
improvement.

Also have you tried creating the index after you have inserted all your
data?  (Or maybe copy already disables the indexes while inserting?)



> -----Original Message-----
> From: pgsql-performance-owner@xxxxxxxxxxxxxx 
> [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of 
> Daniel J. Luke
> Sent: Wednesday, May 24, 2006 2:45 PM
> To: pgsql-performance@xxxxxxxxxxxxxx
> Subject: [PERFORM] Getting even more insert performance 
> (250m+rows/day)
> 
> 
> I have a system that currently inserts ~ 250 million rows per day (I  
> have about 10k more raw data than that, but I'm at the limit of my  
> ability to get useful insert performance out of postgres).
> 
> Things I've already done that have made a big difference:
> - modified postgresql.conf shared_buffers value
> - converted to COPY from individual insert statements
> - changed BLCKSZ to 32768
> 
> I currently get ~35k/sec inserts on a table with one index (~70k/sec  
> inserts if I don't have any indexes).
> 
> The indexed field is basically a time_t (seconds since the epoch),  
> autovacuum is running (or postgres would stop choosing to use the  
> index). The other fields have relatively lower cardinality.
> 
> Each days worth of data gets inserted into its own table so that I  
> can expire the data without too much effort (since drop table 
> is much  
> faster than running a delete and then vacuum).
> 
> I would really like to be able to have 1 (or 2) more indexes on the  
> table since it takes a while for a sequential scan of 
> 250million rows  
> to complete, but CPU time goes way up.
> 
> In fact, it looks like I'm not currently IO bound, but CPU-bound. I  
> think some sort of lazy-index generation (especially if it could be  
> parallelized to use the other processors/cores that currently sit  
> mostly idle) would be a solution. Is anyone working on 
> something like  
> this? Any other ideas? Where should I look if I want to start to  
> think about creating a new index that would work this way (or am I  
> just crazy)?
> 
> Thanks for any insight!
> 
> --
> Daniel J. Luke
> +========================================================+
> | *---------------- dluke@xxxxxxxxxxxx ----------------* |
> | *-------------- http://www.geeklair.net -------------* |
> +========================================================+
> |   Opinions expressed are mine and do not necessarily   |
> |          reflect the opinions of my employer.          |
> +========================================================+
> 
> 
> 



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux