Things I've already done that have made a big difference:

- modified postgresql.conf shared_buffers value
- converted to COPY from individual insert statements
- changed BLCKSZ to 32768

I currently get ~35k/sec inserts on a table with one index (~70k/sec inserts if I don't have any indexes).
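For reference, here's roughly what the COPY-based load looks like (the table name, columns, and file path below are made up for illustration, not my actual schema):

    -- one COPY per batch instead of thousands of individual INSERTs
    COPY log_20060530 (event_time, source_id, message)
        FROM '/data/batches/20060530.copy';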
The indexed field is basically a time_t (seconds since the epoch), and autovacuum is running (otherwise postgres would stop choosing to use the index). The other fields have relatively lower cardinality.
Each day's worth of data gets inserted into its own table so that I can expire the data without too much effort (since drop table is much faster than running a delete and then vacuum).
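Roughly, the per-day scheme looks like this (names are hypothetical):

    -- one table per day, with the single index on the time_t column
    CREATE TABLE log_20060530 (
        event_time integer NOT NULL,  -- time_t, seconds since the epoch
        source_id  integer,
        message    text
    );
    CREATE INDEX log_20060530_time_idx ON log_20060530 (event_time);

    -- expiring an old day is just a drop, no delete + vacuum needed
    DROP TABLE log_20060523;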
I would really like to be able to have 1 (or 2) more indexes on the table, since it takes a while for a sequential scan of 250 million rows to complete, but adding indexes makes the CPU time spent on inserts go way up.
In fact, it looks like I'm not currently I/O-bound, but CPU-bound. I think some sort of lazy index generation (especially if it could be parallelized to use the other processors/cores that currently sit mostly idle) would be a solution. Is anyone working on something like this? Any other ideas? Where should I look if I want to start to think about creating a new index that would work this way (or am I just crazy)?
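To make the "lazy" idea concrete, I mean something roughly like paying the index-maintenance cost once, after the fact, instead of on every insert (just a sketch of the idea, not an existing feature; names are again hypothetical):

    -- inserts run all day against a column with no index...
    COPY log_20060530 FROM '/data/batches/20060530_1500.copy';

    -- ...and the extra index gets built later, ideally in parallel
    -- on the cores that are otherwise mostly idle
    CREATE INDEX log_20060530_src_idx ON log_20060530 (source_id);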
Thanks for any insight!

--
Daniel J. Luke
+========================================================+
| *---------------- dluke@xxxxxxxxxxxx ----------------* |
| *-------------- http://www.geeklair.net -------------* |
+========================================================+
| Opinions expressed are mine and do not necessarily     |
| reflect the opinions of my employer.                   |
+========================================================+